• Media type: E-Book
  • Title: Running Large Sparse Problems on Shared Memory Computers
  • Contributor: Wasniewski, Jerzy [Author]; Zlatev, Zahari [Other]
  • Imprint: [S.l.]: SSRN, [2018]
  • Published in: Mathematics Preprint Archive ; Vol. 2003, Issue 1, pp 109-120
  • Extent: 1 online resource (12 p.)
  • Language: English
  • Footnote: According to information from SSRN, the original version of the document was created in January 2003
  • Description: It is hard to achieve high efficiency when large sparse problems are solved on parallel computers. There are three major reasons for this: (i) all sparse matrix techniques require indirect addressing; (ii) the data involved in the computations are irregularly distributed, which leads to cache problems; and (iii) parallel tasks are difficult to find, and those that are found are normally small, so the overhead of starting the parallel execution is considerable compared with the actual time spent on the parallel computations. The computations can be organized as successive factorizations of dense rectangular blocks. Subroutines for QR factorization from the well-known LAPACK library can be applied to factorize any dense block, and the three disadvantages listed above are completely removed during this part of the computations. Two additional tasks must be carried out: (i) preparing the dense blocks and placing the involved data in a buffer, where the dense computations are performed, and (ii) restoring the data to the sparse matrix arrays when the dense computations for the current block are finished. It will be shown that the dense matrix computations dominate. An extra benefit of using dense blocks is that the computations within them can easily be carried out in parallel. Numerical results illustrating this will be given. The computations have been performed on SUN shared memory computers. The algorithm based on dense blocks is derived from an algorithm developed for vector processors; some changes were needed to make it efficient for shared memory computers, and these changes will be explained.
  • Access State: Open Access
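The gather/dense-QR/restore scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the COO-style arrays, the `gather_dense_block` helper, and the chosen index sets are all hypothetical, and `numpy.linalg.qr` merely stands in for the LAPACK QR subroutines the paper applies per block.

```python
import numpy as np

# Hypothetical COO storage of a small sparse matrix (illustrative only;
# the paper does not specify its sparse data structures).
rows = np.array([0, 0, 1, 2, 3, 3])
cols = np.array([0, 1, 1, 0, 0, 1])
vals = np.array([4.0, 1.0, 3.0, 2.0, 1.0, 5.0])

def gather_dense_block(rows, cols, vals, row_set, col_set):
    """Task (i) from the abstract: copy the sparse entries that fall in
    the chosen row/column index sets into a contiguous dense buffer, so
    the factorization itself needs no indirect addressing."""
    rmap = {r: i for i, r in enumerate(row_set)}
    cmap = {c: j for j, c in enumerate(col_set)}
    buf = np.zeros((len(row_set), len(col_set)))
    for r, c, v in zip(rows, cols, vals):
        if r in rmap and c in cmap:
            buf[rmap[r], cmap[c]] = v
    return buf

# Dense QR on the buffer; numpy.linalg.qr calls LAPACK's dgeqrf/dorgqr,
# the kind of dense kernel the paper uses on each rectangular block.
block = gather_dense_block(rows, cols, vals, row_set=[0, 1, 2, 3], col_set=[0, 1])
Q, R = np.linalg.qr(block)

# Sanity check: the dense factorization reproduces the gathered block.
assert np.allclose(Q @ R, block)
```

Task (ii), restoring the factored data to the sparse arrays, would be the reverse scatter of the gather above; the paper's point is that the dense step in the middle dominates the runtime and parallelizes well.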