A Parallel 4SID Algorithm
Abstract
In this paper we present a parallel implementation, based on ScaLAPACK, of
a direct 4SID method. As the computational stages of the method are solved mainly
by direct calls to ScaLAPACK routines, we concentrate on the specific difficulties of
the implementation, e.g. the redistribution of data between some stages. A structured
matrix multiplication is implemented efficiently by a dedicated algorithm. We report
experimental results that show good behavior for up to 16 processors.