Skip to content

suyashbakshi/LU-Decomposition-Blocked

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 

Repository files navigation

LU-Decomposition-Blocked

A barebone implementation of blocked LU decomposition of a matrix. You can use it as a starting point for openMP, MPI or any other parallelization framework. Feel free to use/modify/distribute the code. A star wouldn't go amiss if this implementation proved helpful/saved some of your valuable time. :)

The code structure of "LU_Decomposition()" is taken from NERSC website example here: https://www.nersc.gov/users/software/programming-models/openmp/openmp-tasking/openmp-tasking-example-lu/ (Please refer to NERSC website for any citation that might be required)

Implementation of row_func(), col_func(), inner_func() and diag_func() was done by me (Suyash Bakshi).

References from NERSC website:

Starting with the serial version of LU, shown below, we can see that there is a division with each diagonal element on every element in the rest of the row, and then the changes in this column are then propagated to each row after.

for(int i=0; i<size; i++) {
    for(int j=i+1; j<size; j++) {
        A[j*size+i] /= A[i*size+i];
        for(int k=i+1; k<size; k++) {
            A[j*size+k] -= A[j*size+i] * A[i*size+k];
        }
    }
}

The code LU_decomp_blocked.c implements the following idea: This can be broken into 4 functions and applied to a blocked version of the matrix. The first is the same as the original algorithm above, and it's applied to the diagonal blocks. Then the changes from the diagonal block are propegated to the blocks in the same column and the blocks in the same row. A fourth function propegates these changes to the rest of the remaining matrix, and then this is repeated on that remaining inner matrix. Breaking the matrix into blocks not only enables parallelism, but also improves cache usage.

for(int i=0; i<num_blocks; i++) {
    diag_func( block_list[i][i] );
    for(int j=i+1; j<num_blocks; j++) {
        row_func( block_list[i][j], block_list[i][i] );
    }
    for(int j=i+1; j<num_blocks; j++) {
        col_func( block_list[j][i], block_list[i][i] );
        for(int k=i+1; k<num_blocks; k++) {
            inner_func( block_list[j][k], block_list[i][k], 
                                          block_list[j][i] );
        }
    }
}

About

Implementation of Blocked LU decomposition of a matrix

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages