[WIP] PFT rewrite-based do-concurrent parallelization #230

skatrak · 2024-12-12T17:12:50Z

This is a proof of concept of a PFT rewrite-based approach to OpenMP-based parallelization of Fortran do concurrent loops. The main advantage of this approach over an MLIR pass-based one is that it should allow us to avoid re-implementing and sharing significant pieces of PFT-to-MLIR lowering between Flang lowering and the MLIR pass, potentially also making it much simpler to keep feature parity.

The current WIP replicates the PFT structure of an !$omp parallel do when encountering a do concurrent loop. It is still in very early stages and the resulting PFT cannot be lowered to MLIR yet, as it seems to be missing some symbol updates. However, it can already be tested:

! test.f90
subroutine foo()
  implicit none
  integer :: i

  do concurrent(i=1:10)
  end do

  !$omp parallel do
  do i=1,10
  end do
end subroutine

$ flang-new -fc1 -fdebug-unparse -fopenmp test.f90
SUBROUTINE foo
 IMPLICIT NONE
 INTEGER i
!$OMP PARALLEL DO
 DO i=1_4,10_4
 END DO
!$OMP PARALLEL DO
 DO i=1_4,10_4
 END DO
END SUBROUTINE

$ flang-new -fc1 -fdebug-dump-parse-tree -fopenmp test.f90
Program -> ProgramUnit -> SubroutineSubprogram
| SubroutineStmt
| | Name = 'foo'
| SpecificationPart
| | ImplicitPart -> ImplicitPartStmt -> ImplicitStmt ->
| | DeclarationConstruct -> SpecificationConstruct -> TypeDeclarationStmt
| | | DeclarationTypeSpec -> IntrinsicTypeSpec -> IntegerTypeSpec ->
| | | EntityDecl
| | | | Name = 'i'
| ExecutionPart -> Block
| | ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct
| | | OmpBeginLoopDirective
| | | | OmpLoopDirective -> llvm::omp::Directive = parallel do
| | | | OmpClauseList ->
| | | DoConstruct
| | | | NonLabelDoStmt
| | | | | LoopControl -> LoopBounds
| | | | | | Scalar -> Name = 'i'
| | | | | | Scalar -> Expr = '1_4'
| | | | | | | LiteralConstant -> IntLiteralConstant = '1'
| | | | | | Scalar -> Expr = '10_4'
| | | | | | | LiteralConstant -> IntLiteralConstant = '10'
| | | | Block
| | | | EndDoStmt ->
| | ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct
| | | OmpBeginLoopDirective
| | | | OmpLoopDirective -> llvm::omp::Directive = parallel do
| | | | OmpClauseList ->
| | | DoConstruct
| | | | NonLabelDoStmt
| | | | | LoopControl -> LoopBounds
| | | | | | Scalar -> Name = 'i'
| | | | | | Scalar -> Expr = '1_4'
| | | | | | | LiteralConstant -> IntLiteralConstant = '1'
| | | | | | Scalar -> Expr = '10_4'
| | | | | | | LiteralConstant -> IntLiteralConstant = '10'
| | | | Block
| | | | EndDoStmt ->
| EndSubroutineStmt ->
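
To make the shape of the rewrite concrete, here is a minimal, self-contained sketch of the idea on a toy tree. It is not the code in this PR and does not use Flang's parser classes: `Node`, `Kind`, `Bounds`, `rewriteDoConcurrent` and `unparse` are hypothetical names invented for illustration. The sketch handles only the loops directly in the given block, does not touch nested loops, and ignores symbols entirely.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for parse tree nodes (hypothetical, not Flang's classes).
enum class Kind { DoLoop, DoConcurrent, OmpParallelDo };

struct Bounds {
  std::string var;
  int lower = 0;
  int upper = 0;
};

struct Node {
  Kind kind;
  Bounds bounds;              // unused for OmpParallelDo
  std::vector<Node> children; // loop body, or the single wrapped loop
};

// Replace every DoConcurrent directly in `block` with an OmpParallelDo
// node that wraps an equivalent plain DoLoop. Nested loops are left alone.
void rewriteDoConcurrent(std::vector<Node> &block) {
  for (Node &node : block) {
    if (node.kind != Kind::DoConcurrent)
      continue;
    Node loop{Kind::DoLoop, node.bounds, std::move(node.children)};
    Node omp{Kind::OmpParallelDo, {}, {}};
    omp.children.push_back(std::move(loop));
    node = std::move(omp);
  }
}

// Print a node roughly in the style of an unparsed source listing.
std::string unparse(const Node &node) {
  std::string body;
  switch (node.kind) {
  case Kind::OmpParallelDo:
    assert(!node.children.empty() && "directive must wrap a loop");
    return "!$OMP PARALLEL DO\n" + unparse(node.children.front());
  case Kind::DoConcurrent:
    for (const Node &child : node.children)
      body += unparse(child);
    return "DO CONCURRENT (" + node.bounds.var + "=" +
           std::to_string(node.bounds.lower) + ":" +
           std::to_string(node.bounds.upper) + ")\n" + body + "END DO\n";
  case Kind::DoLoop:
    for (const Node &child : node.children)
      body += unparse(child);
    return "DO " + node.bounds.var + "=" +
           std::to_string(node.bounds.lower) + "," +
           std::to_string(node.bounds.upper) + "\n" + body + "END DO\n";
  }
  return "";
}
```

Given a block holding a `DoConcurrent` node for `i=1:10` followed by an explicit `OmpParallelDo` wrapping a `DoLoop` over the same bounds, calling `rewriteDoConcurrent` makes both entries unparse to the same text, mirroring the two identical loops in the listings above.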


skatrak requested review from ergawy and kparzysz December 12, 2024 17:12
