[WIP] PFT rewrite-based do-concurrent parallelization #230

Draft
wants to merge 1 commit into
base: amd-trunk-dev

@skatrak skatrak commented Dec 12, 2024

This is a proof of concept of a PFT rewrite-based approach to OpenMP-based parallelization of Fortran `do concurrent` loops. The main advantage of this approach over an MLIR pass-based one is that it should let us avoid re-implementing significant pieces of PFT-to-MLIR lowering, or sharing them between Flang lowering and the MLIR pass. It could also make it much simpler to keep feature parity.

The current WIP replicates the PFT structure of an `!$omp parallel do` construct when it encounters a `do concurrent` loop. It is still at a very early stage, and the resulting PFT cannot be lowered to MLIR yet, apparently because some symbol updates are missing. However, it can already be tested:

```sh
$ cat test.f90
subroutine foo()
  implicit none
  integer :: i

  do concurrent(i=1:10)
  end do

  !$omp parallel do
  do i=1,10
  end do
end subroutine

$ flang-new -fc1 -fdebug-unparse -fopenmp test.f90
SUBROUTINE foo
 IMPLICIT NONE
 INTEGER i
!$OMP PARALLEL DO
 DO i=1_4,10_4
 END DO
!$OMP PARALLEL DO
 DO i=1_4,10_4
 END DO
END SUBROUTINE

$ flang-new -fc1 -fdebug-dump-parse-tree -fopenmp test.f90
Program -> ProgramUnit -> SubroutineSubprogram
| SubroutineStmt
| | Name = 'foo'
| SpecificationPart
| | ImplicitPart -> ImplicitPartStmt -> ImplicitStmt ->
| | DeclarationConstruct -> SpecificationConstruct -> TypeDeclarationStmt
| | | DeclarationTypeSpec -> IntrinsicTypeSpec -> IntegerTypeSpec ->
| | | EntityDecl
| | | | Name = 'i'
| ExecutionPart -> Block
| | ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct
| | | OmpBeginLoopDirective
| | | | OmpLoopDirective -> llvm::omp::Directive = parallel do
| | | | OmpClauseList ->
| | | DoConstruct
| | | | NonLabelDoStmt
| | | | | LoopControl -> LoopBounds
| | | | | | Scalar -> Name = 'i'
| | | | | | Scalar -> Expr = '1_4'
| | | | | | | LiteralConstant -> IntLiteralConstant = '1'
| | | | | | Scalar -> Expr = '10_4'
| | | | | | | LiteralConstant -> IntLiteralConstant = '10'
| | | | Block
| | | | EndDoStmt ->
| | ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct
| | | OmpBeginLoopDirective
| | | | OmpLoopDirective -> llvm::omp::Directive = parallel do
| | | | OmpClauseList ->
| | | DoConstruct
| | | | NonLabelDoStmt
| | | | | LoopControl -> LoopBounds
| | | | | | Scalar -> Name = 'i'
| | | | | | Scalar -> Expr = '1_4'
| | | | | | | LiteralConstant -> IntLiteralConstant = '1'
| | | | | | Scalar -> Expr = '10_4'
| | | | | | | LiteralConstant -> IntLiteralConstant = '10'
| | | | Block
| | | | EndDoStmt ->
| EndSubroutineStmt ->
```
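
The rewrite described above can be illustrated with a toy model. The Python sketch below is not Flang code: the node classes `DoConcurrent`, `DoConstruct` and `OmpLoopConstruct` and the `rewrite` function are hypothetical stand-ins for the corresponding parse-tree nodes, and the model assumes a single-index `do concurrent` with no mask, no locality specifiers and no nesting. It only shows the shape of the transformation: a `do concurrent` node is replaced by the same subtree that an explicit `!$omp parallel do` would produce.

```python
from dataclasses import dataclass, field


@dataclass
class DoConstruct:
    """Toy stand-in for an ordinary loop: do var = lower, upper."""
    var: str
    lower: int
    upper: int
    body: list = field(default_factory=list)


@dataclass
class DoConcurrent:
    """Toy stand-in for: do concurrent (var = lower:upper)."""
    var: str
    lower: int
    upper: int
    body: list = field(default_factory=list)


@dataclass
class OmpLoopConstruct:
    """Toy stand-in for an OpenMP loop construct wrapping a DO loop."""
    directive: str
    loop: DoConstruct


def rewrite(nodes):
    """Replace each top-level DoConcurrent with an OmpLoopConstruct.

    Hypothetical helper: nested constructs are not visited.
    """
    result = []
    for node in nodes:
        if isinstance(node, DoConcurrent):
            # Rebuild the loop as a plain DO with the same bounds and body.
            loop = DoConstruct(node.var, node.lower, node.upper, list(node.body))
            result.append(OmpLoopConstruct("parallel do", loop))
        else:
            result.append(node)
    return result


# Mirrors test.f90: one do concurrent loop, then an explicit parallel do.
tree = [
    DoConcurrent("i", 1, 10),
    OmpLoopConstruct("parallel do", DoConstruct("i", 1, 10)),
]
rewritten = rewrite(tree)

# After the rewrite both entries have the same structure.
assert rewritten[0] == rewritten[1]
```

As in the parse-tree dump above, the two loops become structurally indistinguishable in this model. What the toy model leaves out is exactly what the description flags as missing in the WIP: keeping symbol information consistent after the tree has been rewritten.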
@skatrak skatrak requested review from ergawy and kparzysz December 12, 2024 17:12