
go parallel #14

Open
thorade opened this issue Jan 27, 2013 · 4 comments

Comments

thorade (Owner) commented Jan 27, 2013

I don't see how to do this in Modelica, but going parallel could speed things up significantly:
All 5 to 14 HelmholtzDerivs can be calculated simultaneously, see HelmholtzDerivs and setHelmholtzDerivs,
and all 12 to 50 terms of each HelmholtzDeriv could be evaluated simultaneously, see e.g. f_r.

Combining these two should give 60 or more independent threads.
This should be investigated in combination with (automatic?) common subexpression elimination, because the f_r etc. function calls have many terms in common! Maybe combine with #26?
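
A minimal Modelica sketch of the term-level independence (simplified, not the library's actual f_r; the coefficient arrays n, d, t are placeholders for the equation-of-state data):

```modelica
function f_residual_sketch "toy residual Helmholtz sum with independent terms"
  input Real delta "reduced density";
  input Real tau "inverse reduced temperature";
  input Real n[:] "placeholder coefficients";
  input Real d[size(n, 1)] "placeholder density exponents";
  input Real t[size(n, 1)] "placeholder temperature exponents";
  output Real f_r;
protected
  Real term[size(n, 1)];
algorithm
  // each term[i] depends only on the inputs, never on another term[j],
  // so all terms could in principle be evaluated concurrently
  for i in 1:size(n, 1) loop
    term[i] := n[i]*delta^d[i]*tau^t[i];
  end for;
  // only this final reduction needs the individual results
  f_r := sum(term);
end f_residual_sketch;
```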

thorade closed this as completed Nov 30, 2015
thorade reopened this Sep 9, 2020
thorade (Owner, Author) commented Sep 9, 2020

There are many other places where two or more functions could be called in parallel:


casella commented Sep 9, 2020

@mahge, do you think we can exploit such a fine-grained parallelism? I'm afraid the overhead could kill any potential speedup.


mahge commented Sep 9, 2020

I cannot say much without looking at it further. However, the design and implementation are intended to be used for fine-grained parallelism, i.e., at the equation level instead of just at the level of strongly connected components. Unfortunately, it will not yet go down into functions and parallelize things there.

The good news is that, if these large functions (computations) are attached to (called from) equations that can be computed independently of each other within a single time step, they should be parallelizable. In other words, consider each instance of a call to these functions from different equations as part of that equation's computation. If, after causalization, one of the assignments does not use the LHS of the other equation, it is all the same to the implementation, and we should be able to run them in parallel.
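
A minimal hypothetical model (not taken from the library) illustrating that situation: after causalization, the assignments to y1 and y2 do not use each other's left-hand sides, so the two function calls could be scheduled in parallel:

```modelica
model IndependentCalls "two function calls that share only their input"
  function g1
    input Real u;
    output Real y;
  algorithm
    y := sin(u) + u^2;
  end g1;

  function g2
    input Real u;
    output Real y;
  algorithm
    y := exp(-u)*cos(u);
  end g2;

  Real x(start = 1, fixed = true);
  Real y1, y2;
equation
  der(x) = -x;
  // neither equation uses the other's left-hand side,
  // so g1(x) and g2(x) are independent within a time step
  y1 = g1(x);
  y2 = g2(x);
end IndependentCalls;
```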

As for the sum() operators and similar data-parallel computations within functions/algorithms, there is another parallelization implementation I did a while back that can handle them, even on GPUs. However, this would require modifications to the library source code, making it unusable with other Modelica tools. Plus, the arrays/computations need to be quite large (by Modelica standards) to see any speedup. We can look at that afterwards if you are interested.
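
For reference, the kind of pattern that implementation targets is a plain array reduction like the sketch below (the tool-specific parallel constructs are not shown, and the array would have to be far larger than typical media code uses to pay off):

```modelica
function largeSum "data-parallel friendly reduction over an array"
  input Real a[:];
  output Real s;
algorithm
  // every a[i]^2 is independent; only the final sum is a reduction
  s := sum(a[i]^2 for i in 1:size(a, 1));
end largeSum;
```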


casella commented Sep 9, 2020

> I cannot say much without looking at it further. However, the design and implementation are intended to be used for fine-grained parallelism, i.e., at the equation level instead of just at the level of strongly connected components.

OK.

> Unfortunately, it will not yet go down into functions and parallelize things there.

I guess this issue could be solved by clever generation of auxiliary variables. We already have some kind of common subexpression elimination on functions, carried out by wrapFunctionCalls, which generates an auxiliary equation $cseNN = f(...); for each function call in the model and uses $cseNN in place of the call inside the equations. Maybe this could be good enough to get separate function calls running in parallel.
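
A rough sketch of the effect (hypothetical model; the $cse name below is only illustrative of what the backend generates internally):

```modelica
model CseSketch
  function f
    input Real u;
    output Real y;
  algorithm
    y := u^3 + sin(u);
  end f;

  Real x(start = 1, fixed = true);
  Real y1, y2;
equation
  der(x) = -x;
  y1 = 2*f(x);
  y2 = f(x) + x;
  // wrapFunctionCalls conceptually rewrites the last two equations as
  //   $cse1 = f(x);   y1 = 2*$cse1;   y2 = $cse1 + x;
  // so the call to f becomes its own assignment that the scheduler
  // could treat as a separate task
end CseSketch;
```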

> The good news is that, if these large functions (computations) are attached to (called from) equations that can be computed independently of each other within a single time step, they should be parallelizable.

Yes, that is the point.

> In other words, consider each instance of a call to these functions from different equations as part of that equation's computation. If, after causalization, one of the assignments does not use the LHS of the other equation, it is all the same to the implementation, and we should be able to run them in parallel.

> As for the sum() operators and similar data-parallel computations within functions/algorithms, there is another parallelization implementation I did a while back that can handle them, even on GPUs. However, this would require modifications to the library source code, making it unusable with other Modelica tools. Plus, the arrays/computations need to be quite large (by Modelica standards) to see any speedup. We can look at that afterwards if you are interested.

Yeah, I guess the arrays are not large enough for us to benefit from that. After all, a double-precision addition takes about one clock cycle on modern CPUs (or even less on superscalar architectures), so if you only need to sum a few dozen numbers, going parallel probably doesn't make sense.
