Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Snippets] Added type support to LoopPort #28310

Merged
merged 6 commits into from
Jan 10, 2025

Conversation

a-sidorova
Copy link
Contributor

@a-sidorova a-sidorova commented Jan 8, 2025

Details:

  • Moved fields of the class LoopPort to private section and added getters/setters
  • Implemented Type to LoopPort to distinguish not incremented ports due to double ptr increment and not incremented ports of Brgemm.
  • These changes fix inefficient calculation of Buffer allocation size in dynamic MHA Subgraphs. More details are described in the ticket 159913

Tickets:

  • 157326
  • 159913

@a-sidorova a-sidorova added the category: CPU OpenVINO CPU plugin label Jan 8, 2025
@a-sidorova a-sidorova added this to the 2025.0 milestone Jan 8, 2025
@a-sidorova a-sidorova requested review from a team as code owners January 8, 2025 11:24
Copy link
Contributor

@v-Golubev v-Golubev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No major comments from my side, LGTM 👍

src/common/snippets/include/snippets/lowered/loop_port.hpp Outdated Show resolved Hide resolved
src/common/snippets/include/snippets/lowered/loop_port.hpp Outdated Show resolved Hide resolved
src/common/snippets/src/lowered/pass/insert_buffers.cpp Outdated Show resolved Hide resolved
@@ -56,42 +56,40 @@ void propagate_updated_subtensor_through_loop(const LinearIR& linear_ir,
// First step: set new dim value to the corresponding input_ports' dimensions
if (most_outer_loop) {
for (const auto& port : loop_info->get_input_ports()) {
const auto& reg_type = port.expr_port->get_descriptor_ptr()->get_reg().type;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that previously we used undefined reg_type as a marker that loop port is NotProcessed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say that this is not so..

Because previously we tried to find Processed ports: Incremented or NotIncremented. But gpr could be incremented and not_incremented and not_processed.. Now we use port.get_type() != LoopPort::Type::NotProcessed to explicitly find needed ports for propagation.

Comment on lines 29 to 39
template<LoopPort::Type T = Type::Incremented,
typename std::enable_if<T == Type::Incremented || T == Type::NotIncremented, bool>::type = true>
static LoopPort create(const ExpressionPort& port, size_t dim_idx = 0) {
return LoopPort(port, dim_idx, T);
}

template<LoopPort::Type T,
typename std::enable_if<T == Type::NotProcessed, bool>::type = true>
static LoopPort create(const ExpressionPort& port) {
return LoopPort(port, UNDEFINED_DIM_IDX, Type::NotProcessed);
}
Copy link
Contributor

@IvanNovoselov IvanNovoselov Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's cool that we try to use template builders here 👍
However, in this particular case this does not look necessary since we have to explicitly specify the type anyway. It would probably be easier just to use good-old 3 args constructor here and add an assert that Incremented/NotIncremented provide non-default argument.
LoopPort(const ExpressionPort& port, dim_idx = 0, LoopPort::Type = Type::Incremented);

I get it that the present approach doesn't even allow to specify an invalid dim_idx, but it just seems to me that there is an easier way to validate dim_idx, and we already use runtime validation in set_dim_idx anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Firstly, I tried to use SFINAE-principle as we do in other code parts in Snippets (I learnt it 😄 ). It helps us to avoid invalid dim_idx in case of NotProcessed, as you noticed - this is right.
Secondly, this behavior resolves my big problem which I got during this work.
If we have only one constructor: LoopPort(const ExpressionPort& port, dim_idx = 0, LoopPort::Type = Type::Incremented) and I want to create NotProcessed port, I should always call LoopPort(expr_port, UNDEFINED_DIM_IDX, Type::NotProcessed): user should always pass UNDEFINED_DIM_IDX. And if he pass another value, he will get runtime error (not compilation as in SFINAE).
If we have only one constructor: LoopPort(const ExpressionPort& port, LoopPort::Type = Type::Incremented, dim_idx = 0): user can pass invalid dim_idx again for NotProcessed. Also if he want create loop port which processed dim_idx, he should call LoopPort(const ExpressionPort& port, Type::Incremented, 1) - it's so long (for me at least)! I just want to create default loop port for dim_idx and I have to pass default type of loop port.

I believe that these arguments will help you to notice that probably this is best option at the moment 🤔
Otherwise, we can discuss offline.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I have already shared my opinion about how much I like the current solution, I have to stand up for it 😄
I believe the main benefit we have from it is that in case when we do something wrong, we get an error on compilation stage, not in runtime. This may save some time on debugging, especially for the contributors who are just starting to get acquainted with the Snippets

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad you both like this solution 😁
As I understood after reading the rest of the comments, we are going to introduce loop port classes soon.
In this case, these factory methods are indeed better because they can be easily replaced with dedicated class constructors.

src/common/snippets/include/snippets/lowered/loop_port.hpp Outdated Show resolved Hide resolved
src/common/snippets/src/lowered/loop_port.cpp Outdated Show resolved Hide resolved
src/common/snippets/src/lowered/loop_info.cpp Outdated Show resolved Hide resolved
src/common/snippets/src/lowered/pass/insert_buffers.cpp Outdated Show resolved Hide resolved
@a-sidorova
Copy link
Contributor Author

@IvanNovoselov thanks a lot for the review! Please take a look again 😊

@ilya-lavrenov ilya-lavrenov added this pull request to the merge queue Jan 10, 2025
Merged via the queue into openvinotoolkit:master with commit e1357f1 Jan 10, 2025
186 checks passed
ilya-lavrenov added a commit to openvinotoolkit/openvino.genai that referenced this pull request Jan 15, 2025
- Added more models types to WWB tests
- Updated real models to random models in LLM bench tests to save
execution time

After openvinotoolkit/openvino#28310 is merged,
random tiny models are expected to work
MirceaDan99 pushed a commit to MirceaDan99/openvino that referenced this pull request Jan 22, 2025
### Details:
- *Moved fields of the class `LoopPort` to private section and added
getters/setters*
- *Implemented `Type` to `LoopPort` to distinguish not incremented ports
due to double ptr increment and not incremented ports of Brgemm.*
- *These changes fix inefficient calculation of Buffer allocation size
in dynamic MHA Subgraphs. More details are described in the ticket
159913*

### Tickets:
 - *157326*
 - *159913*
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: CPU OpenVINO CPU plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants