-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CPU] Add Clamp for FakeConvertDecomposition #28651
base: master
Are you sure you want to change the base?
[CPU] Add Clamp for FakeConvertDecomposition #28651
Conversation
@@ -175,6 +175,9 @@ std::vector<std::string> disabledTestPatterns() { | |||
R"(.*ConvertCPULayerTest.*outFmts=(nhwc|nChw8c|nChw16c).*)", | |||
// Issue: MFDNN-12917. The oneDNN emitter of conversion from fp32 to fp8 has rounding issue. | |||
R"(.*ConvertCPULayerTest.*(\[1.1.1080.1920\]|\(2.17.5.4\))_.*_inputPRC=f32_targetPRC=f8e4m3_.*)", | |||
// Issue: 123320 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These bf16 related test cases for FakeConvert are skipped because of the following code block, which has already been tracked by ticket 123320.
The code block here
openvino/src/plugins/intel_cpu/tests/functional/shared_tests_instances/core_config.cpp
Lines 21 to 26 in 7f56fcd
// todo: issue: 123320 | |
if (!((inf_prc != config.end() && inf_prc->second == element::undefined) | |
|| (inf_prc == config.end() && exec_mode != config.end() && exec_mode->second == hint::ExecutionMode::ACCURACY))) { | |
test->convert_precisions.insert({ov::element::bf16, ov::element::f32}); | |
test->convert_precisions.insert({ov::element::f16, ov::element::f32}); | |
} |
Here is a case for FakeConvert layer. data=3504, scale=0.546875, shift=0, destination_type=f8e5m2. Expected result is 3264, actual result is 3744, but both correct.
Expected: bf16(3504) -> fp32(3504) -> *scale(0.546875) ->fp32(1916.25) -> fp16(1916) -> fp16(constrained_in_f8_range)(1792) -> fp32(1792) -> /scale(0.546875) -> fp32(3276.8) -> bf16(3264)
Actual: bf32(3504) -> *scale(0.546875) -> bf16(1920) -> fp16(1920) -> fp16(constrained_in_f8_range)(2048) -> bf16(2048) -> /scale(0.546875) ->bf16(3744)
@@ -48,6 +48,7 @@ void emulate_f8e5m2_on_fp16(const float16* const arg_f, float16* out_f, size_t c | |||
val_bit_repr += (((rnmask > 0x0080) || (rnmask_tie == rne_tie)) << lshift); | |||
} | |||
val_bit_repr &= mask_mant; /* truncation */ | |||
val_bit_repr -= (((val_bit_repr & 0x7F00) == fp16_inf) << lshift); /* clamp */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK, the behavior of master branch for converting different f32 values to f8e4m3 is as follows:
Value type | Convert f32 to f16 | Convert f16 to f8e5m2 |
---|---|---|
f16 overflowed value | INF | INF (this PR change it to not-INF) |
f8e5m2 overflowed value | not-INF | not-INF |
f8e5m2 other value | not-INF | not-INF |
For overflowed values, if we apply clamp in FakeConvertDecomposition we gets not-INF, otherwise we gets INF. We can not get different behavior for "f16 overflowed value" and "f8e5m2 overflowed value".
So here, I applies this revision to make "f16 overflowed value" also get not-INF result. The other reason is to align with behavior of f8e4m3.
@dmitry-gorokhov @mitruska Please feel free to comment. Thanks a lot!
876d289
to
a6b2559
Compare
a6b2559
to
98c1d2a
Compare
98c1d2a
to
39f30cf
Compare
39f30cf
to
5589947
Compare
@dmitry-gorokhov Could you please take a look? |
Details:
Tickets: