-
Notifications
You must be signed in to change notification settings - Fork 322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Audio: DRC: Change DRC to use lookup table based sine function #8491
Conversation
src/math/lut_trig.c
Outdated
#define SOFM_LUT_SINE_SIZE (SOFM_LUT_SINE_NQUART + 1) | ||
|
||
/* An 1/4 period of sine wave as Q1.31 */ | ||
const int32_t sofm_lut_sine_table[SOFM_LUT_SINE_SIZE] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Should test if this could be int16_t and size 257 for 16 bit sine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one is check wether can use 16-bit, another thing is whether can shorten the size? do we really need 512?
such a small usage cost 2k bytes table, it is big cost.
Also, could you add comments for how this table calculated? sin(i/2pi)? then we can further figure out whether 512 is must.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to ensure the accuracy,and I guess maybe it is because the latency is 512 point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried 256 size but the quality was worse than with default cordic algorithm based 16 bit sine. So, the table still has 512 elements but uint16_t type was sufficient, so the table is now half of previous.
I added also static.
src/include/sof/audio/drc/drc_math.h
Outdated
int32_t sin_val = sin_fixed_16b(denorm_x); | ||
|
||
return sin_val << 16; | ||
return sofm_lut_sin_fixed_32b(denorm_x); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some of the tooling might complain about a missing line between variable definitions and statements - same below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I didn't notice first. You are right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have a float version drc source code? I want to first check float version, the fixed version, then optimized version.
src/math/lut_trig.c
Outdated
#define SOFM_LUT_SINE_SIZE (SOFM_LUT_SINE_NQUART + 1) | ||
|
||
/* An 1/4 period of sine wave as Q1.31 */ | ||
const int32_t sofm_lut_sine_table[SOFM_LUT_SINE_SIZE] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one is check wether can use 16-bit, another thing is whether can shorten the size? do we really need 512?
such a small usage cost 2k bytes table, it is big cost.
Also, could you add comments for how this table calculated? sin(i/2pi)? then we can further figure out whether 512 is must.
src/math/Kconfig
Outdated
@@ -9,6 +9,15 @@ config CORDIC_FIXED | |||
Select this to enable sin(), cos(), asin(), acos(), | |||
and cexp() functions as 16 bit and 32 bit versions. | |||
|
|||
config LUT_TRIG_FIXED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a cleanup at some point where we have MATH_TRIG_
prefix and likewise convention for all maths APIs, macros, Kconfigs etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, done
src/math/lut_trig.c
Outdated
#define SOFM_LUT_SINE_SIZE (SOFM_LUT_SINE_NQUART + 1) | ||
|
||
/* An 1/4 period of sine wave as Q1.31 */ | ||
const int32_t sofm_lut_sine_table[SOFM_LUT_SINE_SIZE] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static ?
src/math/Kconfig
Outdated
Select this to enable sofm_lut_sin_fixed_32b() function. The | ||
calculation is using 1/4 wave lookup and interpolation. | ||
This option consumes 2052 bytes .bss RAM for the lookup | ||
table. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we offer advice on when each trig type should be used. i.e. I would expect we export the same public API for all maths, but the internal calculations will depend on which Kconfig is selected by the user at build time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some text to Kconfig about preferring the lookup sine when used in hot code parts.
8bc42e4
to
02d7c32
Compare
02d7c32
to
7c8ae5f
Compare
There used to be long ago in git first version that was float C by Sebastiano and Johny but it was replaced by fixed point code when the work proceeded. A float version should be found from ChromeOS sources. The float code and fixed conversion work for this contribution is owned by team Google so we don't review it here. I've used the scripts in tools/test/audio to evaluate objective steady signal audio parameters for DRC and we've not seen difference in team Intel's optimizations. DRC has complex transient signal characteristics, and we currently don't have other but subjective expert listening test method for that. It means for me to listen an album of music with this processing in DUT and try to spot any issues. |
zephyr/CMakeLists.txt
Outdated
zephyr_library_sources_ifdef(CONFIG_MATH_LUT_TRIG_FIXED | ||
${SOF_MATH_PATH}/lut_trig.c | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move this next to the other SOF_MATH_PATH (#8620)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrong PR? There's no change to CMakeLists.txt in that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have reviewed the changes and they appear to be in good standing. I hope that LUT's size is good to others.
src/audio/drc/Kconfig
Outdated
@@ -3,6 +3,7 @@ | |||
config COMP_DRC | |||
bool "Dynamic Range Compressor component" | |||
select CORDIC_FIXED | |||
select MATH_LUT_TRIG_FIXED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can it be MATH_LUT_SINE_FIXED
instead of MATH_LUT_TRIG_FIXED
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I'll change the config name. There is no need for other functions now.
src/math/Kconfig
Outdated
@@ -9,6 +9,16 @@ config CORDIC_FIXED | |||
Select this to enable sin(), cos(), asin(), acos(), | |||
and cexp() functions as 16 bit and 32 bit versions. | |||
|
|||
config MATH_LUT_TRIG_FIXED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can it be MATH_LUT_SINE_FIXED
instead of MATH_LUT_TRIG_FIXED
?
I tried 256 size LUT but the quality dropped a lot. With 512 the quality is a tiny bit better than in 16 bit cordic, so there should be no negative audio quality impact from this. |
This patch adds function sofm_lut_sin_fixed_16b(). It was used earlier in SOF with name sin_fixed() but was remove at add of Cordic trigonometric library. This sine function can be used in hot code parts. Due to look-up table usage it consumes more .bss RAM than cordic version. Signed-off-by: Seppo Ingalsuo <[email protected]>
7c8ae5f
to
13c1725
Compare
/* Q4.28 x Q12.20 -> Q16.48 --> Q16.31*/ | ||
idx_tmp = ((int64_t)w * SOFM_LUT_SINE_C_Q20) >> 17; | ||
idx = (idx_tmp >> 31); /* Shift to Q0 */ | ||
frac = (int32_t)(idx_tmp - (idx << 31)); /* Get fraction Q1.31*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that seems to boil down to
idx_tmp - (idx_tmp & 0xffffffff80000000ULL) ==
idx_tmp & 0x7fffffff
would the compiler optimise that out by itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that but it looks awkward in arithmetic that is not bit-banging to HW registers etc. The shifts have association to Qx.y format. But if that gives MCPS advantage I can change, and comment what happens, I'll try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this modification 69.976 to 69.925 MCPS, not worth it I think, because of bit-and awkwardness here. I think our perf measurement works down to 0.1 MCPS level, below that it's probably noise.
int theta; | ||
|
||
for (theta = 0; theta < 360; ++theta) { | ||
double rad = _M_PI * (theta / 180.0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hopefully the compiler will calculate _M_PI / 180.0
at compile time, but parentheses might actually prevent it from doing that and make it a (redundant) run-time calculation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's cmocka test code so we don't care about performance even if there would be emulated floats.
The test function is based on test function for the cordic sine function. The error tolerance is adjusted to just pass. Signed-off-by: Seppo Ingalsuo <[email protected]>
This change saves in TGL platform about 13 MPCS, from 83 to 70 MCPS. In MTL platform the saving is 12 MCPS, from 46 to 34 MCPS. The .bss RAM usage increases by 1 kB from selecting CONFIG_MATH_LUT_SINE_FIXED. Signed-off-by: Seppo Ingalsuo <[email protected]>
13c1725
to
3d1d453
Compare
@singalsu can you check CI. Thanks ! |
This change saves in TGL platform about 13 MPCS, from 83 to 70 MCPS. In MTL platform the saving is 12 MCPS, from 46 to 34 MCPS.
The .bss RAM usage increases by 1 kB from selecting CONFIG_MATH_LUT_TRIG_FIXED.