abs implementation in rvv #399

Open
Ag-Cu opened this issue Jun 13, 2024 · 3 comments
Labels
good first issue (Good for newcomers), help wanted (Extra attention is needed)

Comments

Ag-Cu commented Jun 13, 2024

I see that your abs implementation for RVV compiles to this:

vabs_s16:                               # @vabs_s16
        vsetivli        zero, 4, e16, m1, ta, ma
        vsra.vi v9, v8, 15
        vxor.vv v8, v8, v9
        vsub.vv v8, v8, v9
        ret

Why don't we just use two instructions, vrsub and vmax, to implement abs?
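To make the comparison concrete, here is a minimal scalar sketch in C that models one 16-bit lane of each sequence and counts the inputs on which they disagree. It assumes the usual RVV semantics (vsub and vrsub wrap modulo 2^16, vmax is a signed maximum). The function names `abs_sra_xor_sub`, `abs_rsub_max`, `wrap_i16`, and `count_mismatches` are made up for illustration and are not part of the project.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret the low 16 bits of u as a signed 16-bit value (two's complement). */
int16_t wrap_i16(uint32_t u) {
    u &= 0xFFFFu;
    return (int16_t)(u < 0x8000u ? (int32_t)u : (int32_t)u - 0x10000);
}

/* One lane of the current sequence: vsra.vi / vxor.vv / vsub.vv. */
int16_t abs_sra_xor_sub(int16_t x) {
    uint32_t mask = (x < 0) ? 0xFFFFu : 0u;        /* vsra.vi v9, v8, 15 */
    uint32_t t = ((uint32_t)(uint16_t)x) ^ mask;   /* vxor.vv v8, v8, v9 */
    return wrap_i16(t - mask);                     /* vsub.vv v8, v8, v9 */
}

/* One lane of the proposed sequence: vrsub.vi / vmax.vv. */
int16_t abs_rsub_max(int16_t x) {
    int16_t neg = wrap_i16(0u - (uint32_t)(uint16_t)x);  /* vrsub: 0 - x, wrapping */
    return (x > neg) ? x : neg;                          /* vmax: signed maximum */
}

/* Count the int16_t inputs on which the two models disagree. */
int count_mismatches(void) {
    int mismatches = 0;
    for (int32_t v = INT16_MIN; v <= INT16_MAX; ++v) {
        if (abs_sra_xor_sub((int16_t)v) != abs_rsub_max((int16_t)v)) {
            ++mismatches;
        }
    }
    return mismatches;
}

/* Sanity check on an ordinary negative input. */
void self_check(void) {
    assert(abs_sra_xor_sub(-5) == 5);
    assert(abs_rsub_max(-5) == 5);
}
```

Under this model, calling `count_mismatches()` is a quick way to check whether the two sequences are interchangeable for every 16-bit input, including `INT16_MIN`.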

Owner

howjmay commented Jun 15, 2024

Hi @Ag-Cu! Could you please provide an implementation for the idea you mentioned?

Author

Ag-Cu commented Jun 24, 2024

Sorry, I only just saw this. Sure, here is my implementation in handwritten asm, and it works well:

.macro abs d0, s0, t0
    vrsub.vi    \t0, \s0, 0
    vmax.vv     \d0, \s0, \t0
.endm

I am not sure which version performs better, but this one does use fewer instructions.
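For reference, the same two-instruction idea could be written with the RVV C intrinsics roughly as follows. This is only a sketch: the function name `abs_s16_buffer` is hypothetical, the intrinsic spellings assume the `__riscv_`-prefixed intrinsics API, and the scalar fallback exists only so the snippet also builds on a machine without the vector extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

/* Hypothetical helper: dst[i] = |src[i]|, with INT16_MIN wrapping to itself. */
void abs_s16_buffer(int16_t *dst, const int16_t *src, size_t n) {
    assert(dst != NULL && src != NULL);
#if defined(__riscv_vector)
    while (n > 0) {
        size_t vl = __riscv_vsetvl_e16m1(n);                 /* elements this pass */
        vint16m1_t v = __riscv_vle16_v_i16m1(src, vl);
        vint16m1_t neg = __riscv_vrsub_vx_i16m1(v, 0, vl);   /* 0 - v, wrapping */
        vint16m1_t r = __riscv_vmax_vv_i16m1(v, neg, vl);    /* signed max(v, -v) */
        __riscv_vse16_v_i16m1(dst, r, vl);
        src += vl;
        dst += vl;
        n -= vl;
    }
#else
    /* Scalar fallback that mirrors the wrapping behaviour of the vector path. */
    for (size_t i = 0; i < n; ++i) {
        int16_t x = src[i];
        dst[i] = (x == INT16_MIN) ? INT16_MIN : (int16_t)(x < 0 ? -x : x);
    }
#endif
}
```

Whether the compiler emits `vrsub.vi` or an equivalent form for the negation is up to the toolchain, so the generated assembly is worth checking before comparing performance.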

Owner

howjmay commented Jun 24, 2024

I think it should be good. My only concern right now is how this implementation behaves when overflow happens. In other words, for abs_s8, what is the result when -128 is given?
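The -128 case can be traced by hand with a small scalar model of one 8-bit lane. The sketch below assumes that vrsub and vsub wrap modulo 2^8 and that vmax compares as signed; the names `wrap_i8`, `abs8_sra_xor_sub`, and `abs8_rsub_max` are illustrative only. In this model, negating -128 wraps back to -128, so the signed maximum of the two is still -128, and the three-instruction sequence arrives at the same value (127 minus -1 wraps to -128). That would match a non-saturating abs, whereas a saturating variant would need to return 127 instead.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret the low 8 bits of u as a signed 8-bit value (two's complement). */
int8_t wrap_i8(uint32_t u) {
    u &= 0xFFu;
    return (int8_t)(u < 0x80u ? (int32_t)u : (int32_t)u - 0x100);
}

/* One 8-bit lane of vsra.vi (shift by 7) / vxor.vv / vsub.vv. */
int8_t abs8_sra_xor_sub(int8_t x) {
    uint32_t mask = (x < 0) ? 0xFFu : 0u;         /* all ones for negative lanes */
    uint32_t t = ((uint32_t)(uint8_t)x) ^ mask;   /* flips every bit when negative */
    return wrap_i8(t - mask);                     /* subtracting -1 adds one, wrapping */
}

/* One 8-bit lane of vrsub.vi / vmax.vv. */
int8_t abs8_rsub_max(int8_t x) {
    int8_t neg = wrap_i8(0u - (uint32_t)(uint8_t)x);  /* 0 - x, wrapping */
    return (x > neg) ? x : neg;                       /* signed maximum */
}

/* Both models leave INT8_MIN unchanged and agree on its nearest neighbour. */
void check_edge_cases(void) {
    assert(abs8_sra_xor_sub(INT8_MIN) == INT8_MIN);
    assert(abs8_rsub_max(INT8_MIN) == INT8_MIN);
    assert(abs8_rsub_max(-127) == 127);
}
```

If saturation were wanted, the proposed sequence would need an extra step, since neither version clamps the result on its own.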

howjmay added the help wanted and good first issue labels on Jul 26, 2024

2 participants