Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Calyx] Floating Point Comparisons #7821

Open
jiahanxie353 opened this issue Nov 15, 2024 · 7 comments
Open

[Calyx] Floating Point Comparisons #7821

jiahanxie353 opened this issue Nov 15, 2024 · 7 comments
Assignees

Comments

@jiahanxie353
Copy link
Contributor

I'm going to support the lowering of arith.maximumf to Calyx dialect, for the sake of ReLU operations in neural networks. It's going to be a bit tricky since it involves ordered and unordered comparisons, where there might be NaNs in the latter case.

What do you think about it? If we have the bit vector representation, it's easy to determine NaN by looking at the relevant bits; but how should Calyx determine if it's NaN given that we are passing regular decimal numbers as inputs (like 4.20)? Let's assume it's IEEE754 standard for now.

And I took some inspirations from the existing pass, -arith-expand path, which legalizes operations such as maximumf to the operations that LLVM can support, from:

%2 = arith.maximumf %0, %1 : f32

we get:

%2 = arith.cmpf ugt, %0, %1 : f32
%3 = arith.select %2, %0, %1 : f32
%4 = arith.cmpf uno, %1, %1 : f32
%5 = arith.select %4, %1, %3 : f32

If we have to use similar comparison operators, how does Calyx library ops support unordered comparisons?

Thoughts? @rachitnigam @andrewb1999 @cgyurgyik

@jiahanxie353 jiahanxie353 self-assigned this Nov 15, 2024
@jiahanxie353
Copy link
Contributor Author

Can I tag @ekiwi as well given your expertise? Thanks!

@cgyurgyik
Copy link
Member

Either have a dedicated unordered/ordered comparison (based on some standard, eg ieee754) operations if the hardware supports it, or decompose it into a series of smaller operations

@jiahanxie353
Copy link
Contributor Author

I found this in the Berkeley HardFloat:

module
    compareRecFN#(parameter expWidth, parameter sigWidth) (
        input [(expWidth + sigWidth):0] a,
        input [(expWidth + sigWidth):0] b,
        input signaling,
        output lt,
        output eq,
        output gt,
        output unordered,
        output [4:0] exceptionFlags

How does the following plan sound.

  1. Based on compareRecFN, we create three wrappers in calyx/primitives/float for IEEE754 standards:
    • std_ltFN by extracting output lt and output unordered;
    • std_eqFN by extracting output eq and output unordered;
    • std_gtFN by extracting output gt and output unordered.
  2. The IEEE Standard mandates that equality comparisons ordinarily are quiet, while inequality comparisons ordinarily are signaling. So when we are doing oeq/one/ueq/une, signaling = 0; otherwise, it's 0.
  3. arith::cmpf has the following comparisons and we can compile arith::cmpf to Calyx by the following rows :
    • oeq (ordered equal to): will be lowered to std_eqFN.out && !std_eqFN.unordered;
    • ogt (ordered greater than): will be lowered to std_gtFN.out && !std_gtFN.unordered;
    • oge: will be lowered to !std_ltFN.out && !std_ltFN.unordered;
    • olt: will be lowered to std_ltFN.out && !std_ltFN.unordered;
    • ole: will be lowered to !std_gtFN.out && !std_gtFN.unordered;
    • one: will be lowered to !std_eqFN.out && !std_eqFN.unordered;
    • ord: will be lowered to !std_eqFN.unordered;
    • ueq: will be lowered to std_eqFN.out || std_eqFN.unordered;
    • ugt: will be lowered to std_gtFN.out || std_gtFN.unordered;
    • uge: will be lowered to !std_ltFN.out || std_ltFN.unordered;
    • ult: will be lowered to std_ltFN.out || std_ltFN.unordered;
    • ule: will be lowered to !std_gtFN.out || std_gtFN.unordered;
    • une: will be lowered to !std_eqFN.out || std_eqFN.unordered;
    • uno: will be lowered to std_eqFN.unordered;

@cgyurgyik
Copy link
Member

This seems reasonable to me. Any reason why you prefer three different primitives rather than a single primitive with 3 parameters lt, eq, gt?

Side note: It is surprising that there isn't just two parameters, lt and eq, since gt could be derived from these.

@jiahanxie353
Copy link
Contributor Author

jiahanxie353 commented Nov 16, 2024

Any reason why you prefer three different primitives rather than a single primitive with 3 parameters lt, eq, gt?

I don't have any compelling argument right now, just thought:

  1. it conforms to what Calyx already has for non floating point comparisons.
  2. if we make the comparison primitive uniform, we have to ignore two output ports every time in CIRCT (say we are doing ueq, we only take eq port and unordered port and ignore lt, gt). Not sure what side effect, if any, those dangling ports will bring us during lowering..

So I thought I should bring it up and discuss with you all regarding which decision may be better to take before making an actual PR.

It is surprising that there isn't just two parameters, lt and eq, since gt could be derived from these.

yeah..

@cgyurgyik
Copy link
Member

say we are doing ueq, we only take eq port and unordered port and ignore lt, gt

My understanding is we could just set these to 1'd0?

I'm not saying it is the right choice, but a reason to use the same primitive for all of these is resource sharing in the compiler. At least currently, it cannot infer that std_eq and std_lt are implemented using the same primitive.

@jiahanxie353
Copy link
Contributor Author

but a reason to use the same primitive for all of these is resource sharing in the compiler

I agree, I'm drafting a PR at the moment that only has a single CompareFOp instead of multiple

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants