Can't restart from floating point errors #961

Closed
Bike opened this issue Mar 30, 2020 · 16 comments
Comments

@Bike
Member

Bike commented Mar 30, 2020

For example,

(handler-bind ((arithmetic-error #'abort))
  (restart-case
      (integer-decode-float #.ext:short-float-positive-infinity)
    (abort ())))

will signal an out of extent unwind. Something weird with signal handling, probably.

@Bike
Member Author

Bike commented Apr 2, 2020

The backtrace as obtained by btcl or c_btcl has the frame for the function that establishes the restart, but the unwind still fails. I don't know how that's possible. Is the stack being copied? I really have no idea.

@Bike
Member Author

Bike commented Jan 8, 2021

Seems to be just crashing now.

Bike added the crash label Jan 8, 2021
@kpoeck
Contributor

kpoeck commented Feb 2, 2021

With the change from longjmp to _longjmp, I again get the following (I also added an fprintf in handle_fpe to see whether the handler is called):

 (handler-bind ((arithmetic-error #'abort))
  (restart-case
      (integer-decode-float #.ext:short-float-positive-infinity)
    (abort ())))
Enter handle_fpe Signo: 8 Errno:0 Code:0

Condition of type: OUT-OF-EXTENT-UNWIND
Attempted to return or go to an expired block or tagbody tag.
Available restarts:
(use :r1 to invoke restart 1, etc.)

1. (ABORT) ABORT
2. (RESTART-TOPLEVEL) Go back to Top-Level REPL.
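
For readers unfamiliar with the longjmp/_longjmp distinction mentioned above: on BSD-derived systems such as macOS, longjmp also restores the signal mask saved by setjmp, while _longjmp leaves the mask alone. POSIX makes the choice explicit with sigsetjmp/siglongjmp. The following is a minimal sketch of leaving a SIGFPE handler by a non-local jump, assuming a POSIX system. It is not Clasp's handle_fpe; every name in it (on_fpe, run_with_fpe_guard, trigger_fpe, no_fpe) is hypothetical, and raise(SIGFPE) stands in for a real trapping floating point operation. Whether mask handling played any part in the behavior reported in this issue is not established by the thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Hypothetical illustration only; not taken from the Clasp sources. */
static sigjmp_buf recovery_point;

static void on_fpe(int signo) {
    (void)signo;
    /* Jump back to the sigsetjmp call. Because sigsetjmp was given a
       nonzero savemask, the signal mask in effect there is restored,
       so SIGFPE is no longer blocked after the jump. */
    siglongjmp(recovery_point, 1);
}

/* Runs fn. Returns 1 if SIGFPE interrupted it, 0 if it returned
   normally, and -1 if the handler could not be installed. */
int run_with_fpe_guard(void (*fn)(void)) {
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fpe;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGFPE, &sa, &old) != 0)
        return -1;

    int caught;
    if (sigsetjmp(recovery_point, 1) == 0) {
        fn();        /* normal path */
        caught = 0;
    } else {
        caught = 1;  /* reached via siglongjmp from on_fpe */
    }

    sigaction(SIGFPE, &old, NULL);  /* put the previous handler back */
    return caught;
}

/* Stand-in for an operation that traps. */
void trigger_fpe(void) { raise(SIGFPE); }

/* An operation that does not trap. */
void no_fpe(void) {}
```

Calling run_with_fpe_guard(trigger_fpe) twice in a row is a useful check: if the jump out of the handler did not restore the mask, SIGFPE would still be blocked on the second call and the handler would not run.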

Bike added a commit that referenced this issue Oct 21, 2021
@kpoeck
Contributor

kpoeck commented Oct 22, 2021

This is now fixed, isn't it?

@Bike
Member Author

Bike commented Oct 22, 2021

With some of the optimizations I've been doing I've gotten a crash as Clasp doesn't seem to recognize its own condition signaled from the SIGFPE handler. Which isn't quite the same, but I'm in general still pretty suspicious of how we handle traps.

@Bike
Member Author

Bike commented Oct 29, 2021

To elaborate: on Linux I do

(defun foo (x y)
  (declare (single-float x y))
  (ext:with-float-traps-masked () (/ x y)))
(foo 1.0 0.0)

and it crashes while trying to print a backtrace for the division-by-zero.

Note that you can't just do (/ 1.0 0.0), because LLVM will constant fold that into a NaN. (Which is something else we might need to rethink.)

Bike added a commit that referenced this issue Oct 29, 2021
This causes code with unmasked FP exceptions to try to signal,
which can crash per #961. This is a problem, but I have just
commented out the tests for now.
@kpoeck
Contributor

kpoeck commented Oct 30, 2021

If I try this on main on macOS, I get:

COMMON-LISP-USER> (defun foo (x y)
  (declare (single-float x y))
  (ext:with-float-traps-masked () (/ x y)))
; caught ERROR:
;   BUG: Bad rtype (SINGLE-FLOAT)
;     at unknown location
; 

Condition of type: SIMPLE-ERROR
BUG: Bad rtype (SINGLE-FLOAT)
Available restarts:
(use :r1 to invoke restart 1, etc.)

1. (RESTART-TOPLEVEL) Go back to Top-Level REPL.


(CLASP-CLEAVIR::INITIALIZE-IBLOCK-TRANSLATION #<IBLOCK NIL>)
COMMON-LISP-USER>> :b

   4: (CLASP-CLEAVIR::INITIALIZE-IBLOCK-TRANSLATION #<IBLOCK NIL>)
   6: ((LAMBDA (CLASP-CLEAVIR::IB)) #<IBLOCK NIL>)
   7: (CLEAVIR-BIR:MAP-IBLOCKS #<FUNCTION (LAMBDA (CLASP-CLEAVIR::IB))> #<FUNCTION FOO>)
   9: ((LAMBDA NIL))
  11: (LAMBDA (&OPTIONAL CORE:FILE-SCOPE COMPILER::FILE-HANDLE &REST #:G64319))
  12: (COMPILER::DO-DBG-FUNCTION #<FUNCTION (LAMBDA NIL)> 999909 #<FUNCTION-TYPE { i8*, i64 } (i8*, i64, i8*, i8*, i8*, i8*, ...)> #<FUNCTION FOO^COMMON-LISP-USER^FN^^-lcl>)
  14: (CLASP-CLEAVIR::LAYOUT-MAIN-FUNCTION #<FUNCTION FOO> FOO #<CLASP-CLEAVIR::ABI-X86-64>)
  16: (CLASP-CLEAVIR::LAYOUT-PROCEDURE #<FUNCTION FOO> FOO #<CLASP-CLEAVIR::ABI-X86-64> :LINKAGE LLVM-SYS:INTERNAL-LINKAGE)
  17: (CLASP-CLEAVIR::LAYOUT-MODULE #<CLEAVIR-BIR:MODULE> #<CLASP-CLEAVIR::ABI-X86-64> :LINKAGE LLVM-SYS:INTERNAL-LINKAGE)
  19: (CLASP-CLEAVIR::TRANSLATE #<FUNCTION (LAMBDA NIL)> :ABI #<CLASP-CLEAVIR::ABI-X86-64> :LINKAGE LLVM-SYS:INTERNAL-LINKAGE)
  21: (CLASP-CLEAVIR::TRANSLATE-AST #<FUNCTION-AST (LAMBDA NIL) NIL @0x12026dc01>)
  23: ((LAMBDA NIL))
  25: (LITERAL::DO-RTV #<FUNCTION (LAMBDA NIL)>)
  27: (FLET #:G65215)
  28: (CLASP-CLEAVIR::BIR-COMPILE-CST #<CONS-CST raw: (LAMBDA NIL (PROGN #'(LAMBDA (X Y) (DECLARE (CORE:LAMBDA-NAME FOO) (TYPE SINGLE-FLOAT Y) (TYPE SINGLE-FLOAT X)) (BLOCK FOO (EXT:WITH-FLOAT-TRAPS-MASKED NIL (/ X Y)))))) @0x12290a331> NIL "APP-FASL:CCLASP-BOEHM-IMAGE.FASP.NEWEST" :LINKAGE LLVM-SYS:INTERNAL-LINKAGE :NAME NIL)
  30: ((LAMBDA NIL))
  32: (COMPILER::CALL-WITH-COMPILATION-RESULTS #<FUNCTION (LAMBDA NIL)>)
  34: (COMPILER::COMPILE-WITH-HOOK #<FUNCTION CLASP-CLEAVIR::BIR-COMPILE-CST> #<CONS-CST raw: (LAMBDA NIL (PROGN #'(LAMBDA (X Y) (DECLARE (CORE:LAMBDA-NAME FOO) (TYPE SINGLE-FLOAT Y) (TYPE SINGLE-FLOAT X)) (BLOCK FOO (EXT:WITH-FLOAT-TRAPS-MASKED NIL (/ X Y)))))) @0x12290a331> NIL "APP-FASL:CCLASP-BOEHM-IMAGE.FASP.NEWEST" :LINKAGE LLVM-SYS:INTERNAL-LINKAGE :NAME NIL)
  35: ((LAMBDA NIL))
  37: (COMPILER::DO-COMPILATION-UNIT #<FUNCTION (LAMBDA NIL)>)
  39: (COMPILER:COMPILE-IN-ENV #<CONS-CST raw: (LAMBDA NIL (PROGN #'(LAMBDA (X Y) (DECLARE (CORE:LAMBDA-NAME FOO) (TYPE SINGLE-FLOAT Y) (TYPE SINGLE-FLOAT X)) (BLOCK FOO (EXT:WITH-FLOAT-TRAPS-MASKED NIL (/ X Y)))))) @0x12290a331> NIL #<FUNCTION CLASP-CLEAVIR::BIR-COMPILE-CST> LLVM-SYS:INTERNAL-LINKAGE)
  41: (CLASP-CLEAVIR::BIR-COMPILE-CST-IN-ENV #<CONS-CST raw: (LAMBDA NIL (PROGN #'(LAMBDA (X Y) (DECLARE (CORE:LAMBDA-NAME FOO) (TYPE SINGLE-FLOAT Y) (TYPE SINGLE-FLOAT X)) (BLOCK FOO (EXT:WITH-FLOAT-TRAPS-MASKED NIL (/ X Y)))))) @0x12290a331> NIL)
...

Slightly puzzled.

@kpoeck
Contributor

kpoeck commented Oct 30, 2021

I did a distclean and a full rebuild beforehand.

@Bike
Member Author

Bike commented Oct 30, 2021

Odd. That should be fixed by 867b986.

@kpoeck
Contributor

kpoeck commented Oct 30, 2021

I will check whether I have the commit you mentioned.

@kpoeck
Contributor

kpoeck commented Oct 30, 2021

And with the very latest commit it works fine. Sorry for the noise.

@Bike
Member Author

Bike commented Oct 30, 2021

Do you still get the crash?

@kpoeck
Contributor

kpoeck commented Oct 31, 2021

Everything seems to work fine:

COMMON-LISP-USER>> (defun foo (x y)
  (declare (single-float x y))
  (ext:with-float-traps-masked () (/ x y)))

FOO
COMMON-LISP-USER>> (foo 1.0 0.0)

Debugger received error of type: DIVISION-BY-ZERO
Condition of type DIVISION-BY-ZERO was signaled.
Error flushed.
COMMON-LISP-USER>> :r1

COMMON-LISP-USER> (handler-bind ((arithmetic-error #'abort))
  (restart-case
      (integer-decode-float #.ext:short-float-positive-infinity)
    (abort ())))

Condition of type: SIMPLE-PROGRAM-ERROR
Can't decode NaN or infinity inf
Available restarts:
(use :r1 to invoke restart 1, etc.)

1. (ABORT) ABORT
2. (RESTART-TOPLEVEL) Go back to Top-Level REPL.


(CORE:SIGNAL-SIMPLE-ERROR CORE:SIMPLE-PROGRAM-ERROR NIL "~a" ("Can't decode NaN or infinity inf"))
COMMON-LISP-USER>> :r1

NIL

@Bike
Member Author

Bike commented Oct 31, 2021

Huh. Is this on Linux or Mac or what?

@kpoeck
Contributor

kpoeck commented Oct 31, 2021

On Mac.

@yitzchak
Member

Fixed by #1619

COMMON-LISP-USER> (handler-bind ((arithmetic-error #'abort))
  (restart-case
      (integer-decode-float #.ext:short-float-positive-infinity)
    (abort ())))

Condition of type: SIMPLE-PROGRAM-ERROR
Can't decode NaN or infinity inf
Available restarts:
(use :r1 to invoke restart 1, etc.)

1. (ABORT) ABORT
2. (RESTART-TOPLEVEL) Go back to Top-Level REPL.

ERROR
COMMON-LISP-USER>> :r2
COMMON-LISP-USER> 
