Illegal instruction with a cross-compiled node.js binary on a Freescale Qoriq CPU #30
Comments
I'm fairly certain that this specific CPU doesn't implement the FPU instructions that the V8 runtime (inside of Node.js) expects to be there. This is effectively a duplicate of a couple of bugs against the v8ppc code: https://github.com/andrewlow/v8ppc/issues?state=open. I'll leave it open as a catcher for people trying to get Node on the Synology devices. You do have the option of building a 'simulated' version (I suggest you read through some of the open issues for help on that). It'll run about 6x slower than native, but it will work. Yes, eventually we will address this. It's a matter of priorities. |
I ran into this on my G5 trying to build nodejs as a build dependency for firefox. The illegal instruction I end up halting on is friz. Just out of curiosity, is there any intrinsic value in using the POWER5+ only instruction for rounding doubles to integers as opposed to using the PPC/PPC64 equivalent? I think the older instructions work with POWER as well (they were in POWER4, I think), so it seems that the only reason to use this instruction is to make it incompatible with PowerPC machines. Obviously, I don't think that was the intent, but it seems odd to use this instruction instead. Edit: ahh, from the looks of it, friz takes in an optional bit whether or not to set the state in condition register 1. From what I've read, this condition register has effects on the overall superscalar pipeline. So perhaps the instructions per clock count ends up taking a dip. Still, given that there's hacks in v8 to prevent frim from happening, I think it's just an oversight that friz is being called. |
Hi @KungFuJesus, yeah, using As for Andrew's suggestion, using the simulator would still work, but it's not a performant option. Given that things haven't really changed in this regard since 2014, I'm not sure when we'd be able to prioritize this work... That being said, this is an open-source project after all... contributions are welcome! ;-) |
Now that I need this to even attempt to build firefox (supposedly skia support was fixed for big endian architectures at some point, I'll have to see about this), I'm probably going to make an attempt at least at a patch. A very significant difference I also noticed is that the PPC variations of those instructions are floating point to fixed point instructions, rather than floating point to floating point. I think converting back to fp shouldn't be too bad, but it will definitely need to happen within the compiler's code generation (compiler/ppc/code-generator-ppc.cc) rather than emitting different instructions conditionally at the assembler bits. Effectively, as far as I can tell, frim and friz are being used strictly in rounding instructions rather than to integer conversions. I think this is the only entry point I need to stub in code for (correct me if I'm wrong). It would be really handy if some of this code generation could fall back to LLVM to generate these bits of the JIT, but I suppose that's somewhat suboptimal as well, only allowing for function level abstractions rather than generic blocks. I know, optimal dynarec'ing JITs are hard, but it does irritate me the level of effort it takes to bootstrap something like firefox when 6 years ago there were readily available PPC Linux builds. I appreciate the effort porting v8/nodejs to POWER, that should have been most of the heavy lifting. |
This is my naive attempt at reimplementing each of the four instructions using the PowerPC instructions. I have admittedly never touched an FPU in assembly language (on any CPU) before today. In my own testing, comparing to the musl C library's implementations of floor/round/ceil/trunc, they give me the same results on a POWER9 as the single instructions do, with the exception of values in [-0.0..-0.9]. In these cases, this version loses the signedness of the -0.0 result. I don't know how to wire this up to v8's JIT. Since the flag is there, I know it should look something like:
case kPPC_TruncateDouble:
if (CpuFeatures::IsSupported(FPU)) {
ASSEMBLE_FLOAT_UNOP_RC(friz, MiscField::decode(instr->opcode()));
} else {
/* stuff goes here */
}
but that's as far as I have figured out as of yet. Since pasting this code into a comment on GitHub makes it ambiguous, I release this code under both the BSD-3-Clause and MIT license.
_set_round_mode:
; in = r3 = the bits to set
; out = none
; side-effects = FPSCR [RN] = r3
stdu 1,-128(1)
stw 3,116(1)
li 3,0
stw 3,112(1)
lfd 6,112(1)
mtfsf 0x01,6
ld 1,0(1)
blr
.global frim
.type frim,@function
frim:
stdu 1,-128(1)
mflr 0
std 0,144(1)
; CR7: Saved [RN]
mcrfs 7,7
mfocrf 3,0x01
ori 3,3,0b11
bl _set_round_mode
nop
fctid 1,1
fcfid 1,1
; put it back
mfocrf 3,0x01
bl _set_round_mode
nop
ld 0,144(1)
mtlr 0
ld 1,0(1)
blr
.global frip
.type frip,@function
frip:
stdu 1,-128(1)
mflr 0
std 0,144(1)
mcrfs 7,7
mfocrf 3,0x01
andi. 3,3,0b1100
ori 3,3,0b10
bl _set_round_mode
nop
fctid 1,1
fcfid 1,1
mfocrf 3,0x01
bl _set_round_mode
nop
ld 0,144(1)
mtlr 0
ld 1,0(1)
blr
.global friz
.type friz,@function
friz:
stdu 1,-128(1)
mflr 0
std 0,144(1)
mcrfs 7,7
mfocrf 3,0x01
andi. 3,3,0b1100
ori 3,3,1
bl _set_round_mode
nop
fctid 1,1
fcfid 1,1
mfocrf 3,0x01
bl _set_round_mode
nop
ld 0,144(1)
mtlr 0
ld 1,0(1)
blr
.global frin
.type frin,@function
frin:
stdu 1,-128(1)
mflr 0
std 0,144(1)
mcrfs 7,7
mfocrf 3,0x01
andi. 3,3,0b1100
bl _set_round_mode
nop
fctid 1,1
fcfid 1,1
mfocrf 3,0x01
bl _set_round_mode
nop
ld 0,144(1)
mtlr 0
ld 1,0(1)
blr |
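The sign loss for inputs between -1.0 and -0.0 follows from the integer round trip itself: a 64-bit integer has no negative zero, so converting back to floating point always yields +0.0. Below is a minimal C++ sketch of one way to restore the sign, assuming the fallback is allowed to copy the sign bit from the input. `TruncViaInt64` is a hypothetical name for illustration, not a V8 function, and this models only the truncating (friz-like) case.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of V8): truncate toward zero through an
// integer round trip, in the spirit of a convert-to-integer followed by
// a convert-back-to-double, then repair the sign of a zero result.
double TruncViaInt64(double x) {
  // 2^52: every double at or above this magnitude is already integral.
  // The negated comparison also lets NaN and infinities pass through.
  const double kTwoPow52 = 4503599627370496.0;
  if (!(std::fabs(x) < kTwoPow52)) return x;

  // The round trip: this is where -0.5 becomes integer 0, then +0.0.
  double r = static_cast<double>(static_cast<std::int64_t>(x));

  // Copy the sign of the input onto the result so -0.5 yields -0.0.
  return std::copysign(r, x);
}
```

With this, `TruncViaInt64(-0.5)` returns a zero whose sign bit is set, which can be checked with `std::signbit`. The range guard matters because the integer conversion is not meaningful for NaN, infinities, or values outside the 64-bit range.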
I have a NAS based on a PowerQUICC III and am also getting "Illegal instruction" after npm run. Should one hold out any hope for a future implementation that handles these processor-specific instructions? |
We have not had the opportunity to add the fallback paths that @awilfox developed... we'd also need to address the [-0.0..-0.9] results. My bigger concern is that the latest Node.js and V8 itself support 64-bit only on POWER. 32-bit paths are neither tested nor maintained going forward: even if we integrate the fallback paths for the FPU instructions as above into V8 master, there are no guarantees the rest of V8 will continue to work on 32-bit microcontrollers. |
Thanks for your reply. |
Hi @KungFuJesus @joransiu , I found this bug while searching for "friz" because I just ran into this very issue trying to bootstrap Node.js in Gentoo Linux on a PowerMac G5 (ppc64). I'll work on my own fallback patch based on the comments in this thread. I also wanted to note that ppc64 big-endian is ELFv1, which uses function descriptors, so I had to make a few small local patches (which I'll make a pull request for after some testing) to check for Also, two places in the Node.js 14.15.0 code assume the minimum physical page size is 64KB instead of 4KB for both ppc and ppc64, which caused a debug assertion failure in the first case I noticed (the other I found in code inspection). There's an open bug for at least the nouveau X.org driver (which I'm using on my G5) that it won't work if you change the physical page size from 4KB to 64KB. |
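The page-size assumption described above could be avoided by asking the OS at run time rather than hard-coding 64KB. Here is a minimal sketch, assuming a POSIX system. `QueryPageSize` and `RoundUpToPage` are hypothetical names, not Node.js or V8 functions, and the 4096 fallback is an assumption for the case where the query fails.

```cpp
#include <unistd.h>
#include <cstddef>

// Hypothetical helper: ask the kernel for the physical page size
// instead of assuming 64KB on ppc/ppc64.
std::size_t QueryPageSize() {
  long sz = sysconf(_SC_PAGESIZE);
  // Assumed fallback: use 4KB if the query fails.
  return sz > 0 ? static_cast<std::size_t>(sz) : 4096;
}

// Round n up to the next multiple of page (page must be a power of two).
std::size_t RoundUpToPage(std::size_t n, std::size_t page) {
  return (n + page - 1) & ~(page - 1);
}
```

On a G5 running a 4KB-page kernel, `QueryPageSize()` would be expected to report 4096, so allocations aligned with `RoundUpToPage` would not trip an assertion that presumes 64KB pages.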
It's worth noting that this is not always true: only legacy distributions continue to use ELFv1 on BE. Modern distributions like void-ppc and all musl-based distributions like Adélie always use ELFv2, regardless of endianness. I believe there are plans for Gentoo to publish an ELFv2 stage3 tarball for BE as well. |
That's very interesting to know, thanks! It's very awkward to transition, though, if you can't have both types on the same system disk at the same time. I know there aren't a lot of precompiled binaries for Linux/ppc, but if there's no good multilib solution to have ELFv1 and ELFv2 libraries at the same time, it seems problematic. I'll definitely keep an eye on void-ppc. |
I'm just about to test a patch I've been working on to hopefully tell V8 not to generate the FP rounding instructions if the CPU doesn't support them. First I worked on extending a patch I got from @awilfox to generate a fallback code sequence, but it ended up having branches and looked ugly and I'm not sure it's correct. Then today I realized that the x64 backend sets the flags for Presumably there's a generic backend for the rounding operations that will kick in when I tell V8 that my G5 doesn't have those operations. I'll submit a pull request after I've tested all of this. |
Awesome - compiling this JIT even in emulation mode has confounded my ability to build firefox (not sure why node.js for firefox became a thing) for a while. |
Hello, as a reference I just wanted to point out a similar PR we had on nodejs regarding ppc64 ELFv1 and function descriptors: |
Ok, I have an initial PR that solves the codegen problem by setting the CPU flags to not generate the instructions when it doesn't have them. I'm still testing it, but I'm happy to share and get feedback, especially to make sure it doesn't break the newer systems. I noticed the CPU detection logic wasn't setting the feature flags for POWER9, and it looked to me like it should be setting everything that POWER8 has. nodejs/node#35988 |
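The idea of gating code generation on CPU feature flags, with each newer CPU inheriting everything the previous one has, can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: the `Cpu` enum, the `Feature` bits, and `CanEmitFpRound` are all hypothetical names, and the feature sets are placeholders rather than V8's real flag list.

```cpp
#include <cstdint>

// Hypothetical CPU list; only the three models needed for the example.
enum class Cpu { kPpcG5, kPower8, kPower9 };

// Hypothetical feature bits (placeholders, not V8's actual flags).
enum Feature : std::uint32_t {
  kFpRoundToInt = 1u << 0,  // frim/frip/friz/frin are available
  kPower8Extras = 1u << 1,  // stands in for POWER8-level features
  kPower9Extras = 1u << 2,  // stands in for POWER9-only features
};

// Build each set from the previous one so a newer CPU cannot
// accidentally end up with fewer features than an older one.
constexpr std::uint32_t kG5Set = 0;
constexpr std::uint32_t kPower8Set = kG5Set | kFpRoundToInt | kPower8Extras;
constexpr std::uint32_t kPower9Set = kPower8Set | kPower9Extras;

std::uint32_t FeaturesFor(Cpu cpu) {
  switch (cpu) {
    case Cpu::kPower9: return kPower9Set;
    case Cpu::kPower8: return kPower8Set;
    case Cpu::kPpcG5:  return kG5Set;
  }
  return 0;  // unknown CPU: assume nothing optional is present
}

// The code generator would consult this before emitting friz and friends.
bool CanEmitFpRound(Cpu cpu) {
  return (FeaturesFor(cpu) & kFpRoundToInt) != 0;
}
```

Deriving `kPower9Set` from `kPower8Set` is the design point: it makes the "POWER9 should set everything POWER8 has" property hold by construction instead of relying on each detection branch being kept in sync by hand.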
I wonder what value kMaxRegularHeapObjectSize should take if the patch narrows the page size. Or does it have no effect? |
Promising so far (before vs after patch). I'll do a bit more testing - is there anything simple I should try in the interpreter? |
I've compiled node 0.11.12-release-ppc on Ubuntu 12.04 on Vagrant on OS X. The target device is a Synology DS413 (NAS). I used the proper cross-compilation toolchain by Synology.
Compilation succeeded only without a snapshot.
The illegal instruction (or SIGILL) occurs when no arguments are given, but never if given -v or -h. I have not gone through all possible CLI argument combinations.
With gdb on the NAS I've tried to locate the point where the SIGILL occurs, but have failed to reach the goal. If I ask for the backtrace right after the SIGILL, this is how it looks:
Program received signal SIGILL, Illegal instruction.
0x33c2f5e0 in ?? ()
(gdb) backtrace
#0 0x33c2f5e0 in ?? ()
#1 0x33c2f1f4 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
Running gdb I found that there are lots of things done in the 'v8' namespace C++ code which makes me think the v8 that comes with this node.js fork has not been compiled for this CPU - no surprise - and I haven't succeeded in compiling v8ppc for it yet.
To fix this, should I find out if the problem really is in v8 (how?), and therefore compile v8ppc for this CPU, or do something to node?