-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
safe vs. unsafe foreign imports #34
Comments
this sounds like a valid/good idea! What should be the default is a more complex question, but that aside, both styles should be available! |
Unsafe foreign imports which can block for unpredictable amounts of time cause performance problems that only emerge when scaling to multiple cores, because they delay the GC sync. This is a really annoying problem if it happens to you, because it's almost impossible to diagnose, and if it happens due to an unsafe call in a library then it's also really hard to fix. So the safe versions should be the default, and if we have them at all, the unsafe versions should be named |
Agreed, if the call would ever block (and that includes most filesystem functions) that means you want "safe". |
Glad to see people agree :) I'll try and prepare a patch then. |
Even worse, Let me also point out that for many of the functions in |
@Lustin are you going to provide such a patch soon? I think we should try to get this as |
@hvr - yep, now that the weekend has arrived I will - I don't quite have time during the week, sorry :) |
So, question time. I've looked at the unix library code and also a bit at base (thanks @redneb for the hint). 'unix' alone has around 180 unsafe 'foreign imports'. We have a couple of ways to change things.
Given these options, I would propose going with the first one - no choice, but rather an informed switch from unsafe to safe, case-by-case. Howver, I've run some more benchmarks and, on the relatively slow laptop that I tested, criterion gives me:
These kind of slowdowns range from close to noise to very significant, so I'm not sure anymore if we can afford not to give the option of switching, if we find a function call that is fast but could still block. Thoughts on which way to go? So far I've narrowed down the list of potential "dangerous" ffi imports in 'unix' to around 80, mostly related to files, but not only. But still it's a lot of functions that would need to be duplicated… |
I'd vote for providing |
One other option:
|
@cartazio I'm against (on a related note, at some point we might want to add @simonmar the main problem I see with Fwiw, I like the idea of a manual cabal flag to switch to the old |
Fyi, as this isn't a new regression we'll probably not target GHC 7.10.1 for this fix |
will making all the current calls use safe be part of 7.10? |
@hvr: makes sense; also, we shouldn't hurry to fix this and introduce an API that is painful to correct later. The only thing I don't like about the cabal flag is that this will be a bit cumbersome to fix in "base" (which also has this kind of unsafe imports). I mean that I'd be surprised if anyone actually ever rebuilds base in their install, but maybe it does happen. On the other hand, the flag keeps the API clean and simple. In any case, I'll write the cabal patch - it might take a bit to make sure I mark the correct imports safe; we'll see how it goes from there. Thanks for the feeback! |
@cartazio while making all (potentially) blocking calls Maybe we can consider it for GHC 7.10.2 if we have some benchmark data for syscall-heavy real-world (test)programs by then... |
@cartazio the flag approach wouldn't allow that obviously, it's more about being able to turn it off globally and revert to the old unsafe behavior for special use cases (like benchmarking, comparing old/new behaviour, and/or programs where you are confident it won't make any problems) |
The point of |
@simonmar but if we have a separate package anyway why the hassle w/ the flag? |
@hvr the flag means that the difference between the |
@simonmar ...actually two lines (assuming you don't use {;}-syntax, and have a very long single-line .cabal file =) |
Surely just a |
Ah, the package name of course :) |
Would someone wanting to use both need to use package import syntax?
|
yes, as they'd have conflicting module names if the only change is a different |
Would it be possible to come to a consensus on this issue? Is the way forward to make a new |
@glguy - not quite. Not all calls need to be marked safe. My plan, before I dropped the ball on this one, was:
Thanks for bumping this up! |
I noticed this too while preparing a patch to provide The plan above sounds reasonable to me. |
Since connectNamedPipe is basically a wait/sleep function, we need to use *safe* foreign call, because they delay the GC sync. For example see Simon Marlow's comment: haskell/unix#34 (comment)
Fun. So even GHC itself has issues, not just the unix library? I had back in 2015 an in-progress patch, but that's rotten code by now, and I hadn't had the need/motivation to push this further so far. |
Hey Iustin, do you have the old patch somewhere? I may be able to work on this in the future an it would be interesting to see what you did. |
Hey, so inbox cleanup time: I looked for my development tree and I cannot find it anywere, not even in backups. So I think you (or anyone else) should start from scratch… |
Anyone feel like fixing this? We just ran into a real-world case where |
I'm personally very busy currently but FP Complete would be happy to do it for Facebook to fix this (in |
For the record, I found my old patch in a laptop backup :), I can post it for reference (it doesn't merge at all, of course). |
@simonmar On https://gitlab.haskell.org/ghc/ghc/issues/13296#note_132146 you mentioned work on making Is this something that should still be done along with changes in here and GHC to address the performance reservations you had back then, or do you think we should switch |
What’s stopping us from moving the known problematic calls to safe? Their
overhead will be on the order of a quarter microsecond. But that’s fine
…On Tue, Nov 19, 2019 at 7:29 AM Niklas Hambüchen ***@***.***> wrote:
@simonmar <https://github.com/simonmar> On
https://gitlab.haskell.org/ghc/ghc/issues/13296#note_132146 you mentioned
work on making safe syscalls faster (https://phabricator.haskell.org/D1466
- *RFC: foreign import ccall nonblocking*).
Is this something that should still be done along with changes in here and
GHC to address the performance reservations you had back then, or do you
think we should switch unix (or more) to safe even if that work remains
for later?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#34?email_source=notifications&email_token=AAABBQSFBSGFWE6LXY3AZKLQUPLYZA5CNFSM4AZWIHN2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEEOAXOQ#issuecomment-555486138>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAABBQTCMQYB53LVXBQBXXLQUPLYZANCNFSM4AZWIHNQ>
.
|
@cartazio - the conclusion back in 2015 was that the slowdown was 10× between safe and unsafe calls; for some workloads, this might not be a valid trade-off. In any case, I'm attaching the WIP patch I had back then. The work is not very complex, but rather tedious - look at all call sites, and decide (by reading man pages and such) whether that's a safe, unsafe, or interruptible call, and annotate it as such. |
I suspect for the majority of calls in I think it would be nice to work on that patch some more and see if it's worthwhile, but it's probably not something I'll get to unless the performance of safe calls ends up on my critical path somehow. Carter said:
If it is 200ns or so that would clearly be fine, I'm not sure I completely believe that figure though. A safe call can trigger a context switch, but multiple safe calls will still only trigger a single context switch so it doesn't give you a genuine idea of the overhead to time 10^6 safe calls and then divide the time by 10^6. |
The mistake I made back when I did try to fix this was to try to send a single big patch. If flag at build time is (still) OK, then I can send small patches, converting sets of functions over. Another problem is that I have no idea how this could be tested for correctness at all. |
@simonmar - regarding context switches - since many of these calls end up as syscalls, they will involve a context switch anyway, no? I wonder, for example, if it's ever worth for an |
I believe system calls are more efficient than context switches.
The answer here is to measure it :) |
I haven't measured it myself yet, but this answer https://stackoverflow.com/questions/32748530/on-linux-is-access-faster-than-stat/32749760#32749760 suggests that cached syscalls of the If the overhead turns out negligible as we hope for, my favourite outcome would be to use |
I don't think that's a good idea. First, there is the Even with that aside, for things which don't cause a syscall but instead are pure userspace, using That's assuming we can determine what a given libc function does in a reliable way. If not, I agree, safe is the right answer. Hmm, more convinced now :) |
Put another way, I expect proudly probably 60% interruptible calls, 35% safe calls, 5% unsafe (e.g getpid* family). |
By all means If we find it's easy to do it in the same go, I'm for it (I just found in the past that checking whether
Sure, we're still talking about your scope from the issue description:
|
Ah I also just noticed I can make a plug here: Regarding If we don't want to add tests in |
Hi,
I'm a bit surprised that various I/O (in the filesystem sense, not Haskell
IO
) functions are imported asunsafe
. While they won't call back to Haskell, depeding on which filesystem they talk to, they can take arbitrary amounts of time, blocking an execution context for this entire duration.Would it make sense to also have "safe" version of the functions, for the case when performance is less of an issue compared to safety? Would it make sense to at least document that they are imported unsafe?
For the record, I've run a simple benchmark comparing "chmod" performance on various local and remote filesystems and the difference between unsafe and safe is between 10% slower (nfs) to 25% slower (local real filesystems) to 30% (tmpfs). So there is a performance hit indeed.
The text was updated successfully, but these errors were encountered: