You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I got a similar problem on forking with the 'daemonize' crate. The parent process hangs on recvmsg() when the child exits without sending any message. The expected outcome here would be that the recv() returns an `Err(Disconnected)', but it just hangs/blocks.
After some investigation I found following problem:
One does (with a hypothetical fork() that returns Child/Parent)
let (tx, rx) = ipc::channel::<u8>().unwrap();
match fork() {
Child => std::process::exit(0);
Parent => println!("{:?}", rx.recv(rx)) // This will hang instead returning Err(Disconnected)
}
With the result that we now have 2 processes each start with rx and tx open. Calling exit() on the child will close the descriptors in the child, but the sending side (tx) is still open in the parent itself, which causes it to hang in recv().
Actually a manual drop(tx); ... rx.recv() in the parent fixes the problem in this case (as one would do in C). Unfortunately this hack isn't always possible. Imagine your fork() is behind some API that takes an closure that takes ownership of tx.
Keeping 'tx' by cloning to drop it later wont help either. Internally it has an 'Arc' and thus it would leak the refcount only instead closing the fd.
A possible fix (immature, up for discussion) would be to add some facility to get an raw fd type that can be only closed.
let (tx, rx) = ipc::channel::<u8>().unwrap();
let closing_side = tx.closing_fd(); // fn closing_fd(&self) -> ClosingFD
spawn_process(move || { ...may exit... ; tx.send(...); } ) // this consumes tx and creates a child process
// parent continues here
closing_side.close();
let result = rx.recv(); // should not block now
ClosingFd stores only an RawFd and could have the methods
fn close(self) consuming self, closing the fd
fn forget(self) consuming self, but not closing it
may implement Drop .. as close()?
discussion: do these methods need to be unsafe? they don't violate rusts memory safety but there are certain contracts which must be manually ensured. Otherwise operation on an original descriptor may yield EBADF or even worse bad data because the closed fd got reused by the system. The whole issue here is about kernel resources which become out of sync with rusts view of the world because it is unaware about forking.
The whole thing will be needed for the other direction as well (parent sends to child)
The text was updated successfully, but these errors were encountered:
cehteh
changed the title
IpcReceiver hang, problem/discussion
IpcReceiver hang, forking leaking file descriptors, problem/discussion
Sep 17, 2021
Note that this is a breaking change any code that did something like
let (tx, rx) = ipc::channel::<u8>().unwrap();
needs to destructure the return:
let TxRxPair(tx, rx) = ipc::channel::<u8>().unwrap();
With this we can (schematic code)
impl TxRxPair<T> {
pub fn for_forking(self) -> TxRxForForking<T>
}
// ForForking includes the actual endpoint and the crossed ClosingFD
struct TxRxForForking<T>(pub ForForking<T, F>, pub ForForking<T, F>);
struct ForForking<T, F> { open_end: F<T>, closing_end: ClosingFD};
impl ForForking<T,F> {
// to be called after fork, closes the unneeded fd and returns the actual endpoint
pub fn get(self) -> F {
self.closing_end.close();
self.open_end
}
}
altogether this can then be used in a simple way:
// or reverse TxRxForForking(parent, child)
let TxRxForForking(tx_child, rx_parent) = ipc::channel::<u8>().unwrap().for_forking();
match fork() {
Child => {
let tx = rx_child.get();
tx.send(123);
}
Parent => {
let rx = rx_parent.get();
assert_eq!(rx.recv().unwrap(), 123);
}
}
Since the datastructures are now real types other/foreign/extension crates may implement traits that extend these, for example to fork/daemonize a process.
Related to #266 and possibly #201
I got a similar problem on forking with the 'daemonize' crate. The parent process hangs on
recvmsg()
when the child exits without sending any message. The expected outcome here would be that the recv() returns an `Err(Disconnected)', but it just hangs/blocks.After some investigation I found following problem:
One does (with a hypothetical fork() that returns Child/Parent)
With the result that we now have 2 processes each start with rx and tx open. Calling
exit()
on the child will close the descriptors in the child, but the sending side (tx) is still open in the parent itself, which causes it to hang in recv().Actually a manual
drop(tx); ... rx.recv()
in the parent fixes the problem in this case (as one would do in C). Unfortunately this hack isn't always possible. Imagine your fork() is behind some API that takes an closure that takes ownership oftx
.Keeping 'tx' by cloning to drop it later wont help either. Internally it has an 'Arc' and thus it would leak the refcount only instead closing the fd.
A possible fix (immature, up for discussion) would be to add some facility to get an raw fd type that can be only closed.
ClosingFd
stores only an RawFd and could have the methodsfn close(self)
consuming self, closing the fdfn forget(self)
consuming self, but not closing itThe text was updated successfully, but these errors were encountered: