Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs: Spawn new write task while buf is not empty #6330

Closed
wants to merge 1 commit into from

Conversation

Xuanwo
Copy link

@Xuanwo Xuanwo commented Feb 7, 2024

Motivation

Fix #6325

tokio::fs could have data loss if flush failed.

Solution

This PR changed the poll_flush to rebuild the write buf is still empty.

To make it possible, I store an Arc<StdFile> in State. This change also remove the need to close file for most IO operations (alought it doesn't has big impact on the performance).

@Xuanwo Xuanwo changed the title fix(fs): Update last write err if last write job failed fs: Spawn new write task while buf is not empty Feb 7, 2024
@Xuanwo
Copy link
Author

Xuanwo commented Feb 7, 2024

The current implementation causes into_std to break:

pub async fn into_std(mut self) -> StdFile {
    self.inner.get_mut().complete_inflight().await;
    Arc::try_unwrap(self.std).expect("Arc::try_unwrap failed")
}

I'm still finding a way to address it.

@Darksonn Darksonn added A-tokio Area: The main tokio crate M-fs Module: tokio/fs labels Feb 7, 2024
@maminrayej
Copy link
Member

maminrayej commented Feb 7, 2024

I'm having a little bit of difficulty understanding the solution here. Here is my understanding:

A call to poll_write will copy the contents of the source buffer into its internal buffer. In background, tokio will spawn a task that will essentially write the contents of the internal buffer to the underlying std file. Finally it will return the number of bytes copied to its internal buffer.

The problem is the code that writes the buffer into the std file:

pub(crate) fn write_to<T: Write>(&mut self, wr: &mut T) -> io::Result<()> {
    assert_eq!(self.pos, 0);

    // `write_all` already ignores interrupts
    let res = wr.write_all(&self.buf);
    self.buf.clear();
    res
}

This function will clear the contents of the buffer regardless of whether write_all was successful or not. So when the call to write_all fails, users will observe the error on the next call to poll_*.

So the returned number of bytes is correct. The error returned from flush is also correct. But the combination of these two cannot help the user to recover from the error since it's unclear how many bytes were actually written to disk. I guess a handwritten implementation of write_all that returns the written number of bytes could help the situation since now flush can error out with the partially flushed bytes. But that requires a custom error and a down cast perhaps.

write_to_partial will return if the buffer is empty. But I don't see how the contents of the buffer is preserved in case of the write_all failure. Can you elaborate on how this PR fixes this issue please?

@Darksonn
Copy link
Contributor

Darksonn commented Feb 8, 2024

It's possible that we may have to say that recovering from errors with tokio::fs::File is out of scope. The exact correspondence between tokio::fs::File and the blocking file IO it performs on the background is complicated, and anyone who needs specialized control over the file IO should use std::fs from within spawn_blocking manually.

@Xuanwo
Copy link
Author

Xuanwo commented Feb 20, 2024

write_to_partial will return if the buffer is empty. But I don't see how the contents of the buffer is preserved in case of the write_all failure. Can you elaborate on how this PR fixes this issue please?

write_to_partial updates the internal pos and returns the correct write size upon success, but does nothing if an error occurs. The buffer is not changed or dropped, so it's safe for users to call write again.

I'm not entirely sure if this is the best solution. I'm open to exploring other options.

It's possible that we may have to say that recovering from errors with tokio::fs::File is out of scope.

I understand your point that handling recovery from errors with tokio::fs::File is beyond our scope. However, I believe the issue here revolves more around returning the correct error rather than obscuring it by discarding the internal buffer.

The example I provided in the issue relates to a disk being full, but this situation could arise with any error occurring during writing.

@Darksonn
Copy link
Contributor

For the most part, a tokio::fs::File should have similar behavior to a BufWriter<std::fs::File>. What does a BufWriter do on error?

@Xuanwo
Copy link
Author

Xuanwo commented Feb 20, 2024

For the most part, a tokio::fs::File should have similar behavior to a BufWriter<std::fs::File>. What does a BufWriter do on error?

BufWrtier will return error during flush:

let mut f = std::fs::OpenOptions::new()
    .create(true)
    .write(true)
    .open("/tmp/test_dir/test")
    .unwrap();

// Large than write size
let mut f = BufWriter::with_capacity(4 * 1024 *1024, f);

let size = 512 * 1024 + 1;
let mut bs = vec![0; size];
thread_rng().fill_bytes(&mut bs);

let n = f.write(&bs)?;
dbg!(&n)

let res = f.flush();
dbg!(&res);

let res = f.flush();
dbg!(&res);

Output:

[src/main.rs:61] &n = 524289
[src/main.rs:65] &res = Err(
    Os {
        code: 28,
        kind: StorageFull,
        message: "No space left on device",
    },
)
[src/main.rs:69] &res = Err(
    Os {
        code: 28,
        kind: StorageFull,
        message: "No space left on device",
    },
)

Internally, BufWrtier keep tracking the buffer size while calling flush. They has a BufGuard to track the write size.

https://doc.rust-lang.org/src/std/io/buffered/bufwriter.rs.html#180-224

@carllerche
Copy link
Member

I generally agree w/ @Darksonn's take. Tokio's behavior should more or less follow BufWriter semantics.

@carllerche
Copy link
Member

It sounds like the issue is that the buffer is cleared after write_all even if there is an error. It sounds like the solution is to only clear the amount of data that was written (or keep a cursor), then return the error from flush. The caller can rectify the issue, then call flush again and tokio can resume writing from the point at which the error happened.

I think that should fix the issue?

@Xuanwo
Copy link
Author

Xuanwo commented Feb 28, 2024

It sounds like the solution is to only clear the amount of data that was written (or keep a cursor), then return the error from flush. The caller can rectify the issue, then call flush again and tokio can resume writing from the point at which the error happened.

Yep. I believe this should work.

@Xuanwo
Copy link
Author

Xuanwo commented Mar 1, 2024

I think that should fix the issue?

There is another problem: Is it necessary to submit a new write task when the user calls flush if the last write resulted in an error?

@Darksonn
Copy link
Contributor

Darksonn commented Mar 6, 2024

So, what I think should happen is that when an operation fails, we don't throw away the buffer. Then, we retry the operation the next time that the user flushes.

Why do you need to add the Arc around the std file?

@Xuanwo
Copy link
Author

Xuanwo commented Mar 6, 2024

Then, we retry the operation the next time that the user flushes.

I need the std file to create a new write task. Current poll_flush can't be retried. Do you have some suggestions?

@Darksonn
Copy link
Contributor

Darksonn commented Mar 6, 2024

You should be able to add an Arc<File> argument to Inner::poll_flush.

@Xuanwo
Copy link
Author

Xuanwo commented Mar 6, 2024

You should be able to add an Arc<File> argument to Inner::poll_flush.

Thanks, I will give it a try.

@Xuanwo
Copy link
Author

Xuanwo commented Jun 7, 2024

I don't have interest in this anymore, closing...

Thanks you all for the review and advice.

@Xuanwo Xuanwo closed this Jun 7, 2024
@Xuanwo Xuanwo deleted the fix-storage-full branch June 7, 2024 09:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate M-fs Module: tokio/fs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fs: Data loss while retrying File::flush when disk is full
4 participants