Skip to content
This repository has been archived by the owner on Aug 13, 2019. It is now read-only.

add-go-fuse-to-inject-filesystem-error #583

Open
wants to merge 84 commits into
base: master
Choose a base branch
from

Conversation

qiffang
Copy link

@qiffang qiffang commented Apr 16, 2019

BackGround
Base the issue - Tests with injected file system errors. #579
We can use go-fuse to inject file system, but it is too complex for the user.

In this pull request, I provide a HOOKFS way.
Function
You can implement Hookxxx interface to implement your purposes(For now, only provide HookOnRename interface, if you think it is ok, i will implement other hooks, for example HookOnOpenDir and so on)

type HookOnRename interface {
	PreRename(oldPatgh string, newPath string) (hooked bool, err error)
	PostRename(oldPatgh string, newPath string) (hooked bool, err error)
}

Then Inject hook implement to HookServer

NewHookFs("$orignal", "$mountpoint", &TestRenameHook{})

Example

New HookServer
....
//normal logic
os.Create("$mountpoint/tsdb.txt")
//The rename action in mountpoint path will call Your hook to inject file system errors
err := os.Rename("$mountpoint/tsdb.txt", "$mountpoint/tsdbNew.txt")

QA
Add test CreateBlock logic and OpenBlock logic with new way

case1:
//1.init hook server, rename files throw IO exception

//2.normal logic
createBlock(t, mountpoint, genSeries(1, 1, 200, 300))
dir, _ := ioutil.ReadDir(mountpoint)
testutil.Equals(t, 0, len(dir))
case 2:
 //1.create block, this action will be successful because hook server does not start
path := createBlock(t, original, genSeries(1, 1, 200, 300))

//2.init hook server, rename files throw io exception

//3.call OpenBlock method
OpenBlock(nil, filepath.Join(mountpoint, file), nil)
dir, _  := ioutil.ReadDir(filepath.Join(mountpoint, file))
testutil.Equals(t, true, len(dir) > 0)
hasTempFile := false
for _, info := range dir {
	if strings.HasSuffix(info.Name(), "tmp") {
		hasTempFile = true
		break
	}
}

testutil.Equals(t, true, hasTempFile)

After all test cases gone, please call "fusermount -u $mountpoint"
There is a complete example in hook_test.go file.
If you think this way is ok, I will implement more hooks.
Thanks

@krasi-georgiev
Copy link
Contributor

that looks really cool, can you try to implement one full test.

@qiffang
Copy link
Author

qiffang commented Apr 16, 2019

that looks really cool, can you try to implement one full test.

Thanks for your accept, can you give some advice about a rename relation test case?
Thanks

@qiffang qiffang force-pushed the add-go-fuse-to-inject-filesystem-error branch from 9d9c978 to c1334fe Compare April 16, 2019 08:30
@krasi-georgiev
Copy link
Contributor

the most simple one I can think of is something like:

blockDir := createBlock(t, tmpdir, genSeries(1, 1, 0, 0))
b, err := OpenBlock(nil, blockDir, nil)

// check that blockDir doesn't have any tmp files.

OpenBlock uses writeMetaFile which creates a tmp file and than renames it with the new file so if the rename fails it should leave some tmp files behind.

@qiffang qiffang force-pushed the add-go-fuse-to-inject-filesystem-error branch from c1334fe to a19b6c9 Compare April 16, 2019 09:55
@krasi-georgiev krasi-georgiev mentioned this pull request Apr 16, 2019
@krasi-georgiev
Copy link
Contributor

krasi-georgiev commented Apr 16, 2019

do you think this will also allow injecting an error on Write? The other bug in the issue I opened is when Write to a file fails due to no disk space #582

@qiffang
Copy link
Author

qiffang commented Apr 16, 2019

the most simple one I can think of is something like:

blockDir := createBlock(t, tmpdir, genSeries(1, 1, 0, 0))
b, err := OpenBlock(nil, blockDir, nil)

// check that blockDir doesn't have any tmp files.

OpenBlock uses writeMetaFile which creates a tmp file and than renames it with the new file so if the rename fails it should leave some tmp files behind.

I have added two test cases in compact_test.go files, please help to check it.
Thanks

@qiffang
Copy link
Author

qiffang commented Apr 16, 2019

do you think this will also allow injecting an error on Write? The other bug in the issue I opened is when Write to a file fails due to no disk space #582

Of course, almost filesystem's error can be injected by this way and do not change any real code.

@qiffang qiffang force-pushed the add-go-fuse-to-inject-filesystem-error branch from 791d089 to 2472c78 Compare April 17, 2019 01:56
@qiffang qiffang force-pushed the add-go-fuse-to-inject-filesystem-error branch from b882862 to ec0a5e0 Compare April 17, 2019 02:33
@krasi-georgiev
Copy link
Contributor

krasi-georgiev commented Apr 17, 2019

would you mind fixing the travis tests. Looks like some dependency issues so should be an easy fix.

@qiffang
Copy link
Author

qiffang commented Apr 18, 2019

would you mind fixing the travis tests. Looks like some dependency issues so should be an easy fix.

I have uploaded go.mod(Previously, i do not known tsdb do this check)
Currently, the problem is that there is a permission for using fuse

/bin/fusermount: failed to open /etc/fuse.conf: Permission denied

In my environment, it works well

@krasi-georgiev
Copy link
Contributor

krasi-georgiev commented Apr 18, 2019

I guess you need to check with the travis team for the proper way to use fuse with travis. They might have some support IRC channel or maybe open an issue on github.

@brian-brazil
Copy link
Contributor

Fuse is a pretty big requirement for a unittest, that's not going to be runnable in many places.

@qiffang
Copy link
Author

qiffang commented Apr 18, 2019

Fuse is a pretty big requirement for a unittest, that's not going to be runnable in many places.

Thanks for your response, I plan to use it for injecting various file system error in testcases.
Do you think it is ok?
Thanks

@krasi-georgiev
Copy link
Contributor

before suggesting it as a solution I did check and some travis posts suggest that it should be supported in travis, but of course if there is any other alternative happy to use something easyer.

@simonpasquier @SuperQ maybe you can give some ideas?

testutil/fuse/fuse.go Outdated Show resolved Hide resolved
testutil/fuse/fuse.go Outdated Show resolved Hide resolved
testutil/fuse/fuse.go Outdated Show resolved Hide resolved
compact_test.go Outdated Show resolved Hide resolved
@qiffang
Copy link
Author

qiffang commented Jun 20, 2019

Yes it works now.

I have refactored the test and even found a bug which I will fix in a seprate PR.

// TestFailedDelete ensures that the block is in its original state when a delete fails.
func TestFailedDelete(t *testing.T) {
	original, err := ioutil.TempDir("", "original")
	testutil.Ok(t, err)
	mountpoint, err := ioutil.TempDir("", "mountpoint")
	testutil.Ok(t, err)

	defer func(){
		testutil.Ok(t, os.RemoveAll(mountpoint))
		testutil.Ok(t, os.RemoveAll(original))
	}()

	_, file := filepath.Split(createBlock(t, original, genSeries(1, 1, 1, 100)))
	clean := fuse.NewServer(t, original, mountpoint, fuse.FailingRenameHook{})
	defer clean()

	expHash, err := testutil.DirHash(original)
	testutil.Ok(t, err)
	fmt.Println("-------")
	pb, err := OpenBlock(nil, filepath.Join(mountpoint, file), nil)
	testutil.Ok(t, err)
	defer func() {
		testutil.Ok(t, pb.Close())
	}()

	testutil.NotOk(t, pb.Delete(1, 10, labels.NewMustRegexpMatcher("", ".*")))

	actHash, err := testutil.DirHash(original)
	testutil.Ok(t, err)

	testutil.Equals(t, expHash, actHash, "the block dir hash has changed after a failed delete")

}

../testutil/directory.go

// DirHash returns a hash of all files attributes and their content within a directory.
func DirHash(path string) ([]byte, error) {
	hash := md5.New()
	err := filepath.Walk(path, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if !info.IsDir() {
			f, err := os.Open(path)
			if err != nil {
				return err
			}
			defer f.Close()

			if _, err := io.Copy(hash, f); err != nil {
				return err
			}

			if _, err := io.WriteString(hash, strconv.Itoa(int(info.Size()))); err != nil {
				return err
			}
			if _, err := io.WriteString(hash, info.Name()); err != nil {
				return err
			}
			modTime, err := info.ModTime().GobEncode()
			if err != nil {
				return err
			}
			if _, err := io.WriteString(hash, string(modTime)); err != nil {
				return err
			}

		}
		return err
	})
	return hash.Sum(nil), err
}

This case fail for now, so i have to wait #636?

@krasi-georgiev
Copy link
Contributor

This case fail for now, so i have to wait #636?

yes

@qiffang
Copy link
Author

qiffang commented Jun 20, 2019

This case fail for now, so i have to wait #636?

yes

Got it and do you have any suggestions for this pull request?

compact_test.go Outdated Show resolved Hide resolved
testutil/testutil.go Outdated Show resolved Hide resolved
testutil/fuse/hook.go Outdated Show resolved Hide resolved
testutil/fuse/hook.go Outdated Show resolved Hide resolved
testutil/fuse/hook.go Outdated Show resolved Hide resolved
testutil/fuse/fuse_windows.go Show resolved Hide resolved
testutil/fuse/fuse_windows.go Outdated Show resolved Hide resolved
testutil/fuse/fuse_windows.go Outdated Show resolved Hide resolved
testutil/fuse/fuse_windows.go Outdated Show resolved Hide resolved
testutil/fuse/fuse_windows.go Outdated Show resolved Hide resolved
compact_test.go Outdated Show resolved Hide resolved
func (s *Server) forceMount() (err error) {
delay := time.Duration(0)
for try := 0; try < 5; try++ {
err = syscall.Unmount(s.mountpoint, flag)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no problem to call unmountFlag() few times. This is no requirement for high performance here and I think the compiler would be smart enough to translate this into a single call.

testutil/fuse/fuse_windows.go Outdated Show resolved Hide resolved
block_test.go Outdated Show resolved Hide resolved
@krasi-georgiev
Copy link
Contributor

#636 is merged so you can merge it here to pass the test.

@qiffang
Copy link
Author

qiffang commented Jun 25, 2019

There is no problem to call unmountFlag() few times. This is no requirement for high performance here and I think the compiler would be smart enough to translate this into a single call.

@krasi-georgiev , I do not think golang compiler can translate this into a single call.
For golang, it does not have JIT(Java has it, so java complier can translate this into a single call), golang just put it as inline function. I think it should be fast depend on cpu branch prediction.

As you said, it is no requirement for high performance here and i will remove the flag variables.

Copy link
Contributor

@krasi-georgiev krasi-georgiev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with these last few nits.

testutil/fuse/fuse.go Outdated Show resolved Hide resolved
testutil/fuse/hook.go Outdated Show resolved Hide resolved
@krasi-georgiev
Copy link
Contributor

krasi-georgiev commented Jun 25, 2019

@codesome , @bwplotka this is ready for your review.

long story short is that many bugs are caused by some file system errors and with this PR we will be able to write tests that inject fs errors and ensure that tsdb database state is left in a recoverable state. Since this uses fuse the tests are optional and meant to run mainly on the CI and targeting linux.

it already caught a bug which I fixed in: #636

Next good test would be to ensure that tsdb recovers nicely when running out of space . like the bug I fixed in #582

@krasi-georgiev
Copy link
Contributor

krasi-georgiev commented Jun 27, 2019

@qiffang while waiting for a review do you want to write a test for #582.
It is fine if you want to make this here or in a separate PR after/if this one is merged.

Something like -

  1. start creating WAL records
  2. after few records inject an error the the file system is full.
  3. check that the appender.Add fails,
  4. Remove the file system is full injection error.
  5. check that appender.Add succeeds.

@qiffang
Copy link
Author

qiffang commented Jun 28, 2019

@qiffang while waiting for a review do you want to write a test for #582.
It is fine if you want to make this here or in a separate PR after/if this one is merged.

Something like -

  1. start creating WAL records
  2. after few records inject an error the the file system is full.
  3. check that the appender.Add fails,
  4. Remove the file system is full injection error.
  5. check that appender.Add succeeds.

Hi @krasi-georgiev , I think it is better to do it in a separate PR.
For now, i will write it in my local environment and i will submit it after this PR is merged.
Thanks.

@krasi-georgiev
Copy link
Contributor

ping @bwplotka @codesome for a review.

@krasi-georgiev
Copy link
Contributor

krasi-georgiev commented Jul 25, 2019

@qiffang failing tests.

@bwplotka , @codesome , @gouthamve would you mind having a look at this PR.

@@ -316,7 +318,7 @@ func TestFailedDelete(t *testing.T) {
testutil.Ok(t, server.Close())
}()

expHash, err := testutil.DirHash(original)
expHash, err := dirHash(original)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use testutil.DirHash?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @qiffang

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants