-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BTFS-2419 #782
base: master-v1.5.1
Are you sure you want to change the base?
BTFS-2419 #782
Conversation
Codecov Report
@@ Coverage Diff @@
## master #782 +/- ##
==========================================
- Coverage 11.62% 11.58% -0.04%
==========================================
Files 149 150 +1
Lines 13111 13317 +206
==========================================
+ Hits 1524 1543 +19
- Misses 11276 11466 +190
+ Partials 311 308 -3
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
没完全看懂,可能我有些理解不太对,当前我的问题是:
- S3 只能通过url来获取,这个文件是否一定存在s3上实际上我们不保证的,而且我也不太确认让所有客户都装s3的驱动是否合适
- 我没看到调用入口,好像你是把接受challenge question这个命令复用了,实际上就现在设计,这个是得由host自己发起请求来获取文件。
@@ -41,49 +41,12 @@ still store a piece of file (usually a shard) as agreed in storage contract.`, | |||
}, StorageChallengeResponseCmd.Arguments...), // append pass-through arguments | |||
RunTimeout: 20 * time.Second, | |||
Run: func(req *cmds.Request, res cmds.ResponseEmitter, env cmds.Environment) error { | |||
cfg, err := cmdenv.GetConfig(env) | |||
scr, err := ReqChallengeStorage(req, res, env, req.Arguments[0], req.Arguments[2], req.Arguments[3], req.Arguments[4], req.Arguments[5]) //(*StorageChallengeRes, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not fully sure for this part, my understanding is decentral challenge will start another command, it seems the command you are changing still need in the future. But I am not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function is involved in the spin package, quite like heart beat, challengers request challenge jobs for every 60 mins.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
还是没搞懂你为啥要改写这个命令,感觉上这个命令未来还是得用的,我们的router node还是得做challenge的,如果这个decentral challenge没成功的话。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
另外,这个地方你每60分钟启动一次可能有问题;理论上这个challenge应该是完成工作以后就继续要下一个,60分钟最多check一下如果出了问题去trigger一下。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, it is ok for now.
return nil, fmt.Errorf("failed to reuqest challenge job for peer id: {%s}", n.Identity.Pretty()) | ||
} | ||
|
||
rawData, err := repo.DefaultS3Connection.RetrieveData(challengeResp.PackageUrl) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go-BTFS client cannot use s3 connection to retrieve the file, it should use the http connection to get the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the code to use http request to get all the questions.
challengeJobResult := &guardpb.ChallengeJobResult{ | ||
NodePid: peerId, | ||
JobId: challengeResp.JobId, | ||
Result: challengeResults, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不太确认,好像你这个代码没有block等待的逻辑,所以你生成这个object的时候,challengeResults 应该是没有内容的。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optimized the code and control the number of go routine, and make sure all the challenge jobs have been done and then submit the challenge results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- To solve fileHash issue, I had update the file's schema, please review that PR and fix it accordingly.
- Still not fully understand some code, and doubt it will cause the issue.
@@ -41,49 +41,12 @@ still store a piece of file (usually a shard) as agreed in storage contract.`, | |||
}, StorageChallengeResponseCmd.Arguments...), // append pass-through arguments | |||
RunTimeout: 20 * time.Second, | |||
Run: func(req *cmds.Request, res cmds.ResponseEmitter, env cmds.Environment) error { | |||
cfg, err := cmdenv.GetConfig(env) | |||
scr, err := ReqChallengeStorage(req, res, env, req.Arguments[0], req.Arguments[2], req.Arguments[3], req.Arguments[4], req.Arguments[5]) //(*StorageChallengeRes, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
还是没搞懂你为啥要改写这个命令,感觉上这个命令未来还是得用的,我们的router node还是得做challenge的,如果这个decentral challenge没成功的话。
@@ -41,49 +41,12 @@ still store a piece of file (usually a shard) as agreed in storage contract.`, | |||
}, StorageChallengeResponseCmd.Arguments...), // append pass-through arguments | |||
RunTimeout: 20 * time.Second, | |||
Run: func(req *cmds.Request, res cmds.ResponseEmitter, env cmds.Environment) error { | |||
cfg, err := cmdenv.GetConfig(env) | |||
scr, err := ReqChallengeStorage(req, res, env, req.Arguments[0], req.Arguments[2], req.Arguments[3], req.Arguments[4], req.Arguments[5]) //(*StorageChallengeRes, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
另外,这个地方你每60分钟启动一次可能有问题;理论上这个challenge应该是完成工作以后就继续要下一个,60分钟最多check一下如果出了问题去trigger一下。
resultChan := make(chan *guardpb.ShardChallengeResult, len(questions)) | ||
for _, question := range questions { | ||
requestChan <- struct{}{} | ||
go doChallenge(req, res, env, question, requestChan, resultChan) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part start many go routine , if the questions count is high, it will exceeds the memory.
Btw, wg is not good to be defined as static variable, why just define before the for loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used the length of channel to control the number of go routine.
ok, wg can be local variable and pass to go method.
ShardHash: question.ShardHash, | ||
Nonce: question.Nonce, | ||
} | ||
go func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
honestly, not get understanding why start another go routine here, why just run the retry function directly and wait for its response, I noticed challengeHostBo already has timeout control.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplified the code here.
return nil, fmt.Errorf("receive wrong status code %d response: %s", resp.StatusCode, string(rawData)) | ||
} | ||
|
||
var questions []*Question |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some code change today for challenge question to add fileHash, so that you can use proto.decentralQuestions to unmarshall the file. And to reduce the bandwidth, I use gzip to zip the file. The PR is here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, code changed to read gzip file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, code changed to read and unmarshal gzip file.
No description provided.