Investigate how/if bbcp allows to select network paths for data transfers #21
Comments
Most of the last changes were fine, in the sense that they made sure the target file contained the IP address used to contact the NGAS server. However, the reversal of the flow was not necessary, as the default behavior of bbcp is to have the SRC contact the SINK, which is what we want. The addition of the -z flag was indeed mostly experimental, and it was later revealed that it didn't make a difference to the underlying problem (now described more thoroughly in #21). Signed-off-by: Rodrigo Tobar <[email protected]>
After reading some more code, I think I understand better what is going on and how to go about it. The problem was indeed that bbcp tries very hard to use each node's hostname as the main bit of information for establishing the data channels. I just tried some changes to bbcp locally and I could get it to use the interface I wanted (I tried on my laptop, contacting NGAS through the network interface, which started bbcp and had it exchange data through the same interface). I will now expose those changes through a new option (so the old behavior is left unchanged), and will try again. Stay tuned...
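I've implemented a new -j command-line option in bbcp that should make it prefer the hostnames/IPs given in the file specifications on the command line over the hostnames of the nodes involved in the data exchange. Again, I tried this locally on my laptop by forcing bbcp to use my ethernet interface for the data exchange instead of the loopback interface, and it seems to work.
@gsleap could you give this a try when you have some time? Make sure you have the latest master version of bbcp from https://github.com/ICRAR/bbcp on both machines. Then go with the following on mwacache10 (has -j, but doesn't have -z):
bbcp -j -f -V -n -S "ssh -x -a -oBatchMode=yes -oGSSAPIAuthentication=no -oFallBackToRsh=no %4 %I -l %U %H bbcp" -e -E c32c -s 12 -P 2 [email protected]:/data/20191210/rawdump_1260043216.raw 192.168.120.110:/home/mwa/NGAS/volume2/staging/NGAMS_TMP_FILE___3airu2emrawdump_1260043216.raw.fits
Hopefully this will take us in the right direction. I experienced some slowness while the connection was being established between the SRC and SNK copies of bbcp. I didn't stop to find out what was causing it, but I'm hoping it has more to do with my setup and environment, and with the fact that both copies run on the same computer in my case, than with the actual changes I made to the software.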
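For anyone scripting this test, the command above can be assembled programmatically. The following is only a sketch: `build_bbcp_command` and `SSH_TEMPLATE` are hypothetical names and this is not how NGAS builds its command. The flags are copied from the command in this comment, and `-j` is assumed to exist only in the ICRAR fork of bbcp. The two file specifications are the placeholder ones from the issue description.

```python
import shlex
from typing import List

# Remote-shell template copied verbatim from the command in this thread.
SSH_TEMPLATE = (
    "ssh -x -a -oBatchMode=yes -oGSSAPIAuthentication=no "
    "-oFallBackToRsh=no %4 %I -l %U %H bbcp"
)


def build_bbcp_command(source: str, target: str,
                       prefer_spec_hosts: bool = True) -> List[str]:
    """Return the bbcp argument list (hypothetical helper, not NGAS code)."""
    cmd = ["bbcp"]
    if prefer_spec_hosts:
        # -j: prefer the hosts/IPs given in the file specifications.
        # Assumption: only the ICRAR fork of bbcp understands this flag.
        cmd.append("-j")
    # Remaining options exactly as used in the thread.
    cmd += ["-f", "-V", "-n", "-S", SSH_TEMPLATE,
            "-e", "-E", "c32c", "-s", "12", "-P", "2"]
    cmd += [source, target]
    return cmd


cmd = build_bbcp_command("2.2.2.2:/path/to/source/file",
                         "2.2.2.3:/path/to/ngas/staging/file")
# shlex.join (Python 3.8+) quotes the -S template so the line can be pasted
# into a shell; subprocess.run(cmd) would take the list directly.
print(shlex.join(cmd))
```

Keeping the arguments as a list avoids having to quote the `-S` template by hand when the command is eventually executed.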
Hi Rod,
Awesome, thanks for that. That did the trick: the bbcp transfer succeeded and, without any real effort, achieved a sustained speed of ~6 Gbps!
The only issue (well, not an issue, more of a nitpick) is that the final stats output shows:
Target 127.0.1.1 using a final recv window of 3137568
Source 127.0.1.1 using a final send window of 6256640
(using 127.0.1.1, which is Ubuntu's own internal DNS IP, rather than the IPs involved in the transfer)
No big deal though.
Thanks again!
Greg
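One possible explanation for the 127.0.1.1 in the summary lines, stated here as an assumption and not verified against the bbcp source, is that the reported address comes from resolving each node's own hostname, which Debian-based systems commonly map to 127.0.1.1 in /etc/hosts. The sketch below mimics that lookup on a made-up hosts file; `lookup_in_hosts`, `SAMPLE_HOSTS` and the host name `myhost` are hypothetical.

```python
from typing import Optional

# Made-up /etc/hosts content in the style commonly found on Ubuntu.
SAMPLE_HOSTS = """\
127.0.0.1   localhost
127.0.1.1   myhost   # hypothetical machine name
"""


def lookup_in_hosts(hosts_text: str, name: str) -> Optional[str]:
    """Return the first address that lists `name`, or None if absent."""
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        address, *names = line.split()
        if name in names:
            return address
    return None


print(lookup_in_hosts(SAMPLE_HOSTS, "myhost"))
```

If this is indeed the cause, the value is cosmetic: it reflects how the local name resolves and not the interface that carried the data.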
…________________________________
From: rtobar <[email protected]>
Sent: Wednesday, 12 February 2020 1:53 PM
To: ICRAR/ngas <[email protected]>
Cc: Greg Sleap <[email protected]>; Mention <[email protected]>
Subject: Re: [ICRAR/ngas] Investigate how/if bbcp allows to select network paths for data transfers (#21)
I've implemented a new -j command-line option in bbcp that should make it prefer using the hostname/IPs given in the file specifications in the command line instead of the hostnames of the nodes involved in the data exchange. Again, I tried this locally in my laptop by forcing bbcp to use my ethernet interface for the data exchange instead of the loopback interface, and it seems to work.
@gsleap<https://github.com/gsleap> could you give this a try when you have some time? Make sure you have the latest master version of bbcp from https://github.com/ICRAR/bbcp in both machines. Then go with the following on mwacache10 (has -j, but doesn't have -z):
bbcp -j -f -V -n -S "ssh -x -a -oBatchMode=yes -oGSSAPIAuthentication=no -oFallBackToRsh=no %4 %I -l %U %H bbcp" -e -E c32c -s 12 -P 2 [email protected]:/data/20191210/rawdump_1260043216.raw 192.168.120.110:/home/mwa/NGAS/volume2/staging/NGAMS_TMP_FILE___3airu2emrawdump_1260043216.raw.fits
Hopefully this will take us in the right direction. I experienced some slowness while the connection was actually being established between the SRC and SNK copies of bbcp, and while I didn't stop to find out what was causing it I'm hoping it's something more to do with my setup and environment, and the fact that both copies are in the same computer in my case, than with the actual changes I did to the software.
That's great news! I'll then put some effort into getting these changes upstreamed to the original bbcp maintainer, and into making the corresponding changes in NGAS to ensure we use the ...
During the work done in #19, @gsleap found that no matter the combination of parameters given to bbcp, it apparently always resorted to using hostnames (and fully qualified domain names) as the sole basis for establishing the connection between the source and the target nodes. This is not enough for certain scenarios, where two machines have multiple, independent network paths that can be taken depending on the interface being addressed.

Consider the following scenario, which is similar to the deployment used in the tests described in #19:

The NGAS server running in B is listening on eth1, the 10 Gb interface. When the BBCPARC command comes in from A we generate and execute this command in B:

bbcp .... 2.2.2.2:/path/to/source/file 2.2.2.3:/path/to/ngas/staging/file

By using the specific IPs in the source and target specifications we expect bbcp to explicitly use the 10 Gb interface for the data transfer.

The command starts the SRC and SINK copies of bbcp in A and B respectively. bbcp, however, seems to use exclusively the hosts' names as the main bit of information to establish the connection between SRC and SINK, and because A and B resolve to the 1.1.1.X addresses, the 1 Gb link is used for the bbcp data transfers. This behavior seems to be the same regardless of the direction of the connection establishment (i.e., the -z option) and of whether name resolution (i.e., the -n option) is used, but this should be tested thoroughly.

This problem was initially investigated in #19, but it was then split into a new issue to separate it from the original problem reported in #19, which has been fixed.
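The difference between the two behaviors can be sketched in a few lines. This is not bbcp's code (bbcp is written in C++); `parse_file_spec`, `data_channel_address`, `fake_resolve` and the concrete 1.1.1.X addresses are hypothetical and only mirror the scenario above: choosing the endpoint from the node name yields the 1 Gb address, while choosing it from the file specification yields the 10 Gb one.

```python
from typing import Callable, NamedTuple, Optional


class FileSpec(NamedTuple):
    user: Optional[str]
    host: str
    path: str


def parse_file_spec(spec: str) -> FileSpec:
    """Split '[user@]host:/path' into parts (hostnames and IPv4 only)."""
    location, sep, path = spec.partition(":")
    if not sep or not path:
        raise ValueError(f"not a remote file spec: {spec!r}")
    user, at, host = location.rpartition("@")
    return FileSpec(user if at else None, host, path)


def data_channel_address(spec: str, node_name: str,
                         resolve: Callable[[str], str],
                         prefer_spec_host: bool) -> str:
    """Pick the address a peer would connect to for the data channel."""
    if prefer_spec_host:
        # Desired behavior: honor the host/IP written in the file spec.
        return resolve(parse_file_spec(spec).host)
    # Behavior described in this issue: rely on the node's own name.
    return resolve(node_name)


# Hypothetical name table: A and B resolve to the 1 Gb network.
HOSTS = {"A": "1.1.1.2", "B": "1.1.1.3"}


def fake_resolve(name: str) -> str:
    """Stand-in resolver: known names map to 1.1.1.X, IPs pass through."""
    return HOSTS.get(name, name)


sink_spec = "2.2.2.3:/path/to/ngas/staging/file"
by_name = data_channel_address(sink_spec, "B", fake_resolve, False)
by_spec = data_channel_address(sink_spec, "B", fake_resolve, True)
print(by_name, by_spec)
```

With the table above, the name-based choice lands on the 1.1.1.X network and the spec-based choice stays on 2.2.2.X, which is the outcome the issue asks for.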