Multi-core simulation encountered 192cores bottleneck #157
Comments
Can you instead try the hello_world_token.c test, which I think should be released with metro-mpi? There's a software bottleneck in hello_world_many.c itself, and the token test should help with that.
Ok, I will try it, thank you very much!
I tried hello_world_token.c instead of hello_world_many.c, but I still can't see any output in fake_uart.log. I also found that all of the trace_hart_*.log files are empty, which suggests there is no communication at all in the test. Is there a data width requirement above 192 cores? After a while, I can see the following information in trace_hart_5.log:
I am quite intrigued by the results you are getting. I have experienced similar problems in the past.
Hi @guillemlp, replying to your questions below: You are using the metro_mpi commit, but you are not simulating with metro_mpi, right? Can you point me to which hello_world.c/hello_world_many.c you are using? Thanks!
I retried the test with hello_world_token.c; the behavior is the same as with hello_world_many.c.
Can you verify whether the argv variable in main is char or int? (It should be int if you are using more than 64 cores.)
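To make the char versus int question concrete, here is a minimal sketch of what happens to a core count when it passes through a char-sized slot. The helper names are hypothetical and are not part of OpenPiton or Metro-MPI. Explicit `unsigned char` and `signed char` limits are used because the signedness of plain `char` depends on the target ABI, and the sketch assumes 8-bit chars.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: store a count in an unsigned 8-bit slot, as an
 * argument array of chars would. Assuming CHAR_BIT == 8, the value
 * wraps modulo 256. */
static int through_unsigned_char(int value)
{
    unsigned char slot = (unsigned char)value;
    return (int)slot;
}

/* Hypothetical helper: report whether a count is representable in a
 * signed 8-bit slot (1 if it fits, 0 otherwise). */
static int fits_in_signed_char(int value)
{
    return value >= SCHAR_MIN && value <= SCHAR_MAX;
}

/* Hypothetical helper: passing the count as int keeps it intact. */
static int through_int(int value)
{
    assert(value >= 0);
    return value;
}
```

Under these assumptions, 192 and 208 still fit in an unsigned 8-bit slot, 256 wraps to 0, and anything above 127 no longer fits in a signed one. Whether this explains the failures reported in this thread is not established here.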
Hi @guillemlp: Can you verify whether the argv variable in main is char or int? (It should be int if you are using more than 64 cores.)
//char num[2] = {cid, nc};
ATOMIC_OP(finish_sync0, 1, add, w);
// synchronize for debug output below
char buf[NUM_COUNTERS * 32] __attribute__((aligned(64)));
ATOMIC_OP(finish_sync1, 1, add, w);
exit(ret);

// synchronization variable
// synchronize with other cores and wait until it is this core's turn
// assemble number and print
// increment atomic counter
return 0;

Have you tried 128 cores doing the hello world token correctly?
Furthermore, I tried 192 cores with hello_world_token.c, and it works well too. I also tried 208 cores, and that does not work; it looks like the fetched instruction is incorrect. Thanks!
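The comments quoted above outline a turn-based print: wait for this core's turn, assemble and print the message, then increment an atomic counter. A rough sketch of that pattern in portable C11 follows. It is not the actual source of hello_world_many.c or hello_world_token.c; `turn` and `hello_in_turn` are hypothetical names, and the message is written to a buffer rather than to the UART.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Synchronization variable shared by all cores: the id whose turn it is. */
static atomic_int turn = 0;

/* Hypothetical per-core routine: wait until it is this core's turn,
 * format the greeting into out, then pass the turn to the next core.
 * Returns the value reported by snprintf. */
static int hello_in_turn(int cid, int nc, char *out, size_t out_size)
{
    assert(cid >= 0 && cid < nc);

    /* Synchronize with other cores: spin until the counter equals cid. */
    while (atomic_load(&turn) != cid) {
        /* busy-wait */
    }

    /* Assemble the message. */
    int n = snprintf(out, out_size,
                     "Hello world, this is hart %d of %d harts!\n", cid, nc);

    /* Increment the atomic counter so the next core may proceed. */
    atomic_fetch_add(&turn, 1);
    return n;
}
```

Because every core spins on the same counter and prints strictly in order, total run time grows with the number of cores, which is one plausible reading of the "software bottleneck" mentioned earlier in the thread.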
Which NoC sizes are you playing with?
Hi @guillemlp: I tried a 16x16 core configuration, and it doesn't work. My steps were as below:
After a while, I can see the information in the trace_hart_*.log files as below: It is a little difficult to find the root cause. I will git clone the MPI project to run the test (256 cores or above). I think I might have missed something. Thanks!
Hi @guillemlp: I have done the steps below on the MPI project:
But I can see the information in the trace_hart_*.log files as below: I don't know what I did wrong. Could you help me check? Thanks!
Hi experts:
I cloned the openpiton_dev branch and changed the code with reference to the second-to-last Metro-MPI commit (https://github.com/metro-mpi/metro-mpi/commits/metro-mpi/ commit 264b365).
I use "sims -sys=manycore -x_tiles=16 -y_tiles=12 -msm_build -ariane" generated 192 cores(or below 192 cores xy-tiles configuration), use "sims -sys=manycore -msm_run -x_tiles=4 -y_tiles=4 hello_world_many.c -ariane -finish_mask 0x1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 -rtl_timeout 1000000000000" simulated and I can see
Hello world, this is hart 0 of 16 harts!
Hello world, this is hart 1 of 16 harts!
Hello world, this is hart 2 of 16 harts!
Hello world, this is hart 3 of 16 harts!
Hello world, this is hart 4 of 16 harts!
Hello world, this is hart 5 of 16 harts!
Hello world, this is hart 6 of 16 harts!
Hello world, this is hart 7 of 16 harts!
Hello world, this is hart 8 of 16 harts!
Hello world, this is hart 9 of 16 harts!
Hello world, this is hart 10 of 16 harts!
Hello world, this is hart 11 of 16 harts!
Hello world, this is hart 12 of 16 harts!
Hello world, this is hart 13 of 16 harts!
Hello world, this is hart 14 of 16 harts!
Hello world, this is hart 15 of 16 harts!
information in the fake_uart.log
I use "sims -sys=manycore -x_tiles=16 -y_tiles=13 -msm_build -ariane" generated 208 cores(or above 192 cores xy-tiles configuration), use "sims -sys=manycore -msm_run -x_tiles=4 -y_tiles=4 hello_world_many.c -ariane -finish_mask 0x1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 -rtl_timeout 1000000000000" simulated and waited a long time(above 12 hours), but I can't see any print in the fake_uart.log
Is there any other limitation above 192 cores?
Thanks!
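The -finish_mask values in the commands above are long and easy to mistype. As a rough sketch, the helper below builds a mask with one hex digit `1` per tile. That convention is an assumption (it matches the commonly shown 4x4 example with sixteen `1` digits), and `build_finish_mask` is a hypothetical name, not part of sims.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write "0x" followed by one '1' digit per tile
 * into out. Returns 0 on success, -1 if the buffer is missing or too
 * small. Assumption: the mask uses one hex digit per tile. */
static int build_finish_mask(int x_tiles, int y_tiles,
                             char *out, size_t out_size)
{
    assert(x_tiles > 0 && y_tiles > 0);

    size_t tiles = (size_t)x_tiles * (size_t)y_tiles;
    size_t needed = 2 + tiles + 1; /* "0x" + digits + terminating NUL */

    if (out == NULL || out_size < needed) {
        return -1;
    }

    out[0] = '0';
    out[1] = 'x';
    memset(out + 2, '1', tiles);
    out[2 + tiles] = '\0';
    return 0;
}
```

With this sketch, a 4x4 run would get a 16-digit mask and a 16x13 run a 208-digit mask, so the mask length can be checked against the -x_tiles and -y_tiles values passed to the run command.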