-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xdma-snax integration #199
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suuuupper good work @IveanEx !
I bet you work so hard on this. I like that we have this interesting xDMA already and it's a good work in progress!
I have things that need to be critical about though:
- The PR is huge haha! It's too big honestly and very hard to read through carefully. Remember, since SNAX is something that is shared amongst our team, it is best to make "byte-sized" (get the pun? 😆 ) PRs. I will strongly recommend to keep it to less than a thousand lines. What are ways to make it small? Separate the hardware build and the software tests in separate PRs! That would be a great fix already!
- I am a bit iffy about the differences between the streamers and the xDMA. They have the same units like the AGUs and data movers. Is there no way that they could be re-used or placed in the common cells section? You have a CommonCells directory inside the xDMA but then there are also re-usable cells in others. Maybe better to have all common cells at some directory that can be used through all! It now feels like we are not embracing the "reusability" of these modules.
- I guess it's for testing purposes but are we really making the xDMA like a separate accelerator? I was thinking it should be inside the cluster too! My worry with this setup is that there will be a HUGE clean-up afterwards. Which I fear may be prone to more work 😅
There are nice things I like though! Like the smart case-statements for the reading and writing for the CSRs. I hope the compiler is smart enough just to route the values! I would really love to have a separate PR actually fix that 😄
GOOD JOB!
I won't approve yet as we need to discuss more. The next step after this is to create documentation for this! Don't under estimate it. Lack of sufficient documentation is a problem in open source stuff since we're always rushing to implement things but never taking the time to make it easier for other people to understand...
I also suggest to ask @xiaoling-yi's opinion here since there is much similarity in your modules.
hw/chisel/src/main/scala/snax/xdma/xdmaExtension/Transposer.scala
Outdated
Show resolved
Hide resolved
hw/chisel/src/main/scala/snax/xdma/xdmaStreamer/AddressGenUnit.scala
Outdated
Show resolved
Hide resolved
target/snitch_cluster/sw/apps/snax-xdma-memset/src/snax-xdma-memset.c
Outdated
Show resolved
Hide resolved
gen_chisel_file( | ||
chisel_path=args.chisel_path, | ||
chisel_param="snax.xdma.xdmaTop.xdmaTopGen", | ||
gen_path=" --clusterName " + str(cfg["cluster"]["name"]) + | ||
" --tcdmDataWidth " + str(cfg["cluster"]["data_width"]) + | ||
" --axiDataWidth " + str(cfg["cluster"]["dma_data_width"]) + | ||
" --addressWidth " + str(cfg["cluster"]["addr_width"]) + | ||
" --tcdmSize " + str(cfg["cluster"]["tcdm"]["size"]) + | ||
" --readerDimension " + str(snax_xdma_cfg["reader_agu_dimension"]) + | ||
" --writerDimension " + str(snax_xdma_cfg["writer_agu_dimension"]) + | ||
" --readerBufferDepth " + str(snax_xdma_cfg["reader_buffer"]) + | ||
" --writerBufferDepth " + str(snax_xdma_cfg["writer_buffer"]) + | ||
(" --HasMemset " if snax_xdma_cfg["has_memset"] else "") + | ||
(" --HasMaxPool " if snax_xdma_cfg["has_maxpool"] else "") + | ||
(" --HasTransposer " if snax_xdma_cfg["has_transposer"] else "") + | ||
" --target-dir " + args.gen_path + | ||
cfg["cluster"]["name"] + "_xdma/" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is smart too because rather than having a scala parameter generated, it can be this one. However, this implies it is very custom to the xDMA only. Not something that is quite "generalized" like the accelerators. I guess it is fine for now since the intent is to have the xDMA a SNAX thing. But what about the different extensions? That implies that people have to (1) change this wrapper gen to add another extension, (2) modify the scala files for the extension, and (3) add the configurations too.
The SNAX accelerators just need the configuration files since they are "generalized" already and do not need too many modifications. Even the accelerator wrapper names are all centralized in the configuration files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. it needs to be optimized. It should be done in a seperate PR, I think it is better to rewrite gen_chisel_file function.
I'm not completely in the loop but I agree with this sentiment! There are starting to be a lot of variations on the system with the gemm accelerator, and this would add an extra one. Can we not include this xDMA in every snax cluster from now on and replace the old DMA entirely? Or are there still some disadvantages? There would just need to be a new definition of |
…ion of pref_snax_count at the generation of xdma
It cannot replace the original idma at present, as it does not have the function to move the data from / to DRAM yet. Yes, like any other accelerator, it can be disabled, and which extension should be included is completely configurable. I also agree that now there are too many versions in snax... I plan to clean the cfg file and merge many things with @xiaoling-yi's contribution. But should happen when she come back 😊😊 |
@xiaoling-yi we will push this since it's a WIP that is being cut into multiple PRs anyway. So kindly take a look when you have the time. If you see something, tell @IveanEx ! Thankkkss~ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job!!! ❤️ ❤️ ❤️ ❤️ ❤️
Can you make the issue list of things to fix? Add the - [ ]
check boxes 😄 Thankkkss!
Is it the plan to do this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not necessary anymore. All ci use this chiselgen to generate the code, so it is naturally tested.
Furthermore, this is the source of error that reandomly fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay!
Yes, it will be done in the future. But more discussion should be made on improvement / change of AXI bus. |
This PR enables the availability of the xdma on snax cluster, which will be the next generation distributed DMA focusing on intra-cluster data movement, inter-cluster data movement, and data movement between any cluster and global memory.
The function of it includes:
At present, only the intra-cluster capacity is enabled. Three extensions are designed as the reference of xdma datapath extension, including:
The runtime is also provided to control xdma by standard snax CSR interface, with C headers automatically generated with the generation of hardware.