RTL MVAU: input weight stream when using internal_embedded mode #1120
Unanswered
sdittmeier
asked this question in
Q&A
Replies: 1 comment 1 reply
-
Hi, the RTL/DSP MVAU doesn't support internal_embedded and always relies on a weight stream input. We should add an assertion to catch this as early as possible (@auphelia). You probably see the weight stream being tied off to zero because there is no memstreamer attached or because it is not used during simulation.
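The early assertion suggested above could look roughly like the following. This is only a minimal sketch, not FINN's actual implementation: the function name `check_mvau_mem_mode` is hypothetical, and it assumes that RTL MVAU nodes can be recognized by an op_type ending in `_rtl` (for example `MVAU_rtl`).

```python
def check_mvau_mem_mode(op_type: str, mem_mode: str) -> None:
    """Fail early if an RTL MVAU is configured with embedded weights.

    Hypothetical helper: it works on plain strings so that it stays
    independent of any particular FINN version.
    """
    # Assumption: RTL-backed nodes carry an op_type ending in "_rtl".
    is_rtl = op_type.endswith("_rtl")
    if is_rtl and mem_mode == "internal_embedded":
        raise AssertionError(
            f"{op_type} does not support mem_mode='internal_embedded'; "
            "it always expects a weight stream input."
        )


# An HLS node with embedded weights passes the check silently.
check_mvau_mem_mode("MVAU_hls", "internal_embedded")
```

Running a check like this over all nodes before IP generation would surface the unsupported configuration long before the stitched project is built.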
-
Hello,
I'm currently trying to translate a sparse, quantized MLP with FINN.
First of all, I'm very happy about the addition of the MVAU in RTL. Earlier, when I tried to translate this MLP using HLS, it took more than a month, both because of its high hidden dimensionality and because its > 98% sparsity could not be exploited by HLS until it was translated into RTL.
Now, step_hw_ipgen finishes within 2 hours (because of 1 remaining HLS layer).
Here is a screenshot of the layers, taken when generating resource estimates. I had to tweak FINN a bit so that the first layer is created as an HLS MVAU, because of the large input data bit width (12 bits). Weights and activations are quantized to either 6 or 4 bits, depending on the layer.
Now in order to make use of the sparsity, I set the mem_mode in all layers to internal_embedded.
The RTL MVAU layers still have a weight stream input, though all of its bits are set to 0. This matches my assumption, since the weights should be embedded in the MVAU.
When getting to the build step 'step_set_fifo_depths', the build flow crashes, because of the weight stream:
In the file finn_design_wrapper.v, created in the vivado_stitch_proj, the signal:
.weights_V_TDATA({1'b0,1'b0,1'b0,1'b0,1'b0,1'b0,1'b0,1'b0,...})
is written in this very unwieldy style, with one constant per bit. This leads to the crash in my build flow, since this line is now more than 5,000,000 columns long, all filled with 0's. I get the following output in my build_dataflow.log:
I know that my use case is probably quite far from what is intended. Perhaps there are better ways to implement a sparse network.
My next step would be to tweak my local FINN version again so that the weight stream shrinks to width 0 when using internal_embedded mode for the RTL MVAU. I'd be happy for any feedback or suggestions.
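Independent of any change to FINN itself, the very long line could in principle be shortened by post-processing the generated wrapper. The following is a minimal sketch under stated assumptions: the helper name `collapse_zero_concat` is hypothetical, it only handles concatenations made up entirely of `1'b0` constants, and whether a shorter line actually avoids the crash has not been verified here.

```python
import re

# Matches a Verilog concatenation consisting only of 1'b0 constants,
# e.g. {1'b0,1'b0,1'b0}. Anything else is left untouched.
ZERO_CONCAT = re.compile(r"\{\s*1'b0(?:\s*,\s*1'b0)*\s*\}")


def collapse_zero_concat(verilog_text: str) -> str:
    """Replace each all-zero concatenation with an equivalent replication."""

    def _repl(match):
        # The number of 1'b0 entries is the bit width of the signal.
        width = match.group(0).count("1'b0")
        return "{%d{1'b0}}" % width

    return ZERO_CONCAT.sub(_repl, verilog_text)


line = ".weights_V_TDATA({1'b0,1'b0,1'b0,1'b0}),"
print(collapse_zero_concat(line))
```

The replication `{N{1'b0}}` has the same width and value as the original concatenation of N zero bits, so the port connection is unchanged apart from being much shorter. Applying the function to the contents of finn_design_wrapper.v and writing the result back would be the obvious way to try it.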