-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bug in artifact creation of the LX6 instructions #453
Conversation
1ee2715
to
77b0747
Compare
77b0747
to
79cc064
Compare
Is this the kernel lag issue we've all been facing? I'm wondering how I was seeing the seeing the same thing even though I wasn't reusing the same buffer/vector in my own path (in xaiepy). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice find!
In the future we could add a test that there is exactly 1xclbin generated for a test like this, I can try and add that at some stage.
@@ -359,6 +358,8 @@ LogicalResult AIETargetBackend::serializeExecutable( | |||
|
|||
std::ifstream instrFile(static_cast<std::string>(npuInstPath)); | |||
std::string line; | |||
// Vector to store LX6 instructions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: I've heard that AIE2p has LX7 instructions (not a request for change!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's true that forthcoming gens will have LX7 (and higher?) but the instructions themselves don't actually have anything to do with the LX arch (which is Tensilica's naming scheme for their archs and not connected to our firmware that runs these instructions). But you're right the comment will be out of date.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets have a clean up PR for this, I noticed you did LX[0-9] in some places, maybe we should say LX6/LX7 instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the safest term would be command processor
but who cares (honestly when I first started it was useful knowing that it was LX6 because there's a datasheet type thing out there for that so it clarified what people were really talking about).
I think that might be a different issue, this one is only hit when we have multiple dispatches in one executable in iree xrt driver. |
Yes that would be useful, there is one challenge though, we will create multiple XCLBINs as we create them first and then merge it into the previous one, so the old xclbin artifact is still present even though we wont use it at runtime. Btw jut to be clear, this test currently will produce three xclbins and use all of them at runtime, we need something like #418 for it to use 1 xclbin at runtime. As I said previously it will still produced three XCLBIN's at compile time. |
There was a bug where were appending LX6 instructions to an existing vector for subsequent entry points rather than making a new vector for each entry point. This change fixes that and hence fixes #447 which was indeed a kernel time out due to bad artifacts.
Also adds a multi-dispatch e2e test to CI.