Architecture updater (auto-sync) - Updating ARM #1949
If you get build errors for BLAKE3, apply this workaround:
diff --git a/llvm/lib/Support/BLAKE3/CMakeLists.txt b/llvm/lib/Support/BLAKE3/CMakeLists.txt
index e5d227b0c486..f795bbabe702 100644
--- a/llvm/lib/Support/BLAKE3/CMakeLists.txt
+++ b/llvm/lib/Support/BLAKE3/CMakeLists.txt
@@ -10,6 +10,7 @@ if (LLVM_DISABLE_ASSEMBLY_FILES)
else()
set(CAN_USE_ASSEMBLER TRUE)
endif()
+set(CAN_USE_ASSEMBLER FALSE)
macro(disable_blake3_x86_simd)
add_definitions(-DBLAKE3_NO_AVX512 -DBLAKE3_NO_AVX2 -DBLAKE3_NO_SSE41 -DBLAKE3_NO_SSE2)
From now on you will find an overview of the remaining tasks in the PR description.
Never mind. I will push them once
6e082c6 to 7fa0c46
Additional architectures that are supported by LLVM already but not yet by Capstone, which could be added once this auto-sync feature is finished, are:
Thanks for pointing it out :) Will fix this.
Have you rebased the tblgen_capstone_backends branch onto TriDis repo? Or did you rebase TriDis branch onto the
Neither. In fact, I just used https://github.com/Rot127/llvm-capstone/tree/tblgen_capstone_backends
I see. So you copied the TriCore
https://github.com/imbillow/capstone/tree/14a3f56c2eb829b6a9d93ccfc2693e456e1d3fb1 Use this
@kabeor I assume it is fine to add the MC regression tests to the CI (running
Of course yes!
GitHub removed the Ubuntu 18 images. If you want to continue supporting it, these should be converted to use Docker instead: actions/runner-images#6002. Moreover, it makes sense to add support for the Ubuntu 22.x series in CI, preferably in a separate PR, then rebase once it is merged. EDITED: Never mind, it was already done; only a rebase is required: #1986
A gentle ping.
@aquynh is still reviewing. He is really busy these days and only has time to do it at night. So sorry for the delay.😅
Ok, that's fine and understandable. Complete silence is what worried me.
So, any feedback yet?
PING!
@kabeor @aquynh
Any chance we will proceed here in the next two days?
Finally we are ready to merge now. @Rot127 Just a quick check: the "Rename files translated from LLVM (add LLVM to name)." item isn't selected. Does it still need a commit?
Very nice to hear that!
Please ignore this "Rename files" point. At this point it just creates too much struggle. Merging is more important now.
Great, so thanks again for this excellent work. Merged!
Architecture Updater
This is the very early version of the new architecture updater for Capstone.
It is meant to replace (#1831 and #1803).
Because this updater makes a few refactoring tasks in Capstone necessary, it is opened as a draft PR.
Feedback is very much welcome!
Please note that x86 will not be supported here. Due to the vast difference in the x86 LLVM TableGen backends it would be too much work.
This PR will only update ARM. Once ARM is done, PPC and afterwards other archs will follow in separate PRs. ARM and PPC come first because they have the most corner cases.
Updater purpose
Updating an architecture module requires a lot of manual work. This is because (most) Capstone modules use the LLVM Disassembler and AsmPrinter, which are written in C++.
Since Capstone is written in C, the updating process requires a lot of manual work to translate the files.
The consequence is that instructions, introduced by new LLVM releases, are hard to add.
The purpose of this updater is to automate as much work as possible.
The target is that a developer only has to run the updater and fix a few build errors afterwards (which were too complex to resolve automatically).
Additionally, this will refactor a few things in Capstone.
Because of some changes in the LLVM TableGen backends (see below) we can easily generate more information than before.
A simple example of this is the instruction enums in include/capstone/<ARCH>.h.
But there is more, like information about input and output operands, operand sizes etc.
Also, because all information will be generated homogeneously we can generalize some Capstone logic between architecture modules.
How it works
Please refer to suite/auto-sync/README.md for an introduction. Everything should be explained there.
If you still have questions afterwards, please let me know here. This way I can enhance the documentation.
Please note that this PR currently does not include the new versions of the generated files. You can generate them by following the instructions in the README.
Generated files are added now.
The LLVM TableGen backends - The most important part
The new updater is only possible in this way because of some larger refactoring of several LLVM TableGen backends.
The refactored backends can emit the same code in languages other than C++ (in our case we want them to emit C code).
Please see this LLVM review for details.
Testing
The updater is tested via Rizin. Check out rizinorg/rizin#3399 and its CI test results.
TODO overview
Changes to core
Translation
Add patches for:
Fix:
Generated info
Update script
DisassemblerExtensions, BaseInfo.h) by hand.
Other
test_arm to cstest issue file.
arm.h and translated header files (like ARMBaseInfo.h).
Rename files translated from LLVM (add LLVM to name).
CAPSTONE_DIET guards
./cstool -d -s arm "80 b4 00 af 0a 4b d3 ed 00 7a b7 ee e7 6a 9f ed 06 5b 85 ee 06 7b f7 ee c7 7b b0 ee 67 0a bd 46 5d f8 04 7b 70 47"
./cstool -d armbe 0xCD000B00
llvm-objdump first.
ARM_reg_access()
TODO after merge
Open issue about MCInst.writeback and introducing Constraints instead (with AArch64 update). The constraints can be checked if they match a writeback condition (see: ARM writeback on post-index #1507).
UpdateFlags = True: update_flags not working as expected for some ARM instructions #1568
tLDMIA not to t2LDMIA_UDP)
getRegName() which returns a register alias if one is present (fp etc.). LLVM only returns the raw reg name.
set_mem_access in ARMs inc files. The neon lanes can be checked without it.
cstool -d shows the wrong id name (but the correct numerical value).
ARM_set_detail_op_sysreg.
llvm-objdump. Extend suite/MC/update.py with https://github.com/imbillow/capstone/blob/tricore/suite/gencstest.py
0xb4,0xec,0x04,0x85 = ldc p5, c8, [r4], #0x10 decodes differently if CDE coprocs are present.) and DFB
#0 in 0x40,0x1b,0xf5,0xee == vcmp.f64 d17, #0 (arm). As well as MSR instructions (only print the search for MRS in AsmPrinter.inc).
CPSR access set. Fix in tablegen.
INT, RET).
Patches of the translator. Add a method to each patch to get test cases and use the patch on it.
Update-Arch.sh should be a Python script. Compatibility with Windows is a nightmare otherwise.
Issues fixed
Fixed at the current state
closes #1985
closes #1946
closes #1897
closes #1784
closes #1855
closes #1935
closes #1951
closes #1983
closes #1783
closes #1587
closes #1197
closes #1020
closes #994
closes #1072 (Hand made groups are no longer supported. Only the ones defined by LLVM.)
closes #1601
closes #1228
closes #1196
closes #1195
closes #1724
closes #1568
Possibly closed
Need to be checked after this PR is ready.
#993 (needs tblgen fix)
#1713 (leave issue open)
#984 (LLVM issue.)
#1201 (Leave issue open)
#1013 (Fix after merge)
Possibly breaking changes
ARM_CC_ -> ARMCC
VPUSH, VPOP).
GROUP_INT)
RET, INT should be added via Mapper separately.
ARM_GRP_CRC are renamed to match LLVM naming: ARM_FEATURE_HasCRC
V8, MCLASS, ARM, THUMB) because instruction aliases are supported now. So it matters.
[] brackets were added.
writeback is part of detail and no longer of detail.arm.
r15 = pc etc.) are no longer printed as default. Must be enabled via CS_OPT_SYNTAX_CS_REG_ALIAS or -a for the cstool.
uint32_t, no longer int32_t.
disp sign and subtracted flag were inconsistent.
ITState is no longer reset after each cs_disas call. Only after cs_close.
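Two of the renames above are mechanical enough to script when porting code that uses the old identifiers. The following is a minimal Python sketch, not a tool shipped with this PR. It assumes the condition-code rename is a pure prefix change from ARM_CC_ to ARMCC_ (the underscore after ARMCC is an assumption, and individual enumerator names may differ, so check the result against the new arm.h). The ARM_GRP_CRC mapping is taken from the list above. The function name migrate_identifiers is hypothetical.

```python
import re

# Assumption: condition codes only change their prefix (ARM_CC_X -> ARMCC_X).
PREFIX_RENAMES = [(re.compile(r"\bARM_CC_(?=\w)"), "ARMCC_")]

# Whole-identifier renames named in the list above.
EXACT_RENAMES = {"ARM_GRP_CRC": "ARM_FEATURE_HasCRC"}


def migrate_identifiers(source):
    """Rewrite old Capstone ARM identifiers in a source string."""
    for pattern, replacement in PREFIX_RENAMES:
        source = pattern.sub(replacement, source)
    for old, new in EXACT_RENAMES.items():
        # Word boundaries keep longer identifiers that share the prefix intact.
        source = re.sub(rf"\b{re.escape(old)}\b", new, source)
    return source
```

The word boundaries are deliberate: identifiers that merely contain the old name, such as a project-local MY_ARM_CC_EQ, are left untouched. The other items in this list (writeback moving to detail, the register alias option, the ITState lifetime) change behavior rather than names and need manual review.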