[SharedCache] Process the .symbols cache file
Many symbols are currently left undefined because there is no support for parsing the .symbols cache file. This commit adds that support.

`m_symbolInfos` is now a vector of references to `Symbol`s, similar to [this PR](Vector35#6197). This better suits its use as a symbol cache; otherwise a lot of time is spent converting old-style `m_symbolInfos` entries into Binary Ninja `Symbol`s.

This commit requires a metadata version bump, which I felt was necessary in order to determine which symbols to load from the symbols cache. The `m_images` container does not store images in the order they are found in the DSC, yet an image's index in that order determines where its symbols are located in the symbols cache file. Rather than converting `m_images` to a vector and relying on its ordering being correct, it seemed more prudent to store each image's index in the `CacheImage` structure. Because that structure is serialized, the metadata version has to be bumped to accommodate the change.
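The reasoning above can be illustrated with a minimal Python sketch. It is not the commit's C++ implementation: the `CacheImage` fields, the `dsc_index` name, and both helper functions are hypothetical and only show why recording the index at load time avoids depending on container order.

```python
from dataclasses import dataclass


@dataclass
class CacheImage:
    """Hypothetical stand-in for the serialized CacheImage structure."""
    install_name: str
    header_address: int
    dsc_index: int  # position of the image in the DSC's image list (assumed field name)


def build_images(dsc_order):
    """Key images by header address, recording each image's DSC position."""
    return {
        address: CacheImage(name, address, index)
        for index, (name, address) in enumerate(dsc_order)
    }


def symbols_entry_for(image, symbols_entries):
    """Select the per-image entry using the stored index, not container order."""
    return symbols_entries[image.dsc_index]


# Images as they might appear in the DSC; addresses are deliberately unsorted.
dsc_order = [("libB.dylib", 0x3000), ("libA.dylib", 0x1000), ("libC.dylib", 0x2000)]
symbols_entries = ["entry for libB", "entry for libA", "entry for libC"]

images = build_images(dsc_order)
# Iterating by address yields a different order than the DSC image list.
by_address = [images[address] for address in sorted(images)]
```

Because each record carries its own index, the lookup stays correct however the container happens to order its elements.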
WeiN76LQh authored and WeiN76LQh committed Nov 28, 2024
1 parent f990bc8 commit 722f804
Showing 10 changed files with 377 additions and 143 deletions.
131 changes: 123 additions & 8 deletions view/macho/machoview.h
@@ -268,14 +268,129 @@ typedef int vm_prot_t;
#define SEG_UNIXSTACK "__UNIXSTACK"
#define SEG_IMPORT "__IMPORT"

//Symbol Types (N_TYPE)
#define N_UNDF 0x0
#define N_ABS 0x2
#define N_SECT 0xe
#define N_PBUD 0xc
#define N_INDR 0xa

#define N_ARM_THUMB_DEF 0x0008
/*
* Symbols with a index into the string table of zero (n_un.n_strx == 0) are
* defined to have a null, "", name. Therefore all string indexes to non null
* names must not have a zero string index. This is bit historical information
* that has never been well documented.
*/

/*
* The n_type field really contains four fields:
* unsigned char N_STAB:3,
* N_PEXT:1,
* N_TYPE:3,
* N_EXT:1;
* which are used via the following masks.
*/
#define N_STAB 0xe0 /* if any of these bits set, a symbolic debugging entry */
#define N_PEXT 0x10 /* private external symbol bit */
#define N_TYPE 0x0e /* mask for the type bits */
#define N_EXT 0x01 /* external symbol bit, set for external symbols */

/*
* Only symbolic debugging entries have some of the N_STAB bits set and if any
* of these bits are set then it is a symbolic debugging entry (a stab). In
* which case then the values of the n_type field (the entire field) are given
* in <mach-o/stab.h>
*/

/*
* Values for N_TYPE bits of the n_type field.
*/
#define N_UNDF 0x0 /* undefined, n_sect == NO_SECT */
#define N_ABS 0x2 /* absolute, n_sect == NO_SECT */
#define N_SECT 0xe /* defined in section number n_sect */
#define N_PBUD 0xc /* prebound undefined (defined in a dylib) */
#define N_INDR 0xa /* indirect */

/*
* If the type is N_INDR then the symbol is defined to be the same as another
* symbol. In this case the n_value field is an index into the string table
* of the other symbol's name. When the other symbol is defined then they both
* take on the defined type and value.
*/

/*
* If the type is N_SECT then the n_sect field contains an ordinal of the
* section the symbol is defined in. The sections are numbered from 1 and
* refer to sections in order they appear in the load commands for the file
* they are in. This means the same ordinal may very well refer to different
* sections in different files.
*
* The n_value field for all symbol table entries (including N_STAB's) gets
* updated by the link editor based on the value of it's n_sect field and where
* the section n_sect references gets relocated. If the value of the n_sect
* field is NO_SECT then it's n_value field is not changed by the link editor.
*/
#define NO_SECT 0 /* symbol is not in any section */
#define MAX_SECT 255 /* 1 thru 255 inclusive */

/*
* The bit 0x0020 of the n_desc field is used for two non-overlapping purposes
* and has two different symbolic names, N_NO_DEAD_STRIP and N_DESC_DISCARDED.
*/

/*
* The N_NO_DEAD_STRIP bit of the n_desc field only ever appears in a
* relocatable .o file (MH_OBJECT filetype). And is used to indicate to the
* static link editor it is never to dead strip the symbol.
*/
#define N_NO_DEAD_STRIP 0x0020 /* symbol is not to be dead stripped */

/*
* The N_DESC_DISCARDED bit of the n_desc field never appears in linked image.
* But is used in very rare cases by the dynamic link editor to mark an in
* memory symbol as discared and longer used for linking.
*/
#define N_DESC_DISCARDED 0x0020 /* symbol is discarded */

/*
* The N_WEAK_REF bit of the n_desc field indicates to the dynamic linker that
* the undefined symbol is allowed to be missing and is to have the address of
* zero when missing.
*/
#define N_WEAK_REF 0x0040 /* symbol is weak referenced */

/*
* The N_WEAK_DEF bit of the n_desc field indicates to the static and dynamic
* linkers that the symbol definition is weak, allowing a non-weak symbol to
* also be used which causes the weak definition to be discared. Currently this
* is only supported for symbols in coalesed sections.
*/
#define N_WEAK_DEF 0x0080 /* coalesed symbol is a weak definition */

/*
* The N_REF_TO_WEAK bit of the n_desc field indicates to the dynamic linker
* that the undefined symbol should be resolved using flat namespace searching.
*/
#define N_REF_TO_WEAK 0x0080 /* reference to a weak symbol */

/*
* The N_ARM_THUMB_DEF bit of the n_desc field indicates that the symbol is
* a defintion of a Thumb function.
*/
#define N_ARM_THUMB_DEF 0x0008 /* symbol is a Thumb function (ARM) */

/*
* The N_SYMBOL_RESOLVER bit of the n_desc field indicates that the
* that the function is actually a resolver function and should
* be called to get the address of the real function to use.
* This bit is only available in .o files (MH_OBJECT filetype)
*/
#define N_SYMBOL_RESOLVER 0x0100

/*
* The N_ALT_ENTRY bit of the n_desc field indicates that the
* symbol is pinned to the previous content.
*/
#define N_ALT_ENTRY 0x0200

/*
* The N_COLD_FUNC bit of the n_desc field indicates that the symbol is used
* infrequently and the linker should order it towards the end of the section.
*/
#define N_COLD_FUNC 0x0400

/*
* An indirect symbol table entry is simply a 32bit index into the symbol table
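As a rough illustration of how the `n_type` masks added in this hunk combine, the following Python sketch splits an `n_type` byte into its sub-fields. The constants mirror the `#define`s above; the `decode_n_type` helper and its dictionary layout are hypothetical and are not part of the commit.

```python
# Mask values copied from the header definitions in the hunk above.
N_STAB = 0xE0
N_PEXT = 0x10
N_TYPE = 0x0E
N_EXT = 0x01

# Values of the N_TYPE bits, also taken from the header.
N_TYPE_NAMES = {
    0x0: "N_UNDF",
    0x2: "N_ABS",
    0xE: "N_SECT",
    0xC: "N_PBUD",
    0xA: "N_INDR",
}


def decode_n_type(n_type):
    """Split an n_type byte into its fields (hypothetical helper)."""
    if n_type & N_STAB:
        # Any N_STAB bit set means the whole byte is a stab value,
        # so the other masks do not apply.
        return {"stab": True, "raw": n_type}
    return {
        "stab": False,
        "type": N_TYPE_NAMES.get(n_type & N_TYPE, "unknown"),
        "private_external": bool(n_type & N_PEXT),
        "external": bool(n_type & N_EXT),
    }
```

For example, `decode_n_type(0x0F)` describes an external symbol defined in a section, while `decode_n_type(0x24)` is reported as a symbolic debugging entry because one of the `N_STAB` bits is set.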
2 changes: 1 addition & 1 deletion view/sharedcache/CMakeLists.txt
@@ -30,7 +30,7 @@ endif()
set(HARD_FAIL_MODE OFF CACHE BOOL "Enable hard fail mode")
set(SLIDEINFO_DEBUG_TAGS OFF CACHE BOOL "Enable debug tags in slideinfo")
set(VIEW_NAME "DSCView" CACHE STRING "Name of the view")
set(METADATA_VERSION 2 CACHE STRING "Version of the metadata")
set(METADATA_VERSION 3 CACHE STRING "Version of the metadata")

add_subdirectory(core)
add_subdirectory(api)
3 changes: 2 additions & 1 deletion view/sharedcache/api/python/_sharedcachecore.py
@@ -42,6 +42,7 @@ def free_string(value:ctypes.c_char_p) -> None:
BNFreeString(ctypes.cast(value, ctypes.POINTER(ctypes.c_byte)))

# Type definitions
BackingCacheTypeEnum = ctypes.c_int
from binaryninja._binaryninjacore import BNBinaryView, BNBinaryViewHandle
class BNDSCBackingCache(ctypes.Structure):
@property
@@ -110,7 +111,7 @@ class BNSharedCache(ctypes.Structure):
# Structure definitions
BNDSCBackingCache._fields_ = [
("_path", ctypes.c_char_p),
("isPrimary", ctypes.c_bool),
("cacheType", BackingCacheTypeEnum),
("mappings", ctypes.POINTER(BNDSCBackingCacheMapping)),
("mappingCount", ctypes.c_ulonglong),
]
13 changes: 10 additions & 3 deletions view/sharedcache/api/python/sharedcache.py
@@ -52,14 +52,21 @@ def __repr__(self):
@dataclasses.dataclass
class DSCBackingCache:
path: str
isPrimary: bool
cacheType: BackingCacheType
mappings: list[DSCBackingCacheMapping]

def __str__(self):
return repr(self)

def __repr__(self):
return f"<DSCBackingCache {self.path} {'Primary' if self.isPrimary else 'Secondary'} | {len(self.mappings)} mappings>"
match self.cacheType:
case BackingCacheType.BackingCacheTypePrimary:
cacheTypeStr = 'Primary'
case BackingCacheType.BackingCacheTypeSecondary:
cacheTypeStr = 'Secondary'
case BackingCacheType.BackingCacheTypeSymbols:
cacheTypeStr = 'Symbols'
return f"<DSCBackingCache {self.path} {cacheTypeStr} | {len(self.mappings)} mappings>"


@dataclasses.dataclass
@@ -136,7 +143,7 @@ def caches(self):
mappings.append(mapping)
result.append(DSCBackingCache(
value[i].path,
value[i].isPrimary,
value[i].cacheType,
mappings
))
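The `match` statement in the new `__repr__` has no default case, so a value outside the three known cache types would leave `cacheTypeStr` unbound. A table-driven variant with a fallback is sketched below as one possible alternative. The enum mirrors `sharedcache_enums.py` from this commit, while `_CACHE_TYPE_LABELS` and `cache_type_label` are hypothetical names that do not appear in the commit.

```python
import enum


class BackingCacheType(enum.IntEnum):
    # Same members and values as sharedcache_enums.py in this commit.
    BackingCacheTypePrimary = 0
    BackingCacheTypeSecondary = 1
    BackingCacheTypeSymbols = 2


# Hypothetical lookup table replacing the match statement.
_CACHE_TYPE_LABELS = {
    BackingCacheType.BackingCacheTypePrimary: "Primary",
    BackingCacheType.BackingCacheTypeSecondary: "Secondary",
    BackingCacheType.BackingCacheTypeSymbols: "Symbols",
}


def cache_type_label(cache_type):
    """Return a display label, falling back for values the enum does not know."""
    try:
        return _CACHE_TYPE_LABELS[BackingCacheType(cache_type)]
    except ValueError:
        # BackingCacheType(...) raises ValueError for unknown integers.
        return f"Unknown({cache_type})"
```

The helper accepts either an enum member or the raw integer read from the ctypes structure, so `__repr__` could call it directly on `self.cacheType`.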

6 changes: 6 additions & 0 deletions view/sharedcache/api/python/sharedcache_enums.py
@@ -1,6 +1,12 @@
import enum


class BackingCacheType(enum.IntEnum):
BackingCacheTypePrimary = 0
BackingCacheTypeSecondary = 1
BackingCacheTypeSymbols = 2


class DSCViewLoadProgress(enum.IntEnum):
LoadProgressNotStarted = 0
LoadProgressLoadingCaches = 1
2 changes: 1 addition & 1 deletion view/sharedcache/api/sharedcache.cpp
@@ -91,7 +91,7 @@ namespace SharedCacheAPI {
{
BackingCache cache;
cache.path = value[i].path;
cache.isPrimary = value[i].isPrimary;
cache.cacheType = value[i].cacheType;
for (size_t j = 0; j < value[i].mappingCount; j++)
{
BackingCacheMapping mapping;
2 changes: 1 addition & 1 deletion view/sharedcache/api/sharedcacheapi.h
@@ -105,7 +105,7 @@ namespace SharedCacheAPI {

struct BackingCache {
std::string path;
bool isPrimary;
BNBackingCacheType cacheType;
std::vector<BackingCacheMapping> mappings;
};

8 changes: 7 additions & 1 deletion view/sharedcache/api/sharedcachecore.h
@@ -64,6 +64,12 @@ extern "C"
LoadProgressFinished,
} BNDSCViewLoadProgress;

typedef enum BNBackingCacheType {
BackingCacheTypePrimary,
BackingCacheTypeSecondary,
BackingCacheTypeSymbols,
} BNBackingCacheType;

typedef struct BNBinaryView BNBinaryView;
typedef struct BNSharedCache BNSharedCache;

@@ -97,7 +103,7 @@ extern "C"

typedef struct BNDSCBackingCache {
char* path;
bool isPrimary;
BNBackingCacheType cacheType;
BNDSCBackingCacheMapping* mappings;
size_t mappingCount;
} BNDSCBackingCache;