In the first version I wrote an imitation of Position Independent Code,
but it turned out that there is a much simpler and more elegant solution: the one provided by the compiler itself.
Project Version 2.0.0
OpenAPI shares Kernel ( Firmware ) functions with Userware applications
Note: no Memory Management Unit ( MMU )
Suitable for IoT ( WiFi, GSM, LoRa ... etc ) modules
You manufacture IoT modules
They are probably controlled with an external MCU and AT commands - technology from the last century
There is probably free, unused space in their memory
Integrate OpenAPI and your customers will be able to write their own applications ( without external management )
Traditionally called: Position Independent Code ( PIC )
https://en.wikipedia.org/wiki/Position-independent_code
In order for applications to work regardless of their absolute address relative to the kernel, they use:
- Procedure Linkage Table ( PLT )
- Global Offset Table ( GOT )
- Dynamic Linker ( Kernel Procedure )
NOTE: I will use Arduino as an integration example
When you compile your application, the compiler does not know where the shared kernel functions that the application will use are located
for example: the kernel functions pinMode(), digitalRead(), digitalWrite() are in the Kernel and are shared for use
The compiler creates a PLT, or more precisely a set of veneer functions named after the originals:
pinMode@plt, digitalRead@plt, digitalWrite@plt. Their code looks like:
PLT or ram veneers
pinMode@plt: jump got[1]
digitalRead@plt: jump got[2]
digitalWrite@plt: jump got[3]
...
GOT[] or indexed array
got[1] = 0 ? zero: the address of the function is not known yet
got[2] = 0 ?
got[3] = 0 ?
...
and so your application compiles without errors, and the code looks like:
pinMode() --> pinMode@plt: jump got[1]=0 <-- no address
How to launch the app
The kernel loads the application somewhere in RAM and
the Dynamic Linker overwrites the GOT table with the absolute addresses of the shared functions
pinMode() --> pinMode@plt: jump got[1]=0x82001342 <-- real address
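The mechanism above can be imitated on a desktop machine with an array of function pointers. This is only a sketch of the idea, not real PLT/GOT code: kernel_pin_mode(), kernel_digital_read(), kernel_link_app() and got_is_linked() are hypothetical names made up for the illustration, and the "kernel functions" just return numbers so the result can be checked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel functions (not the real Arduino API). */
int kernel_pin_mode(int pin, int mode) { return pin * 10 + mode; }
int kernel_digital_read(int pin)       { return pin & 1; }

/* A GOT slot holds "some function address"; the veneer knows the real type. */
typedef void (*got_entry_t)(void);

enum { GOT_PIN_MODE = 1, GOT_DIGITAL_READ = 2, GOT_SIZE = 3 };

/* The GOT as it looks after compilation: every slot is zero. */
got_entry_t got[GOT_SIZE];

/* Veneers: the role played by pinMode@plt and digitalRead@plt. */
int pinMode_plt(int pin, int mode)
{
    assert(got[GOT_PIN_MODE] != NULL);      /* an unpatched slot would jump to 0 */
    return ((int (*)(int, int))got[GOT_PIN_MODE])(pin, mode);
}

int digitalRead_plt(int pin)
{
    assert(got[GOT_DIGITAL_READ] != NULL);
    return ((int (*)(int))got[GOT_DIGITAL_READ])(pin);
}

/* The "dynamic linker": the kernel writes the real addresses into the GOT. */
void kernel_link_app(void)
{
    got[GOT_PIN_MODE]     = (got_entry_t)kernel_pin_mode;
    got[GOT_DIGITAL_READ] = (got_entry_t)kernel_digital_read;
}

/* Returns 1 once every slot used by the application has been patched. */
int got_is_linked(void)
{
    return got[GOT_PIN_MODE] != NULL && got[GOT_DIGITAL_READ] != NULL;
}
```

The application only ever calls the veneers, so it does not care where the kernel functions really are: calling kernel_link_app() once before the first veneer call is enough.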
Unfortunately, all this uses a lot of resources. More detailed information can be found on the web,
for example: https://eli.thegreenplace.net/2011/11/03/position-independent-code-pic-in-shared-libraries
For example: ARM Cortex M4 with several megabytes of ROM & RAM
Another example: a GSM LPWA NB-IoT module that integrates an Arduino Core,
which is shared for use by Userware Arduino applications
Compilers have a perfect mechanism for compiling Position Independent Code, and we are only required to arrange the code.
When we pass the -fPIC flag, the toolchain adds several sections needed to relocate shared objects:
.rel.dyn, .rel.plt, .dynsym, .dynstr
These are tables that record what is located where in the application.
For example, .rel.dyn is an array (table) with one entry per shared object, with the structure:
typedef struct {
Elf32_Addr r_offset;
Elf32_Word r_info;
} Elf32_Rel;
where r_offset identifies the location to be patched, i.e. the address, in the application address space,
of the slot that holds the address of the shared object (function, variable...),
and r_info holds the relocation type ( low 8 bits ) and the object's index in the ELF Symbol Table ( upper 24 bits )
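As a small illustration of how r_info is packed, here is a sketch that builds and decodes it. The split (symbol index in the upper 24 bits, type in the low 8 bits) and the two relocation numbers come from the ELF specification for ARM; the helper names rel_make_info(), rel_symbol_index() and rel_type() are my own for this example.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Elf32_Addr;
typedef uint32_t Elf32_Word;

typedef struct {
    Elf32_Addr r_offset;   /* where to patch */
    Elf32_Word r_info;     /* symbol index << 8 | relocation type */
} Elf32_Rel;

/* Relocation types from the ARM ELF specification. */
#define R_ARM_GLOB_DAT  21u   /* variables */
#define R_ARM_JUMP_SLOT 22u   /* functions */

/* Hypothetical helper: pack a symbol index and a type into r_info. */
uint32_t rel_make_info(uint32_t sym_index, uint32_t type)
{
    assert(sym_index <= 0xFFFFFFu && type <= 0xFFu);
    return (sym_index << 8) | type;
}

/* Index of the object in the ELF Symbol Table ( .dynsym ). */
uint32_t rel_symbol_index(const Elf32_Rel *rel)
{
    return rel->r_info >> 8;
}

/* Patch type, e.g. R_ARM_JUMP_SLOT. */
uint32_t rel_type(const Elf32_Rel *rel)
{
    return rel->r_info & 0xFFu;
}
```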
Detailed information can be found on the web...
As mentioned above, all we have to do is arrange our code, and for this we need a modified two-part linker script.
The second part is a normal script, as for a static application.
The first part begins with a header: information about the application and the addresses of certain sections.
These are the standard .data and .bss, needed to initialize our variables,
as well as the additional Position Independent Code tables, which we place immediately after the header
SECTIONS
{
. = ROM_BASE;
/***** HEADER *****/
.initdata :
{
LONG( 0xFECAFECA ); /* APP MAGIC */
LONG( 0x12345678 ); /* API VERSION */
LONG((ABSOLUTE(app_entry + 1))); /* APP ENTRY */
LONG( APP_STACK ); /* APP STACK optional */
LONG((ABSOLUTE( _data_load ))); /* copy .data */
LONG((ABSOLUTE( _data_start )));
LONG((ABSOLUTE( _data_end )));
LONG((ABSOLUTE( _bss_start ))); /* clear .bss */
LONG((ABSOLUTE( _bss_end )));
LONG((ABSOLUTE( _rel_dyn ))); /* ELF Relocation Table : Elf32_Rel */
LONG((ABSOLUTE( _dyn_start ))); /* ELF Symbol Table : Elf32_Sym */
LONG((ABSOLUTE( _str_start ))); /* ELF String Table : const char * */
/* reserved, align 64 bytes */
LONG(0);
LONG(0);
LONG(0);
LONG(0);
} > FLASH
.rel.dyn : /* ELF REL Relocation Table : Elf32_Rel */
{
_rel_dyn = .;
*(.rel.dyn)
_rel_dyn_end = .;
} > FLASH
.rel.plt : /* ELF JMPREL Relocation Table : Elf32_Rel */
{
_rel_plt = .;
*(.rel.plt)
_rel_plt_end = .;
} > FLASH
.dynsym : /* ELF Symbol Table : Elf32_Sym */
{
_dyn_start = .;
*(.dynsym)
_dyn_end = .;
} > FLASH
.dynstr : /* ELF String Table */
{
_str_start = .;
*(.dynstr)
} > FLASH
.................
APP MAGIC and API VERSION tell the kernel that this is indeed a userware application.
APP ENTRY is the entry ( "reset" ) vector of the application itself, the address from which the kernel launches it.
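On the kernel side the header can be read through a struct that mirrors the LONG() entries of the linker script in the same order. This is a sketch under assumptions: the names app_header_t and app_header_is_valid() are hypothetical, the ROM window passed in is whatever the integrator reserves for applications, and I assume the "+ 1" on app_entry is the Thumb bit, as is usual on Cortex-M.

```c
#include <assert.h>
#include <stdint.h>

#define APP_MAGIC    0xFECAFECAu
#define API_VERSION  0x12345678u

/* Mirrors the .initdata header, one field per LONG() in the linker script. */
typedef struct {
    uint32_t magic;        /* APP MAGIC */
    uint32_t api_version;  /* API VERSION */
    uint32_t entry;        /* app_entry + 1 (assumed: bit 0 = Thumb state) */
    uint32_t stack;        /* APP STACK, optional */
    uint32_t data_load;    /* copy .data */
    uint32_t data_start;
    uint32_t data_end;
    uint32_t bss_start;    /* clear .bss */
    uint32_t bss_end;
    uint32_t rel_dyn;      /* Elf32_Rel table */
    uint32_t dynsym;       /* Elf32_Sym table */
    uint32_t dynstr;       /* string table */
    uint32_t reserved[4];  /* pads the header to 64 bytes */
} app_header_t;

static_assert(sizeof(app_header_t) == 64, "header must match the linker script");

/* Returns 1 if the header looks like a userware application whose entry
   point lies inside the ROM window [rom_base, rom_base + rom_size). */
int app_header_is_valid(const app_header_t *h, uint32_t rom_base, uint32_t rom_size)
{
    if (h->magic != APP_MAGIC || h->api_version != API_VERSION)
        return 0;
    if ((h->entry & 1u) == 0)          /* assumed Thumb entry point */
        return 0;
    uint32_t entry = h->entry & ~1u;
    return entry >= rom_base && entry - rom_base < rom_size;
}
```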
All that's left is to trick the compiler into importing the shared objects, which is very easy.
The compiler only needs enough information to construct the relocation tables.
We create a single C file that lists every shared object without its real type, just as an empty void foo(void) function,
and compile it as a -fPIC library
void api_syscall(void){}
void api_malloc(void){}
void api_realloc(void){}
void api_calloc(void){}
void api_free(void){}
....etc, all shared objects
CC_OPTIONS=-mcpu=cortex-m4 -mthumb -msingle-pic-base -fPIC -Wall
GCC_PATH=C:/Users/1124/.platformio/packages/toolchain-gccarmnoneeabi/bin/
all:
@echo * Creating OpenAPI PIC Library
$(GCC_PATH)arm-none-eabi-gcc $(CC_OPTIONS) -g -Os -c OpenAPI-shared.c -o OpenAPI-shared.o
$(GCC_PATH)arm-none-eabi-gcc -shared -Wl,-soname,libopenapi.a -nostdlib -o libopenapi.a OpenAPI-shared.o
We have the linker script and the library. What remains is to compile the application like a normal application,
but with the -fPIC flag
Something complicated?
Kernel Application loader
The loader is a few simple functions: check for a valid app, initialize .data & .bss, and relocate the shared objects.
The check in this example is simple: we check whether a constant address in the flash contains
APP MAGIC and API VERSION, and range-check APP ENTRY.
Then we copy the initial values into the .data section and zero the .bss section.
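The .data / .bss step is the same as in any startup code. A minimal sketch, assuming the five addresses have already been read from the header and converted to pointers ( the function name app_init_memory() is hypothetical ):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy the initial values of .data from flash to RAM and zero .bss.
   All pointers correspond to the _data_load, _data_start, _data_end,
   _bss_start and _bss_end entries of the application header. */
void app_init_memory(const uint8_t *data_load,
                     uint8_t *data_start, uint8_t *data_end,
                     uint8_t *bss_start,  uint8_t *bss_end)
{
    assert(data_end >= data_start && bss_end >= bss_start);
    memcpy(data_start, data_load, (size_t)(data_end - data_start));
    memset(bss_start, 0, (size_t)(bss_end - bss_start));
}
```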
Relocation is not a big deal either: this solution doesn't use an MMU, so we don't need complex address recalculation for the shared objects.
For each entry in the ELF REL Relocation Table we take the symbol index, look up its name, and check ( strcmp() ) whether the kernel shares the requested object.
If it does, we overwrite the word at r_offset with the address of the real function:
switch (REL->r_info & 0xFF)
{
case R_ARM_GLOB_DAT: // variables
case R_ARM_JUMP_SLOT: // functions
*(uint32_t *)REL->r_offset = function_address; // replace address
break;
default:
PRINTF("[ERROR][API] REL TYPE: %d\n", REL->r_info & 0xFF);
return -12;
}
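To show the whole loop around that switch, here is a host-side sketch that can be run on a PC. It is not the loader from the repository. The export table, api_lookup(), app_relocate() and the -1 error code are assumptions for illustration ( only -12 comes from the fragment above ), and r_offset is treated as a byte offset into a simulated image, while on the target it is an absolute address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint32_t r_offset; uint32_t r_info; } Elf32_Rel;

typedef struct {
    uint32_t st_name;    /* offset of the name in the string table */
    uint32_t st_value;
    uint32_t st_size;
    uint8_t  st_info;
    uint8_t  st_other;
    uint16_t st_shndx;
} Elf32_Sym;

#define R_ARM_GLOB_DAT  21u   /* variables */
#define R_ARM_JUMP_SLOT 22u   /* functions */

/* Hypothetical kernel export table: what the firmware agrees to share. */
typedef struct { const char *name; uint32_t address; } api_export_t;

static const api_export_t exports[] = {
    { "api_malloc", 0x82001342u },   /* made-up addresses */
    { "api_free",   0x82001400u },
};

/* Returns the shared address, or 0 if the kernel does not share that name. */
uint32_t api_lookup(const char *name)
{
    for (size_t i = 0; i < sizeof exports / sizeof exports[0]; i++)
        if (strcmp(exports[i].name, name) == 0)
            return exports[i].address;
    return 0;
}

/* Patch every relocation entry. image simulates the application memory. */
int app_relocate(uint32_t *image, const Elf32_Rel *rel, size_t count,
                 const Elf32_Sym *sym, const char *str)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t type  = rel[i].r_info & 0xFFu;
        uint32_t index = rel[i].r_info >> 8;
        if (type != R_ARM_GLOB_DAT && type != R_ARM_JUMP_SLOT)
            return -12;                           /* unsupported REL TYPE */
        uint32_t address = api_lookup(str + sym[index].st_name);
        if (address == 0)
            return -1;                            /* object is not shared */
        assert(rel[i].r_offset % 4 == 0);
        image[rel[i].r_offset / 4] = address;     /* replace address */
    }
    return 0;
}
```

Stopping at the first unknown name means an application that asks for something the kernel does not share is never started, which also gives the kernel control over what the application can reach.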
And finally we call APP ENTRY
That's all Folks...
NOTE:
The OpenAPI folder contains examples from a complex SDK Platform that I cannot share due to license restrictions
Basic and simple ! ( watch on YouTube )
Resources:
- RAM & ROM: very little, only for the shared functions ( for Arm or Thumb32 )
- Kernel control
- Hidden information
- Option for signed applications
- as example: Arduino peripheral functions: 36 ( GPIO, EINT, ADC, PWM, UART, I2C, SPI )
- etc
The examples in the folders have been tested with
- Mediatek MT2625 - GSM LPWA NB-IoT SoC ( ARMv7 Cortex M4 )
- Mediatek MT2503 - GSM GPRS SoC ( ARMv6 )
- Can be used with any ARM SoC modules...
Similar to OpenAPI is OpenCPU:
- Chinese author - unknown,
- It is used by Quectel - a manufacturer of GSM modules
- Unfortunately it is closed source, with peculiarities in how code must be written and an SDK that has stayed underdeveloped over the years
and sorry for my bad English...