Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework of the lua source hierachy to support a unified apporach to ESP8266 and ESP32 #1661

Open
TerryE opened this issue Dec 10, 2016 · 66 comments

Comments

@TerryE
Copy link
Collaborator

TerryE commented Dec 10, 2016

Background

The Lua source hierarchy is currently based on the following fork tree:

  • Standard Lua 5.1.5 which is designed to compile on any POSIX toolchain and released in Feb 2012.

  • eLua is based on 5.1.5 but has a number of enhancements and changes to optimise it for embedded use, including use of newlib toolchain and a set of template hardware drivers. The main enhancements to the Lua core were in the inclusion of an Emergency Garbage Collector (EGC) implementation of lightweight C functions and ROM-based tables.

  • NodeMCU was again a fork of eLua by Zeroday, but because the non-OS SDK structure and execution models where quite different from the eLua assumptions, NodeMCU really only used the Lua core components. Also the SDK interface is quite different to newlib one, so this fork also introduced a lot of code changes (e.g. the string library calls were replaced by the equivalent SDK cstring ones.)

Issues

  • Since the initial 1.x versions of NodeMCU, we have subsequently addressed many of the source level conflicts between newlib and the SDK libraries which in turn mean that many of the original nodeMCU changes are no longer needed. Moreover

  • The ESP32 IDK is newlib-based and layered on RTOS and therefore is incompatible with these initial NodeMCU changes.

  • The subsequent Lua versions have incorporated many useful components, for example 5.2 incorporates the eLua ECG and lightweight C functions, and therefore provides most of what we use from eLua. The only additional bit that we use is ROTables and we are currently investigating reimplementing this because of the performance hit on the ESP flash architecture. 5.3 supports integers as a native type alongside floating point. The eLua project itself is now pretty moribund (only 11 new threads on the eLua DL during 2016).

Whilst we have arrived at our current NodeMCU Lua code base in a set of logical steps, the reality is that this could be viewed as a dead-end from a maintenance perspective. I believe that we should now take stock and decide together how we approach a scenario where we wish to maintain a common Lua code base for both NodeMCU ESP variants.

  • Do we believe that is is worth maintaining a common Lua code base for the two projects?
  • Do we go back towards the newlib eLua reverting no longer needed changes?
  • Do we move forward onto the 5.2 or 5.3 code?

More thought and discussion is needed, but my view is that this should be around options to maintain a common code base.

@djphoenix
Copy link
Contributor

  1. I really prefer to maintain as much common code as possible, because (I believe) sooner or later Espressif may release IDK for ESP8266-RTOS.
  2. I vote up for update lua codebase to latest one (maybe 5.3), this have very good enchancements. Also I prefer to get it clean as possible, for ability to update it when new patches or versions released.

@TerryE
Copy link
Collaborator Author

TerryE commented Dec 12, 2016

Hummm, things get a little more complicated with 5.3.3. For a start it has proper 32-bit support as an option. Lua 5.1.5 uses a unified number type which is a 64 bit floating point and which can accurately represent integers up to 2^52, IIRC. With Lua 5.3.3 32-bit build this instead uses separate types for int32 and 32 bit FP. OK, we would lose the exact large integer support for integers > 2^31, but this a lot faster on the 8266, and a lot faster again on the ESP32 since this has H/W support for 32-bit FP.

@jmattsson
Copy link
Member

That does sound tempting considering the hardware we're running on.

@TerryE
Copy link
Collaborator Author

TerryE commented Dec 13, 2016

The only issue is that there seem to be bleats about increased memory usage in 5.3. Still researching :(

PS. I can't see any obvious changes other than strings are now only interned if 40 bytes or shorter. In fact there are now two string types: short strings which are unique and long strings which aren't. I guess this is to avoid the cost of hashing long strings, since it is assumed that these are rarely duplicated. But note that they are still copied by reference and hence can be reused, it's just that long strings aren't put into the string hash table, so

local s = '012345678901234567890123456789'
local a = s .. s
local b = b
local c = s .. s

will now create two identical 60 character strings '012...789', one used by a and b, and a second by c.

The reasons for this are pragmatic. The hash function is O(N) and the chances of two long string being accidentally identical are small so there are strong arguments for only interning strings under a length threshold.
And a LRU cache has been added for accessing C strings to avoid hashing short Cstrings in the case of a cache hit, but a bit bizarre implementation.

@TerryE
Copy link
Collaborator Author

TerryE commented Dec 13, 2016

Having done a bit of the work, and though about this some more overnight, I feel that the next step is for me to add a rotables mod into the 5.3.3 build, so that it can be built both as a POSIX standalone (under x86) and under the xtensa toolchain with a tweaked node library so that it can also run as a firmware image on the ESP8266 (or ESP32) for evaluation and to inform discussion as to whether we should proceed with a proper integration into an SDK 2.x build. I'll set up a clean repository so that others can track this.

@TerryE
Copy link
Collaborator Author

TerryE commented Dec 17, 2016

A quick update.

I am getting to grips with Lua version 5.3.3 and the differences from 5.1.5. Perhaps the main one is that a lot of optimisations and enhancements to support embedded implementations are now in the core -- such as the EGC. The one big omission is no support for rotables. What I have done so far is to add support for RO TString constants, with some extra TS variants of get and set methods so that you can use them in the C API. So now LUA_TS(__index) in the code is a TString * to the extern in ROM which is the TString for "__index". The ROstrt is in the searchlist for creating new strings so references to __index in any compiled Lua code will pick up the same TString instead of creating a new one in RAM. Hence string comparisons use a pointer comparison, occasionally dropping down to a memcmp (since now the length of both strings are known) rather than strcmp(), so less entries in the RAM G[L]->strt and in principle a lot less unaligned ROM access exceptions.

The ROstrt is generated by a preprocess of the code which detects these LUA_TS() usages and generates a c header file and an assember file which contains the static initialisers. It will take me another couple of days to get this working fully on the x86 platform. At this stage, I'll push a copy to my github repo for anyone interested to review.

@TerryE
Copy link
Collaborator Author

TerryE commented Dec 27, 2016

Another quick update. I've got Lua 5.3 working with ROstrt initialised in the .text segment with the 303 common TStrings used in the compiler and core libraries, with the extra API methods to use TString parameters directly when appropriate. Need to add the rest of the ROtable patch next and extend the test suite. Next build will be on a 32-bit Ubuntu VM with 32 ints and floats. Upwards and onwards.

@TerryE
Copy link
Collaborator Author

TerryE commented Jan 6, 2017

I've done a rough cut of my working paper documenting progress in this gist: LROR (Lua Read-Only Resources) in Lua 5.3. That's on top of fitting my new kitchen and becoming a grandfather (though the labour on this one was down to my daughter!) Any general comments then feed them back here 😄

I am currently hammering the table implementation on an x86 build.

@jmattsson
Copy link
Member

Two comments (other than "good job!") so far:

  1. Pretty please with sugar on the top, don't #include C_HEADER_STDIO in the final stuff; Let's do the cleanup on the ESP8266 side rather than pollute the POSIX and ESP32 sides. It's largely the custom-modded eLua stuff which needed the c_stdio.h stuff in the first place, and if we're replacing it all, we really shouldn't need it.
  2. Is there a version of LROR_WORD(x) than can handle strings with non-C-identifier symbols in them? LROR_WORD(x and y, with some z!) clearly wouldn't give a valid extern declaration. Or are LROR strings limited in this regard?

@TerryE
Copy link
Collaborator Author

TerryE commented Jan 6, 2017

@jmattsson, picking up these:

  1. I went through some of this with the luac.cross port. The issue that we have with GCC is that the search list for system includes is configured in the compiler build, and it got very convolved. The xtensa toolchain is based on newlib libraries, which are different to the standard Linux libraries, and the non-OS SDK ones are different again. I agree that we might be able to back-port most of these SDK changes into ESP8622 toolchain newlib libraries, but there will still be some incompatibilities between this and the Linux set. At the end of the day, IMO it is more important that we have a single code-base for all three targets (ESP8266, ESP32 and Linux) and this preprocessor define trick works fine. This comes into the category of polish rather than show-stopper.
  2. Yup LROR_STRING(does_not_compute, "Does not compute"). The same string can can be declared twice with different symbolic names and this will be handled, but the same symbolic name can't be used with two strings.

@jmattsson
Copy link
Member

  1. It shouldn't be too bad. As the first stage, simply renaming our c_whatever.h files to whatever.h and using -nostdinc will preference our "standard C" headers. Adding a -isystem "$(xtensa-lx106-elf-gcc -print-sysroot)/usr/include" will get the compiler to find the HAL etc (and any non-overridden headers). And yes, while in some ways it's polish, it's also about reducing the number and extent of Lua source files that have to be modified. Feel free to ping me when you're at a stage you're wanting to start integrating and I'll try to get some time to lend a hand (which I probably should for the ESP32 side anyway).
  2. Nice!

@TerryE
Copy link
Collaborator Author

TerryE commented Jan 7, 2017

@jmattsson Johny, I am a little less concerned about the issue of minimising the changes. First step is t evaluate viability and benefits. And if we have a go then we review some of the implementation strategy. I did this with the LCD patch and did a master diff before the PR. I went through this diff and in a few areas reworked the modification to minimise the code change.

Once I have a clean version which passes my extended test suite, then I think that we're at the point where I can push the branch for review and evaluation.

@jmattsson
Copy link
Member

Oh yes, you might've noticed I said "in the final stuff" in regard to point 1.

It's all sounding very promising so far though. This could be as big a RAM improvement as when we originally managed to get the C-string constants moved into flash. No pressure... ;)

@TerryE
Copy link
Collaborator Author

TerryE commented Feb 7, 2017

Just another quick update. I am still plugging away at this making steady progress. It's just that I am time-slicing it with little jobs like fitting the new kitchen in my new house and doing the plumbing. There are all sorts of subtleties involved here, so I am not rushing and doing lots of brood time so the changes don't compromise some of the nice features of 5.3 and also don't cause unacceptable performance hits.

A little example is that one of the things that you want to do for example it to set the metatable for a ROtable using the normal API. You do this if you want to extend string or table, for example. But the metatable field in standard Lua is in a field in the Table structure. You can't do this in the case of ROtables since you need this one field to be RW, so you need to store it the registry. However, you don't want to do this in general since the ability to override almost every data type in 5.3 means that this is checked a lot at low level so for RWtables, you need to keep the field in the Table record for RW variants. Hey-Ho.

Anyway, I've updated the White Paper in my Gist with current progress. As soon as I've got a standalone working compliant to the paper, I'll upload it to my repo.

@TerryE
Copy link
Collaborator Author

TerryE commented Mar 26, 2017

@flip111 I am multiplexing a lot of issues at the moment. A straight port of Lua 5.3 would be reasonably easier but would not perform as well, nor ave anywhere as near as small a RAM footprint as our currently esp8266-optimised 5.1 version. So I have to merge in these optimisation features and add a few more because the extra Lua 5.3 feature set comes with an extra RAM footprint. So slow progress.

Read the working paper linked above :)

@flip111
Copy link
Contributor

flip111 commented Mar 26, 2017

Very interesting how this is evolving, good work @TerryE

@TerryE
Copy link
Collaborator Author

TerryE commented Apr 26, 2017

@jmattsson @pjsg, I am loosing the plot!! I've discussed backporting these enhancements into the current NodeMCU lua 5.1.5 implementation but I can't for the life of me remember where. Anyway, I've done the first cut to compile clean and am in the middle of testing. But on-chip testing is a bitch. Have either of you got a good primer on using the remote gdb stub. Do you use it?

Anyway, I've found the easiest way is to work on luac.cross which I can debug using regular gdb on the laptop. This has been a very useful exercise. If one of you guys can point me to the correct issue / PR then I'll give a fuller update there.

@pjsg
Copy link
Member

pjsg commented Apr 26, 2017

The gdbstub is definitely not perfect, but there is some documentation in the gdbstub module doc. It is really good for catching crashes and poking around..... If you

#define GDBSTUB_BREAK_ON_INIT 1

then it will enter gdb fairly early on (as it is initializing the lua modules). This may be too late, in which case you probably want to move the gdbstub_init() call to somewhere rather earlier in th eboot process.

Also apologies for not being engaged for the last month or so, I've been rather under the weather...

@jmattsson
Copy link
Member

@TerryE Are you referring to https://gist.github.com/TerryE/8afa5022042291b8add1ff3886f6c014 that you linked above?

@TerryE
Copy link
Collaborator Author

TerryE commented Apr 28, 2017

I've used my luac.cross technique to build a stripped done NodeMCU lua core VM running on the host so I can debug the core code on my laptop. Bizarre but a lot quicker and easier than using gdbstub. Onwards!!

@TerryE
Copy link
Collaborator Author

TerryE commented Apr 29, 2017

I am pretty close to having an evaluation version of the NodeMCU Lua 5.1.5 evaluation version ready. As I eluded to earlier this is based on a backport of the RO TString technology that I developed for my Lua 5.3 port, but leaving in the current ROTable implementation rather than trying to implement true RO Lua Table types.

This has been a very useful exercise because it has allowed me to appreciate some issues in concrete form.

Having a common code base

We should have a single Lua core code base that will compile to the host platform (gcc compatible *nix), esp8266 and esp32.

As I mentioned above I already have a version of the NodeMCU lua interpreter running on my dev VM on my laptop. OK, the libraries are different (e.g. io and os work, but not node or file) but we can converge on this later. Having a host-runnable means that full gdb feature are available for testing as well as the Lua test suite, etc. And minimising code variants means that we can have increased confidence that bugs that have been exercised in host testing are also removed from the esp8266 and esp32 variants.

Just to be clear this host version has ROTables, and RO TStrings; it's just that they lie between _etextand _edata rather than in some defined Flash memory address space. Code compiled with this host luac will run on the esp chips as well as code dumped from this host lua*.

What I find really irritating about the NodeMCU code is that it is full of crap, for example chunks of source #if 0 commented out and conditional on defines that we no longer support (for example NodeMCU must be built with full ROtable support). It's probably not worth clearing this out, though to be honest when I am debugging the core code and find this crap is confusing things then I have started to remove it.

However what is very clear to me is that when we come to the Lua 5.3 port we should have a very clear set of objective which then give rise to a set of change templates or patterns, but that we should only make changes above and beyond this that are clearly documented.

Getting the peephole optimisation right

There are some aspects of our Lua implementation which cause a noticeable performance hit for no good reason and I really think that we should address these. In particular, access ROM based readonly tables and elements currently is a lot slower than RAM based ones.

Interim 5.1 update or straight to 5.3?

To be honest this 5.1 version was a pretty important step for me. It will run faster than the current version of the core code; I could extend it so that the ESP32 version could just use this rather than its own variant. It is certainly worth considering a proper NodeMCU branch for its evaluation, but is it worth promoting to dev (at some point after enough testing) or should we just plan to go direct to 5.3?

The points about 5.3 from a developer PoV is that it is not 100% code-compatible with Lua 5.1. It supports separate float and int number types and on the ESP chips, it makes sense to make these both 32 bit, especially on the ESP 32 where these are both supported in H/W . But this will lose the ability to do 52-bit integer arithmetic, so we might need to add a in64 library.

@flip111
Copy link
Contributor

flip111 commented Apr 29, 2017

Better proof first that ESP8266+ESP32 can be combined into 1 firmware before adding lua 5.3 support

@TerryE
Copy link
Collaborator Author

TerryE commented Apr 29, 2017

@flip111, you misread this goal. We will never have one firmware because the two builds target different generation processors and have different underlying platform APIs. The goal here is to have a single Lua core source base and at the moment, the two branches are different. If I can get the same Lua source running on the esp8266 core and on a 64bit Intel core then I can certainly get the same source base running on esp32.

@flip111
Copy link
Contributor

flip111 commented Apr 29, 2017

Seems you are confident that the 5.1 update will work, maybe go for 5.3 then.

@TerryE
Copy link
Collaborator Author

TerryE commented Apr 29, 2017

Yup, but the Lua 5.1 has advantages performance and inter-platform, and is fully source code compatible, so as long as it works, then we could promote it to dev. The move to 5.3 really merits a wider discussion and buy--in from the committers.

@jmattsson
Copy link
Member

I'm on the fence regarding 5.1-upgrade vs straight-to-5.3. On the one hand I don't like the extra work required to get the interrim 5.1 in, but on the other hand it would likely be able to go in sooner, which would benefit everyone. As long as there's a reasoned argument, I could support either approach. I would very much like to see the Lua core being cleaned up though - as you've noticed, we've got a metric truck ton of cruft in there. Hopping straight to a 5.3 might be the least painful option.

Regarding integer types, we could perhaps do a userdata int64 with __add etc meta-methods? Would it be possible to auto-promote to such a userdata type when needed? Checking add & mul on the int type should allow for catching overflows and trigger promotion, but I don't know how feasible it would be to augment the 5.3 codebase.

@TerryE
Copy link
Collaborator Author

TerryE commented May 9, 2017

I am still testing the 5.1 backport on a dev build to put it through the Lua test suite clean. I've also done a full code review using meld against a vanilla Lua 5.1 and have made the following observations which swing my view of the one step vs two step path (straight to 5.3 vs 5.1 upgrade then 5.3).

  • The eLua variant add support for the EGC, lightweight C functions and rotables as well as other cross support options such endian and packed vs unpacked support.
  • IMO, the rotables implementation is a bit of a hack as it is very run-time inefficient (lots of linear strcmp scans of flash), and all rotables args are void * types so there is no compile-time validation of a lot of stuff that should be checked at compile-time, which makes testing changes fraught. Also the way of implementing lightweight functions doesn't follow standard Lua coding patterns.
  • NodeMCU uses the eLua Lua core variant for the EGC, lightweight C functions and rotables functionality, but the xtensa architecture doesn't need the endian and packed stuff. (It also discards all of the eLua platform and driver implementations.)
  • Standard Lua 5.3 has added EGC and lightweight function support as well as other changes to support embedded processors not included in eLua (separate integer and float sub-types and the ability to force the number size to 32-bits, for example). However, this is a re-implementation using standard coding patterns.
  • One of the architectural changes that 5.3 implements to do this is to split the ttt type field in Lua Values and Garbage Collectables, into a 4-bit bit field and 4-bit subtype. The following types are sub-typed:
    • Numbers have separate integer and float subtypes
    • Strings have separate interned and non-interned subtypes. Strings of 40 bytes or less are interned. Long ones aren't. This is for performance: hashing is a function of length and in the majority of implementations (especially embedded) the chances of a duplicate instances of a long string is extremely small.
    • Functions have separate full and lightweight subtypes.
  • Lua 5.3 also adds a new separate GC category "don't collect" which I will use to tag all RO resources.
  • In eLua rotables are a separate type that is handles completely differently to normal tables, and hence this implementation leaks out into a lot of code changes in ltable.c, lvm.c, etc.
  • My proposed RO table implementation makes Tables a fourth sub-type type by splitting tables into two separate RW and RO table subtypes, and the difference is only in the low level access routines, and is so is well encapsulated. The RO table implementation internally is still a vector list as with the current eLua rotables implementation, but access is through a look-aside cache so 95% of the RO table access are direct.

So my conclusion is that the changes to get from standard Lua 5.3 to a NodeMCU 5.3 are lot less than cleaning up my current 5.1 WIP version. And in my experience the smaller the change, the less bugs get introduced so the more stable the implementation. So I now propose to release a 5.3 evaluation version next.

Even so, doing the 5.1 exercise has been valuable in that it has forced me to think through the API issues in minimising the impact on the rest of the NodeMCU ecosystem, and has also cleaned up some aspects of my implementation.

@jmattsson
Copy link
Member

👍 Nice summary, thanks!

@TerryE
Copy link
Collaborator Author

TerryE commented Nov 25, 2017

@szmodz, I've been somewhat sidetracked by building a new house. It looks like we've now got a completion date before Christmas for the sale of our old one, so everything is panic mode at the mo.

This work is on hold until I am Inthe new house. My next priority is to get the Lua Flash Store mod into Alpha test.which I am still trying to fit in before the move.

@TerryE
Copy link
Collaborator Author

TerryE commented Apr 10, 2019

Gosh. Time passes and we haven't done this. I did do this are part of my abandoned Lua 5.3 port. I have debated the pros and cons of the putative upgrade to 5.3 myself. My main concern is that 5.3 actually buys us very little in features or performance over our current 5.1 implementation, but will introduce some backwards compatibility issues that might break stable existing applications.

If we don't adopt 5.3 then another alternative is to move the existing 5.1 code base to a common code set that will work unmodified on both the ESP8266 and ESP32 platform variants. That we we can move the various enhancements that I have implemented for the ESP8266 onto the ESP32. Thoughts?

@jmattsson
Copy link
Member

My view is that we/the project will be better off long-term going to 5.3. Clinging to old versions often cause more pain in the end than if you were to have gotten it over and done with early. Besides, those of us who want to stick to the "tried and true" also tend to not upgrade the NodeMCU base often either, so for major upgrades like this I feel some lack of backwards compatibility is reasonable. There have also been a number of bug fixes since 5.1.

And, many of the user-visible changes we could provide shims for during a transition period if we so wish. That's my AUD$0.02 :)

@TerryE
Copy link
Collaborator Author

TerryE commented Apr 14, 2019

If we do want to have a migration path to Lua 5.3 then we need to sort out incompatibilities in our module APIs and Lua C API. Perhaps the biggest divergence is the eLua implementation for ROTables. Here a table variable wither points to a Table record in RAM or a ROTable vector in Flash. The latter vector implementation involves serially scanning the vector and this is inefficient, doubly so when the vector is in Flash and cold accesses are at least 20 times slower than RAM ones.

PR #2505 gave a big performance improvement for table-hit accesses but table miss accesses still require a full table scan. I will be raising a new issue covering the implementation and migration to a Lua 5.3 compatible RO Table implementation. I see this as an important step in preparing for any Lua 5.3 implementation.

@stale
Copy link

stale bot commented Apr 8, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 8, 2020
@TerryE
Copy link
Collaborator Author

TerryE commented Apr 10, 2020

Still valid

@pjsg
Copy link
Member

pjsg commented Feb 20, 2021

Now that we have the LUA5.3 migration under our belts, I think that it is time to revisit the ESP32 situation. My reading of the tea leaves is that the ESP32-C3 is going to replace the ESP8266 in the very low end devices. It is just a more modern part with much more ram and better I/O.

I have the ESP32 branch now up and running on this new part (at least the file system is up and running as are several other modules). The network doesn't build, but the wifi module seems to do some of the right things.

However, the ESP32 code base has diverged significantly from the ESP8266 code base. We need to have a plan to migrate to a common source tree with stubs/build options so that we can build for the existing non-RTOS environment and for the RTOS environment for the ESP32 chips. One issue is that the ESP-IDF V4 uses cmake to build and the whole system has a fairly rigid file layout. I don't think that it would be that difficult to use cmake for the ESP32 chips and regular make for the ESP8266.

The ESP32C3 comes up with 230k of heap to start with and I haven't done anything to try and improve that number!

Thoughts on a path forwards?

@jmattsson
Copy link
Member

@pjsg I did draw up a plan for the merging of the 8266 and 32 branches quite some time ago; please see internal discussion thread 9 - "Proposal for way towards merging the 8266/32 branches". Unfortunately, between my lack of time and the major 5.3 work that I didn't want to interfere with, only step 1 has been executed on. And now I don't know whether that would be compatible with the 4.0 IDF on the esp32/C3 side?

@pjsg
Copy link
Member

pjsg commented Feb 20, 2021

@jmattsson I can't actually find that thread -- I can find https://github.com/nodemcu/nodemcu-firmware/issues/1319 

The principal differences between the esp32-c3 and the regular flavor are: 

  • single core
  • risc-v rather than tensilica

The ESP-IDF4:

  • cmake based

Also, now are strongly discouraged from talking to devices (e.g. the uarts) directly -- should go through the RTOS. 

@pjsg
Copy link
Member

pjsg commented Feb 20, 2021

The changes to make it build under the IDF v4 are pretty small -- though there is one thing that I don't know how to do -- I had to patch an IDF file to make it all work (the ld control file).

The good news is that the Espressif people are being very responsive with esp32c3 issues -- e.g. espressif/openocd-esp32#144 and espressif/esp-idf#6561

@jmattsson
Copy link
Member

Wait, they still have an issue with overriding the linker file?? I raised an issue on that like two years ago I thought!

@igrr
Copy link

igrr commented Feb 21, 2021

Hi @pjsg, could you describe the change you had to do in the linker script?

@marcelstoer
Copy link
Member

see internal discussion thread 9

@pjsg https://github.com/orgs/nodemcu/teams/firmware/discussions/9

@pjsg
Copy link
Member

pjsg commented Feb 21, 2021

@igrr In my case, I just wanted to include some LD config directly. I added the single INCLUDE line below. It uses an absolute path as I wasn't sure what the working directory was going to be! Obviously this isn't the right approach:

This is in the esp32-esp-idf/components/esp32c3/ld/esp32c3.project.ld.in file:

  .flash.rodata : ALIGN(0x10)
  {
    _rodata_start = ABSOLUTE(.);

    *(.rodata_desc .rodata_desc.*)               /* Should be the first.  App version info.        DO NOT PUT ANYTHING BEFORE IT! */
    *(.rodata_custom_desc .rodata_custom_desc.*) /* Should be the second. Custom app version info. DO NOT PUT ANYTHING BEFORE IT! */

    INCLUDE /home/philip/eclipse-workspace/nodemcu/components/base_nodemcu/ld/nodemcu_rodata.ld

    mapping[flash_rodata]

    *(.irom1.text) /* catch stray ICACHE_RODATA_ATTR */
    *(.gnu.linkonce.r.*)
    *(.rodata1)
    __XT_EXCEPTION_TABLE_ = ABSOLUTE(.);

The actual contents of the included file are:


    /* Don't change the alignment here without also updating the _Static_assert
     * over in linit.c! */
    . = ALIGN(8);
    /* Link-time arrays containing the defs for the included modules */
    lua_libs_map = ABSOLUTE(.);

    KEEP(*(.lua_libs)) 
    /* Null-terminate the array, 24 bytes */
    LONG(0) LONG(0) LONG(0) LONG(0) LONG(0) LONG(0) 

    /* Don't change the alignment here without also updating the _Static_assert
     * over in linit.c! */
    . = ALIGN(8);
    lua_rotables_map = ABSOLUTE(.);
    KEEP(*(.lua_rotable))
    /* Null-terminate the array, 24 bytes */
    LONG(0) LONG(0) LONG(0) LONG(0) LONG(0) LONG(0)

    /* Don't change the alignment here without also updating the _Static_assert
     * over in nodemcu_esp_event.h! */
    . = ALIGN(4);
    esp_event_cb_table = ABSOLUTE(.);
    KEEP(*(.lua_esp_event_cb_table))
    LONG(0) LONG(0) /* Null-terminate the array, 8 bytes */

    /* ----- End NodeMCU link-time arrays ------- */

@igrr
Copy link

igrr commented Feb 22, 2021

I see, thanks. While replacing the IDF linker script with a user-provided one is unlikely to be supported (it is error-prone, especially if IDF gets updated), we can extend the linker script generation mechanism to support such link-time arrays generation.

@pjsg
Copy link
Member

pjsg commented Feb 22, 2021

@igrr That would be great. Alternatively, allow an include statement as part of the linker mappings.

@jmattsson
Copy link
Member

I had a look at old issues I'd raised and it seems I've only imagined raising a feature request for more flexible linker script generation :( Apologies for unfairly cast shade.

@projectgus
Copy link

projectgus commented Jul 13, 2021

Sorry for the very long delay in responding. A few months back we added linker script generation support for marking output symbls as KEEP and also for marking them with _start and _end symbols, which should allow for this kind of support.

https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-guides/linker-script-generation.html#types

Can search for "KEEP" and "SURROUND" here (the actual description is a few pages down from the anchor.)

However this doesn't support the "8 terminating words" method, just symbol markers for start & end.

Sample linker.lf linker fragment file:

[mapping:test]
archive: libfreertos.a
entries:
     * (default);
          text -> flash_text KEEP SORT(name, alignment) ALIGN(4, pre, post) SURROUND(test_symbol)

Generates:

.flash.text {
    ...

    . = ALIGN(4)
    _test_symbol_start = ABSOLUTE(.)
    KEEP(*libfreertos.a (SORT(.text) ...))
    . = ALIGN(4)
    _test_symbol_end = ABSOLUTE(.)
    ...
}

If this isn't enough for nodemcu requirements[*] then please consider opening a feature request at https://github.com/espressif/issues/new with the requirements you have for the ESP-IDF linker script, and we'll try to find a way to support it in the build system.

[*] For example: if you need to pull in sections from multiple ESP-IDF component libraries into one array, or if you need the "8 null words" termination instead of looping from ..._start to ..._end

@jmattsson
Copy link
Member

Hi @projectgus! I'm actually working on this transition right now (chatter over on #3397). The KEEP/ALIGN/SURROUND should be fine for our needs, and I'm hoping the "embedding binary data" support will see me through the inclusion of pre-generated Lua bytecode. That's on my to-do list for tomorrow, together with working out how to put together the build of the build-host run compiler without having HOST_CC and co at my disposal.

@igrr
Copy link

igrr commented Jul 13, 2021

@jmattsson I saw this the other day, assuming you can accept Python as a dependency it might do the trick: https://pypi.org/project/ziglang/ (plus https://andrewkelley.me/post/zig-cc-powerful-drop-in-replacement-gcc-clang.html)

@jmattsson
Copy link
Member

While I'm not exactly a Python fan (I am after all working on a Lua project, not embedded Python 😂), I have to admit I'm seriously intrigued by that, @igrr! I might not get that in right off the bat, but it absolutely merits a deeper look. If it lives up to its promise that could be a very nice way to handle cross compilation.

So you're clearly about to add xtensa l106/lx6 support to it as well, right? 😉 One compiler front-end to rule them all...

@igrr
Copy link

igrr commented Jul 13, 2021

Just to be clear I meant solving the problem of compiling the host tools when you don't necessarily have HOST_CC; that's how I understood your "how to put together the build of the build-host run compiler" sentence :) Using ziglang-cc to cross-compile for an embedded target is also possible but that's not what I had in mind.

So you're clearly about to add xtensa l106/lx6 support to it as well, right?

Yes, that currently depends on espressif/llvm-project#4 (LLVM backend targeting Xtensa).

@jmattsson
Copy link
Member

Yes, that's the correct interpretation, and yes, that is what I took you to mean :)
I was then, mostly in jest, thinking that it'd be neat to have it embedded target support in zig as well, not being aware that there was already an ongoing LLVM effort for Xtensa. So my joke fell flat, but in the best possible way!

@jmattsson
Copy link
Member

I don't suppose there's someone skilled in cmake who can tell me why my dependencies aren't working as expected?

In https://github.com/DiUS/nodemcu-firmware/blob/dev-esp32-idf4/components/embedded_lfs/CMakeLists.txt I want to ensure that ${BUILD_DIR}/lua.flash.store.reserved gets rebuilt after ${BUILD_DIR}/luac.out has been updated between idf.py build invocations. I've worked around it by having tools/embed_lfs.sh nuke the ${BUILD_DIR}/lua.flash.store.reserved file to force the rebuild, but that's a bit inelegant.

(And before someone tries to use this branch to build and run NodeMCU - please don't; I haven't dealt with the deprecated/removed APIs yet so a few big chunks are still commented out, and I'm not expecting what's built to even start up at this point.)

@jmattsson
Copy link
Member

TIL that you can override a symbol entry entirely with -Wl,-defsym=<target>=<source>. And that, thankfully lets me get our link-time arrays get put together sufficiently to make things work.
Slowly getting there...

@jmattsson
Copy link
Member

For those following along at home, I've spent a couple of days looking at ways to support both the esp32 IDF and the 8266 RTOS IDF in the one branch and as far as I can see that's just not an option for us at this point.

With the 8266 RTOS SDK's tools (kconfig, ldgen) being old in comparison to the esp32 IDF's, I'm hitting no end of incompatibilities. While I have a workaround for the lack of ldgen features, I'm still being bitten by things like missing "rsource" support in the Kconfig files, and if I try to upgrade the IDF (tools dir) alone, then I'm missing dependencies in the RTOS components and hitting linker issues like missing dependencies, and even if I slap in a -Wl,--start-group it dies due to the vector table functions being too far offset from the vector table itself. Obviously this is all technically fixable, but it's looking like a job way too big for me as it would, as far as I can see, effectively be creating a new repo that has only the code from the ESP8266_RTOS_IDF and a completely rewritten component structure together with the lifted idf-v4 from the esp32-idf. Which of course would be a maintenance nightmare if done my anyone other than Espressif themselves.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants