Special strings in Lua #6
On my version of Mono (3.12.1), However, thinking about this, I'm not sure that I can reasonably fix this issue without breaking existing code. Consider the case where you have non-ASCII characters in a CLR string and you pass that to Lua. It would be encoded by the CLR using the system's default codepage. Lua would receive a string that differs in ordinal values from the CLR string. When extracting that string from Lua, it would be converted back. So we have two paths to consider:
I'm not sure that path 2 can be fixed without breaking path 1. This may require an extension of the Eluant API. Give me some time to think about this. |
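For readers following along, the two round trips can be illustrated with a small Python sketch. This is not Eluant code: the function names are hypothetical, and UTF-8 merely stands in for "the system's default codepage", which is an assumption. It only shows why text survives a codepage round trip while arbitrary bytes may not.

```python
# Hypothetical illustration; UTF-8 stands in for the system codepage (assumption).
CODEPAGE = "utf-8"

def clr_to_lua_to_clr(text: str) -> str:
    """Text is encoded on the way into Lua and decoded on the way back."""
    lua_bytes = text.encode(CODEPAGE)
    return lua_bytes.decode(CODEPAGE)

def lua_to_clr_to_lua(raw: bytes) -> bytes:
    """Raw Lua bytes are decoded into text, then re-encoded for Lua."""
    # Bytes that are invalid in the codepage become U+FFFD, so information is lost.
    text = raw.decode(CODEPAGE, errors="replace")
    return text.encode(CODEPAGE)

text = "héllo"
raw = b"\xff\x00\x80"  # arbitrary binary data, not valid UTF-8

print(clr_to_lua_to_clr(text) == text)  # text round-trips intact
print(lua_to_clr_to_lua(raw) == raw)    # binary data does not
```

With a different codepage the damaged bytes would differ, but the asymmetry between the two directions stays the same.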
Thank you for looking into this problem and for your help. Yes, you described the problem very well. It would be nice if you could find a global solution, but a "small" solution would be good too. A ".AsByteArray()" and ".FromByteArray()" would be enough ;). |
Yes, but that approach has its own disadvantages. After a Lua string value is converted to a The easiest way to solve this would be to convert the string from Lua twice, once through the system's codepage and once using just the ordinals, and store both representations on the A more involved way would be to introduce a new |
If I understand the problem correctly, then the problem is the marshaling when transferring the string from Lua to C# and vice versa. Isn't it possible to transfer the string as a byte array, so that the marshaling conversion of strings doesn't take place? |
Not without breaking the CLR -> Lua -> CLR path. CLR strings are made of chars, which are 2-byte elements. If a char has an ordinal value out of the range of a byte (>255) then the string can't be marshaled to Lua exactly as it exists in the CLR; some conversion is required. Then there would be an expectation that the string comes back out of Lua as an equal CLR string, which means that the Lua string would need to be converted back. So yes, what you describe would fix the Lua -> CLR -> Lua path, but it would break the CLR -> Lua -> CLR path. |
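The trade-off of a pure ordinal mapping can be sketched in Python, where the `latin-1` codec maps each byte 0..255 to the code point with the same ordinal. The helper names are hypothetical and this is only an analogy for the CLR behaviour described above, not Eluant's implementation.

```python
def bytes_to_text_by_ordinal(raw: bytes) -> str:
    # latin-1 maps every byte value to the code point with the same ordinal.
    return raw.decode("latin-1")

def text_to_bytes_by_ordinal(text: str) -> bytes:
    # Raises UnicodeEncodeError if any character has an ordinal above 255.
    return text.encode("latin-1")

# Lua -> CLR -> Lua: every possible byte value survives unchanged.
all_bytes = bytes(range(256))
assert text_to_bytes_by_ordinal(bytes_to_text_by_ordinal(all_bytes)) == all_bytes

# CLR -> Lua -> CLR: a character outside the byte range cannot be passed through.
try:
    text_to_bytes_by_ordinal("\u20ac")  # EURO SIGN, ordinal 0x20AC
    fits = True
except UnicodeEncodeError:
    fits = False
print(fits)
```

So the ordinal mapping is exact for binary data but needs some other conversion as soon as the text contains characters beyond ordinal 255.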
But the CLR -> Lua -> CLR path was never correct. You couldn't transfer CLR Unicode strings to Lua as they are, so you have to do a conversion on the fly. Normally you would convert Unicode to UTF-8 and transfer that to a Lua string. On the way back, you would transfer the Lua string to the CLR and convert it from UTF-8 to Unicode. I see no way to transfer strings between the two systems without converting them. |
Well, then that's a bug. 😃 The best way I can think to resolve this issue would be to have an interface that defines a string converter ( |
No, it isn't a bug. The problem is that Lua doesn't really have "strings"; they are plain byte arrays. But the CLR uses Unicode. So how should they be converted without knowing which encoding the byte array in Lua uses? |
That's precisely the problem. There is a mismatch between CLR strings and Lua strings. There needs to be a compromise on how that's handled somewhere. If all Lua strings get extracted into the CLR as byte arrays, a significant amount of convenience is lost -- you have to manually convert any time you want to use a Lua string in the CLR world. Perhaps the best solution would indeed be to make |
That would be good! So we could have the original data, but for all normal cases the byte array is converted automatically to a correct CLR Unicode string. That's fine :) |
This change is going to be fairly invasive, so it may not be ready soon. I effectively need to replace every Lua library API that accepts a string to accept a Note that some Lua APIs ( Basically I need to audit every Lua API that accepts a string and either ensure that it will behave correctly, or replace it with something else. |
Hi, I'm jumping in the thread to second charlenni in thanking you for the great lib.
That sounds extreme, but I understand it might be necessary to implement the change correctly. But what about the solution you described earlier...
What about this, simply as an optional (conditional compilation) feature of the lib? That would be more hacky and less generic, but maybe easier to implement and less invasive for general users. |
Perhaps Mangatome is right. It's a special use case. Normally almost all (99.999%) users of Eluant will transfer "normal" strings to and from Lua. Transferring a string that isn't printable is a very special use case. I have seen it only once, when someone tried to encrypt strings using a special string that contains the given bytes. So conditional compilation would be OK. Those who are aware of the string transfer problem could use it in their code. Everyone else uses the code as is. It works, as said above, in nearly all cases. |
The approach I'm taking now is to add a When marshalling a Lua string to the CLR, I am going to pull the byte-string out of Lua and use the same constructor internally. A new Then, the first time that a The I think this approach will solve all of the conversion issues. In hindsight, performance should be either equal or better, as the string conversion still happens only once, just in a different place -- and it might not even happen at all, if only Memory requirements may be increased a bit but I think this is an acceptable compromise to properly resolve this issue. |
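A rough Python sketch of the general idea (keep the raw bytes as the source of truth and decode to text lazily, at most once) might look like the following. The class name, method names, the UTF-8 default, and the choice to raise on undecodable bytes are all assumptions made for illustration; they are not taken from Eluant.

```python
from typing import Optional

class ByteBackedString:
    """Hypothetical wrapper: raw bytes are authoritative, text is derived lazily."""

    def __init__(self, raw: bytes, encoding: str = "utf-8") -> None:
        self._raw = bytes(raw)
        self._encoding = encoding          # assumed encoding, not Eluant's
        self._text: Optional[str] = None   # cached after the first decode

    @classmethod
    def from_text(cls, text: str, encoding: str = "utf-8") -> "ByteBackedString":
        # Encode once when text enters, and remember the original text.
        obj = cls(text.encode(encoding), encoding)
        obj._text = text
        return obj

    def to_bytes(self) -> bytes:
        # Always exact: no conversion happens on this path.
        return self._raw

    def to_text(self) -> str:
        # Strict decode on first use; raises UnicodeDecodeError for binary data.
        if self._text is None:
            self._text = self._raw.decode(self._encoding)
        return self._text

binary = ByteBackedString(b"\xff\xfe\x00")
print(binary.to_bytes())                           # bytes come back unchanged
print(ByteBackedString.from_text("héllo").to_text())
```

In this sketch, callers that only ever ask for the bytes never pay for a decode, and binary data is never silently altered; asking for text from undecodable bytes fails loudly instead.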
That sounds great. It's not too difficult and there are no changes to the code for the others. And yes, the exception should be OK, because normally you expect that the string is printable. Thank you for your help. |
I have pushed a fix for this issue to the lua-string-marshal branch. There are new tests to verify the new behavior. Please test and let me know if you have any issues. @charlenni I used a modified version of your testcase and it's working. (Your test can be rewritten as simply |
@charlenni Have you been able to test the fix yet? |
Sorry for the late answer. I had no time to test it over the last two weeks. Yesterday I changed my code the way you did it. And yes, it works. Saving and loading again now works. Thank you very much. @ALL: Be careful. If you use string.dump() to save a function from Lua, you could run into the same problem with strings. |
Thanks. I will get the branch merged soon. |
See also OpenRA#3 which seems to address a very similar problem. |
Hi,
after two years of testing, I have to say that Eluant is very, very stable. Good job, and thank you for sharing it with the community 👍
Now I have the problem that strings in Lua and strings in C# are different, so there can be problems when converting them from Lua to C#. It would be helpful if we could retrieve strings as byte arrays.
I encountered the problem that strings which are byte arrays in Lua could not be transferred to C#, saved, loaded again, and transferred back to Lua. The transfer back to Lua doesn't work. The string is created in Lua:
This is a valid string in Lua, which I want to transfer to C# (to save/load it later). But if I now try to bring this string back to Lua, I get problems. The string isn't the same in Lua as before. Perhaps it has something to do with the problem in this thread: http://lists.ximian.com/pipermail/mono-devel-list/2006-March/017493.html.
So it would be nice to get/set a string in Lua as a byte array (ToArray, FromArray) instead of as a string, so that no conversions take place.
Best regards,
Dirk