-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathBENEFITS-LUA
808 lines (634 loc) · 33.3 KB
/
BENEFITS-LUA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
BENEFITS-LUA
============
What do I get when I require "fiveq"?
----------------------------------------
When you require "fiveq" in Lua 5.1, the following functions are modified:
pairs and ipairs:
will honor the __pairs and __ipairs metamethods
At one point in development of Lua 5.2, ipairs was deprecated; also
at one point, ipairs always ran until #tbl, which honors __len and
might be after the first nil anyway. Both of these changes were
retracted by the final release of Lua 5.2, and are ignored here.
os.execute and io.popen:
os.execute and the close method of the object returned by io.popen will
return a triple, as in Lua 5.2. The first value is true if the close was
successful, else nil. The second value is "exit" if the process terminated
normally, else "signal". The third value is the exit code or signal that
terminated the process.
Additionally, os.execute() with no arguments returns true if /bin/sh is
available, else false.
Examples:
$ lua-5.1 -e 'os.exit(3)'; echo $?
3
$ lua-5.1 -e 'print(os.execute([[lua-5.1 -e "os.exit(3)"]]))'
768
$ lua-5.2 -e 'print(os.execute([[lua-5.1 -e "os.exit(3)"]]))'
nil exit 3
$ perl -e 'print(system(qq(perl -e "exit(3)")),"\n")'
768
$ python -c 'import os; print(os.system("python -c \
\"import os; os._exit(3)\""))'
768
> f = io.popen "ls"
> list = f:read "*a" -- read all the output
> print (#list)
143
> = f:close()
true exit 0
io.open:
will sanitize its mode string against "[rwa]%+?b?"
io.read and file:read:
will accept "*L" argument to preserve newlines
io.lines and file:lines:
will pass their arguments through to read (defaults to "*l")
file:write:
will return file
load:
will accept a string argument, like `load` in 5.2; ignores any
third `mode` argument, but will honor any fourth `env` argument, using
setfenv
xpcall(f, handler, args...)
accepts args for f, like `pcall` and `xpcall` in Lua 5.2
string.rep(s, nrep, [sep])
accepts an optional separator argument
math.log:
accepts an optional `base` argument, uses efficient algorithm for base 2
coroutine.running:
returns a second boolean result indicating whether the calling thread
is the main thread. If it is, the first result is nil in Lua 5.1 but
the main thread object in Lua 5.2.
The following functions are added:
rawlen:
returns length of table or string, ignoring any __len metamethods;
in unpatched Lua 5.1, __len is always ignored anyway for these, so
this function gives the same result as #tbl or #str
table.pack:
has same behavior as Lua 5.2's
table.unpack:
alias for global unpack
package.searchers:
alias for package.loaders
package.searchpath(base, path, [".", "/"])
searches supplied path for a readable file in same way require does,
replacing any "." in base with "/"
debug.getuservalue:
returns nil if argument isn't a full userdatum, else nil or a table
debug.setuservalue:
raises error if arguments aren't a full userdatum and either nil or a
table
Additionally, a bit32 library is provided, with the same behavior as Lua 5.2's.
Arguments are numerically converted from lua_Numbers to unsigned int32s.
bit32.bnot(0) is 0xffffffff which is > 0.
Functions included: band, bor, bxor, bnot, btest,
lshift, rshift, arshift, lrotate, rrotate, replace, extract
btest: is band of all arguments nonzero?
shifts and rotates also honor negative displacements (lshift->rshift)
extract(base, bitstart, [bitwidth]) -- bits numbered 31..0, width counts <<
replace(base, new, bitstart, [bitwidth]) -- where new is masked from the
right to needed width
Finally, package.loaded.fiveq is set to true.
The following features were first documented in Lua 5.2, but are already present
in the basic Lua 5.1 implementation:
%f[set] "frontier" patterns:
%f[%w]word%f[%W] will also match at start/end of string
select(index, ...):
honors negative index; always returns from index until end
string.sub(s, start, stop) and string.byte(s, start, stop):
after translating negative start/stop, clip them to range 1..#s
io.lines and file:lines:
raise errors instead of returning nil,msg
collectgarbage:
when called with no arguments, defaults to "collect"
package.config
For features that aren't backported from Lua 5.2, see the accompanying file
LIMITATIONS.
When you require "fiveq" in Lua 5.2, no functions are modified. The following
functions are added:
loadstring:
alias for load
(this is already exposed if you compiled Lua with LUA_COMPAT_LOADSTRING)
unpack:
alias for table.unpack
(this is already exposed if you compiled Lua with LUA_COMPAT_UNPACK)
table.maxn
(this is already exposed if you compiled Lua with LUA_COMPAT_MAXN)
math.log10
(this is already exposed if you compiled Lua with LUA_COMPAT_LOG10)
package.loaders:
alias for package.searchers
(this is already exposed if you compiled Lua with LUA_COMPAT_LOADERS)
module and package.seeall:
(these are already exposed if you compiled Lua with LUA_COMPAT_MODULE,
though our version of module is more reliable and replaces the
native one)
newproxy(arg) and debug.newproxy(arg)
returns a new userdatum
if arg is false: with no metatable
if arg is an existing userdatum created by newproxy: use its metatable
if arg is (literally) true: with a new empty metatable
if arg is a function: with a new metatable using arg as __gc
debug.getfenv and debug.setfenv:
only available for userdata
(see also the global getfenv provided under fiveqplus, below)
Additionally, package.loaded.fiveq is set to true.
What do I get when I instead require "fiveqplus"?
---------------------------------------------------
You get all the modifications and additions from fiveq, with the following
changes:
package.loaded.fiveqplus is set to true; and
package.loaded.fiveq is set to "plus"
an additional function math.trunc is provided
Lua 5.2's math.log is replaced by a version that uses an efficient
algorithm for base 2 (this is already used for fiveq in Lua 5.1)
two additional functions are added to the string library:
string.starts(str, prefix, ...)
string.ends(str, suffix, ...)
returns the first of the prefixes (or suffixes) with which str begins
(or ends), or false if none of them match. The prefixes (or suffixes)
are interpreted literally; regex is not honored.
The same behavior can be achieved in pure Lua, but these routines
are efficiently coded in C.
a third additional function is added to the string library:
string.gsubplain(str, target, replacement, [howmany])
this works just like gsub, except that target is interpreted
literally; regex is not honored. Returns a copy of string with
up to howmany replacements made, plus a count of the actual
number of replacements.
For large strings, and in cases where regex is not needed, this function
is substantially faster than gsub.
Adapted from http://lua-users.org/wiki/StringReplace by Sam Lie.
Lua 5.1's newproxy and debug.newproxy are replaced by a version with the
same interface as is used for fiveq in Lua 5.2
an additional global function `singles` is provided that works somewhat
like pairs and ipairs, but returns iteration sequences with only
a single column: that is, only a single value at each step, like the
sequences generated by io.lines.
It's used as follows:
singles(obj, [factory])
will first check obj for a __singles metamethod, which should return a
function and optionally two further values, just as __pairs and
__ipairs do. The function should generate a single-columned
iteration sequence when called with those values. (Nothing is done
here to enforce only a single column, though.)
If obj has no __singles metamethod, what will instead be returned is a
specially-wrapped version of the function and values returned by
factory(obj). The wrapper clips whatever iteration sequence those
generate to a single column.
If factory is omitted, it defaults to pairs.
For example, the program:
for k, v1, v2 in singles {one=1, two=2, three=3} do
print(k, v1, v2)
end
might produce the output:
two nil nil
one nil nil
three nil nil
With explicit for-loops, the same effect can be achieved by just
omitting or ignoring variables for all but the first member in the
sequence. However, there are other situations where one wants to
pass a sequence-generating function and values to higher-order
functions, and there it can be useful to clip iteration sequences
to a single column.
Additionally, some data structures are more naturally iterated over
using single-column sequences. For example, I prefer to write my queues
and sets with __singles rather than __ipairs metamethods. (__ipairs
implies a kind of stability with respect to the index, and the
possibility for random access, that the iterations I'm exposing do
not supply.)
io.lines and file:lines() can be thought of as a specific application
of singles(...). (But we don't in fact add an __singles metamethod to
the metatable for files.)
an additional function debug.getmetafield is provided that works like this:
debug.getmetafield(obj, "field")
if obj is a table or userdata with a metatable, and the metatable has
key "field", this will return the value associated with that key.
Compare: bool luaL_getmetafield(L, idx, "field").
A special whitelist of fields are handled even if obj's metatable was
made opaque by giving it a __metatable metamethod. Currently the
whitelist is:
__len
__eq
__index
__tostring
__copy
__reversed
The point is that one may want the opaque metatables, for
better encapsulation, but may also want to write generic functions in
Lua that should honor these metamethods. (As I do in my personal
Lua libraries.)
in Lua 5.2, an additional global function getfenv is provided, that
provides some of the functionality of the global function of the
same name in Lua 5.1. When called with argument 0, it returns the
global table (even if that's not available through any local binding).
When called with an argument n > 0, it returns the environment of the
function at stack level n. The function calling getfenv is at stack
level 1.
Note that in 5.1 but not 5.2, tail calls increment the "level".
unpack and table.unpack look for the number of elements at key "n", and if
that's not present, fall back to querying the object's length in a way
that honors __len metamethods
(For consistency, I wish they had used key "#" for table.pack, instead of
"n"---even though it's harder to write tbl["#"] than to write tbl.n. If
your preferences are like mine, you can effect this change under fiveqplus
by compiling with -DLUA_FIVEQ_ALTPACK.)
error is modified so as to accept more than two arguments:
error(fmt, level, args ...)
Its first argument is used together with the third and subsequent
arguments to construct an error message using string.format. Its second
argument continues to specify the stack level to which to attribute the
error, with level 1 being the function that calls error. Use 0 to
suppress the addition of any error position.
assert is modified so as to accept more than two arguments:
assert(test, fmt, args ...)
If its `test` argument is true-like, all arguments are returned; else
its second argument is used together with the third and subsequent
arguments to construct an error message using string.format.
an additional global function `check` is provided; this is somewhat akin
to assert:
check(test, arg#, fmt, args ...)
One difference is that check expects a second argument `arg#` that
identifies the position of the checked object in an argument list. If
check's `test` argument is true-like, then that argument (and it alone)
is returned, else the third argument is used as a format string
together with all subsequent arguments. For example:
> function example(arg) local x = check(arg, 1, "foo %s", "bar") end
> example(false)
stdin:1: bad argument #1 to 'example' (foo bar)
stack traceback:
[C]: in function 'check'
stdin:1: in function 'example'
stdin:1: in main chunk
[C]: ?
an additional library `err` is provided with more specific error/checking
functions to complement error, assert, and check; see description of
## err library ## below
an additional library hash is provided; see description of
## hash library ## below
an additional library struct is provided; see description of
## struct library ## below
an enhanced module system is provided, comprised of the functions
require, module, package.seeall, and package.strict. The first
replaces the original require function, and the second and third
replace the corresponding original functions in Lua 5.1.
(They are also present in Lua 5.2 if compiled with LUA_COMPAT_MODULE.)
See discussion of ## module system ## below.
Err library
-----------
For reference, we reproduce the signatures of error, assert, and check in
fiveqplus:
error(fmt, level, args ...)
assert(test, fmt, args ...)
check(test, arg#, fmt, args ...)
The err library adds the following:
err.istype(obj, types ...)
returns the first of the type arguments that obj satisfies, or false
if it satisfies none of them. The type arguments can be any of:
* the standard return values from type(obj): "nil", "table",
"boolean", and so on. Both full and light userdata satisfy
"userdata". For "string" and "number", obj merely has to be
convertible to the type
* "positive" if obj is convertible to a positive integer
* "natural" if obj is convertible to a non-negative integer
* "negative" if obj is convertible to a negative integer
* "string!" if obj is literally a string; whereas "string" merely
requires obj to be convertible to a string
* "number!" if obj is literally a number, not just convertible to
one
* "integer!" if obj is literally an integer-valued number
(plain "integer" is also available, but anything that satisfies
"number" also counts as convertible to an integer, by truncation)
* "positive!" if obj is literally a positive integer
* "natural!" if obj is literally a non-negative integer
* "negative!" if obj is literally a negative integer
* "callable" if obj is a function or has a function for its __call
metamethod
* "indexable" if obj is a table or has a table or function for
its __index metamethod
* "iterable" if obj is a table or has an __singles or __pairs
metamethod
* "iterator" if obj is a function: the idea is that
istype({}, "iterator") returns false but
istype(pairs({}), "iterator") returns true. The same effect
can be achieved with "function", but this documents the
programmer's intentions more clearly.
* any typeobject: obj will count as satisfying this type if either
obj's metatable or its __type metamethod equals the typeobject
err.checktype(obj, arg#, [expected index], types ...)
a hybrid of check and istype: if object satisfies any of the type
arguments, the first one it satisfies is returned. Else an error
is raised complaining that the function expected an object of type
so-and-so, but received one of type such-and-such. You can
explicitly specify the index of the type that was expected (the first
one having index 1); else 1 is assumed. Examples:
> function example(a, b, c) checktype(c, 3, "number", "string") end
> example(1, 2, false)
stdin:1: bad argument #3 to 'example' (number expected, got boolean)
stack traceback:
[C]: in function 'checktype'
stdin:1: in function 'example'
stdin:1: in main chunk
[C]: ?
> function example(a, b, c) checktype(c, 3, 2, "number", "string") end
> example(1, 2, false)
stdin:1: bad argument #3 to 'example' (string expected, got boolean)
stack traceback:
[C]: in function 'checktype'
stdin:1: in function 'example'
stdin:1: in main chunk
[C]: ?
err.checkopt(obj, arg#, type, [default])
asserts that obj satisfies the single type argument, or else is nil;
if the assertion fails, raises an error of the same format as checktype.
if the assertion succeeds and obj is nil, returns default; else
returns obj
err.checkany(obj, arg#)
asserts that obj is of any non-nil type; if the assertion fails,
raises an error of the same format as checktype. if the assertion
succeeds, returns obj
err.arenil(args, ...)
returns true if all its arguments are nil
err.checkrange(num, arg#, [min], max)
err.checkrange(num, arg#, [min], -max)
asserts that num is a number in the inclusive range min...max;
min defaults to 1. If max is negative, its absolute value is used,
and additionally, num is also permitted to be -1, -2, and so on.
A value of -1 for num is converted to max; -2 is converted to max-1;
and so on. Raises an error of the same format as checktype if the
range constraints are violated; else returns the (possibly converted)
num.
err.badtype(obj, arg#, "foo")
manually raises an error of the same format as checktype. Example:
> err.badtype({}, 100, "foo")
bad argument #100 to '?' (foo expected, got table)
stack traceback:
[C]: in function 'badtype'
stdin:1: in main chunk
[C]: ?
err.bad(arg#, "extra message")
manually raises an error of the same format as check(false, arg#, "extra
message"). Example:
> err.bad(100, "foo")
bad argument #100 to '?' (foo)
stack traceback:
[C]: in function 'bad'
stdin:1: in main chunk
[C]: ?
Module system
-------------
We review the native behavior of "require" and "module", then describe our
changes.
* require "name"
Native behavior: Retrieve package.loaded[name]; if it doesn't exist
tries to load it using routines in package.loaders/searchers:
[1] if there is function at package.preload[name], call it
with the argument name
[2] else if there is a name.lua (or name/init.lua) in
LUA_PATH/package.path, load it and call the resulting chunk with
the argument name (and in Lua 5.2, a second argument of
"/path/to/name.lua"). If name is of the form "a.b.c",
/path/to/a/b/c.lua (and /path/to/a/b/c/init.lua) will be used.
[3] else if there is a name.so (or name/init.so) in
LUA_CPATH/package.cpath, invoke its luaopen_name function with the
argument name on the stack (and in Lua 5.2, a second argument of
"/path/to/name.so"). If name is of the form "prefix-a.b.c",
/path/to/prefix-a/b/c.so and luaopen_a_b_c will be used.
[4] else where name has the form "prefix-a.b.c", look for prefix-a.so
(or prefix-a/init.so) in LUA_CPATH/package.cpath, invoke its
luaopen_a_b_c function with the argument name on the stack (and in
Lua 5.2, a second argument of "/path/to/prefix-a.so")
If a function was found and invoked, and its return value is non-nil,
that value is assigned to package.loaded[name].
If the return value is nil: (in Lua 5.1) if package.loaded[name]
wasn't written to at all, we write "true" there; (in Lua 5.2) we also
write "true" *whenever* package.loaded[name] is nil, even if that
was written there explicitly.
We return the final value of package.loaded[name].
Notes:
* given [1], the behavior of `require "name"` when the contents of
file name.lua in the LUA_PATH are "blah blah" is the same as its
behavior when package.preload.name = function(...) blah blah end.
(save that in the former case in Lua 5.2, the chunk will also be passed
the filepath as a second argument).
* require "name" doesn't itself modify _G or _ENV, but the library chunk/
luaopen_name function may do so. Once package.loaded[name] is set,
require "name" will always return that value without executing
any other code or modifying any environment.
* when libraries a and a.b both exist, requiring one of them does not
automatically require the other.
fiveqplus replaces "require" with a new version with the following
enhancements:
require "name" -- native behavior
require("name", tbl) -- merges all keys from the returned table
that don't begin with "_" into tbl
require("name", tbl, "key1", "key2") -- merges only key1 and key2
require("name", "key1", key2") -- merges only key1 and key2, into
current environment instead
* module(name, [...additional decorators...])
Native behavior: Use luaL_pushmodule to find a new or existing table from
package.loaded[name] or _G[name]. Set fields _M, _NAME, and _PACKAGE in
this table unless _NAME already exists. Make this table the environment of
the current chunk.
Roughly equivalent code (though note that for "a.b.c", it will be
_G["a"]["b"]["c"] that's checked, not _G["a.b.c"]):
local _M = package.loaded[name] or _G[name] or {}
if not package.loaded[name] then
if not _G[name] then _G[name] = _M end
package.loaded[name] = _M
end
if not _M._NAME then
_M._NAME = name
_M._PACKAGE = string.gsub(name, "[^.]*$", "")
_M._M = _M
end
setfenv(1, _M) -- in 5.2 this is done by assigning to the first local
-- or upvalue named _ENV
Note that if you set package.loaded[name] = {} before invoking
module(name), _G won't be written to
Lua 5.2 provides a version of module and package.seeall if compiled with
LUA_COMPAT_MODULE, however this version just blindly assumes that
_ENV is upvalue #1, which is not always true. We replace with
the fiveq version, which more explicitly checks which local or upvalue
is named _ENV.
where necessary, fiveqplus queries and writes to *the caller's environment*
(in Lua 5.2, _ENV) instead of _G. You can suppress this behavior in 5.2,
by setting the caller's environment to nil when the module is required:
local n
local require = require
do
local _ENV = nil
n = require "name"
end
assert(n and not _ENV.name and not _G.name)
fiveqplus also does some behind-the-scenes
magic to cooperate with fiveqplus's enhanced package.seeall.
Our code is adapted from the Lua sources and
<http://lua-users.org/wiki/ModuleDefinition>. It addresses most
of the concerns raised at
<http://lua-users.org/wiki/LuaModuleFunctionCritiqued>.
See the accompanying file test/module.lua for more details.
* package.seeall
The default behavior of this decorator in 5.1, and in 5.2 if fiveq or
if compiled with LUA_COMPAT_MODULE, is just to:
setmetatable(_M, {__index = _G})
fiveqplus exposes an enhanced version: inside the module, the caller's
environment (which need not be _G) is wholly visible; but
it will *not* be visible from outside functions indexing the module.
Only variables written to the module's environment will be visible
externally. For example:
function make_alpha()
module("alpha", package.seeall)
x = true -- write to module's environment
print(x) -- here print is visible
end
make_alpha()
print(alpha.x, alpha.print) -- will be true, nil
* package.strict
This is another decorator provided by fiveqplus, that can be used instead
of package.seeall.
Based on etc/strict.lua in the Lua 5.1 sources, and code from
<http://lua-users.org/wiki/ModuleDefinition>.
Only when a variable has been written from the module's toplevel (even
writing nil suffices) can it be read or written to anywhere else in the
module. For example:
function make_alpha()
module("alpha", package.strict)
x = true
y = nil
function beta()
print(x, y) -- will succeed
x, y = false, false -- will also succeed
print(z) -- would fail
z = false -- would also fail
end
end
Hash library
------------
A data structure is "extensional" when its identity should depend only on the
identities of its components, and their position in the structure. So, for
example, Lua tables are not extensional because {10, 20} and {10, 20} are two
different tables. On the other hand, immutable sets and tuples should be
extensional. Any immutable set containing exactly the members 10 and 20 should
be identical to any other.
It's easy enough to define sets in terms of tables (we can use {[10]=true,
[20]=true}), and for many purposes we don't need to bother about their
identity. Where we do need to bother, it may sometimes suffice to give sets an
__eq metamethod that declares two sets equal just when they have the same
elements. (And likewise for immutable tuples.) However, if we ever want to use
sets or any other extensional structure as table keys, this won't suffice. For
even if set1 and set2 count as equal by their __eq metamethod, Lua may still
see them as distinct objects, and tbl[set1] and tbl[set2] may give different
results.
The nicest solution would be if Lua honored a __hash metamethod, and counted
two objects as identical table keys if they generated the same hash and also
were equal according to __eq. But we don't have that.
The next best solution is to generically expose hash functions, and to intern
extensional objects by their hash as they're created. Preferably we intern
them in weak tables, so that they can be gc'd when no other code refers to them
anymore. But while such an object remains interned, then attempts to create
another __eq object should instead re-use the interned one. In this case, then, if
set1 and set2 did count as equal according to __eq, they will be the same Lua
object, and can be relied on to be equivalent table keys, too.
I have implemented this technique for immutable tuples and immutable
sets/multisets. I will provide those implementations separately. The fiveqplus
libraries just provides the general tools with which such systems can be built.
The `hash` library has the following methods:
* hash.tuple(...)
Provides an int hash of its argument sequence. Objects that Lua
counts as distinct (even ones with the same elements, such as
{} and {}) will generally produce different hashes. Permuting
arguments will also generally produce different hashes. And
hash.tuple() ~= hash.tuple(nil) ~= hash.tuple(nil, nil).
* hash.set(seed, value, [count=1])
Provides a hash of seed together with value-as-annotated-by-
count. Here the order in which values are added does not
affect the hash. That is:
hash.set(hash.set(seed0, value1), value2) ==
hash.set(hash.set(seed0, value2), value1)
If count is provided, it should be convertible to an integer
>= 1. The intent of the count annotation is to keep track of
the count of value in a multiset. But do not expect that:
hash.set(hash.set(seed0, value1), value1) ==
hash.set(seed0, value, 2)
In fact, since we achieve commutativity by using an xor-based
hash, the result of hashing value1 twice with the same
annotation will be as if it had never been added to the hash at
all.
* hash.unset(seed, value, [oldcount=1], [newcount=0])
If seed1 is the result of hash.set(seed0, value, oldcount)
then hash.unset(seed1, value, oldcount, newcount) will
be the same as hash.set(seed0, value, newcount)
or, if newcount == 0, to seed0 itself.
* hash.xor(string1, string2)
If the strings are of equal length, returns a third string
which is the result of xor-ing their bytes.
Taken from Roberto Ierusalimschy's md5lib.
* hash.unbox(obj)
If obj is of a gc-able type, return a light userdatum that
points to the same block of memory. For other objects,
return them unchanged.
Light userdata behave differently than gc-able objects with
respect to weak tables. Their use here is to serve as a
unique identifier for the object they're derived from.
A similar technique is used by the upvalueid functions in
Lua 5.2.
* hash.pstring(obj)
If obj is of a gc-able type, or is a light userdatum,
return the string that lua_pushfstring would return using
format "%p". For other objects, return nil.
Struct library
--------------
There are a number of libraries available for manipulating binary data in Lua.
Roberto Ierusalimschy's struct library
<http://www.inf.puc-rio.br/~roberto/struct/> is lightweight but still powerful,
and is the basis we build on here.
I made two major additions and some tweaks. One addition is a patch by Fleming
Madsen (though I didn't include his c0/c<nothing> change). The second addition
is exposing a "size" function to Lua, based on alien's implementation of
struct.c. See <http://lua-users.org/wiki/StructurePacking> for links to both.
The tweaks included cleaning up the behavior of c0, and inserting checks to
make sure we don't overflow the Lua C stack.
Here are some examples of how to use:
str = struct.pack (fmt, v1, v2, ...)
v1, v2, stop = struct.unpack (fmt, s, [start=1])
struct.size (fmt) -- fmt can't contain s or c0
Here are the formatting codes. Initially endianness is set to
native and alignment is set to none (!1).
">" use big endian
"<" use little endian
"!" use machine's native alignment
"!n" set the current alignment to n (a power of 2)
" " ignored
"x" padding zero byte with no corresponding Lua value
"xn" padding n bytes
"Xn" padding n align (default to current or native, whichever is smaller)
"b/B" a signed/unsigned char/byte
"h/H" a signed/unsigned short (native size)
"l/L" a signed/unsigned long (native size)
"i/I" a signed/unsigned int (native size)
"in/In" a signed/unsigned int with n bytes (a power of 2)
"f" a float (native size)
"d" a double (native size)
"s" a zero-terminated string
"cn" a sequence of exactly n chars corresponding to a single Lua string.
An absent n means 1. The string supplied for packing must have at
least n characters; extra characters are ignored.
"c0" this is like "cn", except that the n is given by other means: When
packing, n is the actual length of the supplied string; when
unpacking, n is the value of the previous unpacked value (which
must be a number). In that case, this previous value is not returned.
"(" stop capturing values
")" start capturing values
"=" current offset
More examples:
(1) To match:
struct Str {
char b;
int i[4];
};
in Linux/gcc/x86 (little-endian, max align 4), use "<!4biiii"
(2) To pack and unpack Pascal-style strings:
sp = struct.pack("Bc0", string.len(s), s)
s = struct.unpack("Bc0", sp)
In the latter command, the length (read by the element "B") is not returned.
(3) To pack a string in a fixed-width field with 10 characters padded with blanks:
x = struct.pack("c10", s .. string.rep(" ", 10))