forked from JakeWheat/hssqlppp
bigplan
The big plan for sorting out a lot of the code.

The motivation is: how can the code be rewritten so that it isn't
embarrassingly bad (hopefully this is about code quality and not
ego...)

Rough overview:

overhaul the typechecking of identifiers

fix all the typechecking/identifier/catalog stuff so that it uses
names (not planning on respecting case sensitivity or schema
qualifications yet though).

work through and rewrite the test code: it's all a massive, unreadable,
unplanned mess. Want to make the tests much more orthogonal so each
test is more unit-y. (And so that when the syntax is updated, it
doesn't trigger mass rewriting all over the tests.)

get the chaos sql typechecking back to the previous standard (as in
version 0.0.5 or something), clean up the use of h7c, and get the
documentation for this generating properly again

other possible ideas for next release:

typesafe wrapper, use for chaos tests, get chaos actually running also

type checking flow, inferred type ish

switch ag to haskell style and bird lag to match the other code

provide some more hacking examples and support in the code:
  add a new ctor to existing node
  add a new kind of node

include simple instructions to fail gracefully at typechecking, so new
syntax plus parsing and pretty printing can be easily added

review parsing code and try to make it clearer for the confusing bits,
also the lexer

= current work: typechecking fixes:

do syntax fix first - into, split plpgsql statement types from regular
sql?

get rid of inftype in annotation

get rid of almost all the typechecking code completely

adjust the catalog to use names (and handle case properly, maybe
better error checking also?)

start with simple scalar expressions: get types, and type errors
then add fns and fnprototype annotation
and ability to get type of ? via attrs

then start on queryexprs:
  simple trefs, simple select lists
  * expansion
  build up to all trefs
  other query expr parts ...

then dml

then have ddl and plpgsql to tackle

get to stage where chaos can successfully type check
will be a lot more functional than the current system at this point
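
A rough sketch of what the first step might look like: types and type
errors for simple scalar expressions, with a catalog keyed by names.
All of the names here (Type, ScalarExpr, Catalog, typeCheck and so on)
are made up for illustration and are far simpler than the real ast;
lower-casing names before lookup is just one assumed way of handling
case.

```haskell
import qualified Data.Map as M
import Data.Char (toLower)

-- Hypothetical, heavily simplified types; not the real hssqlppp ast.
data Type = TInt | TText | TBool
    deriving (Eq, Ord, Show)

data TypeError = UnknownFunction String [Type]
               | UnknownIdentifier String
    deriving (Eq, Show)

data ScalarExpr = IntLit Integer
                | StringLit String
                | Identifier String
                | App String [ScalarExpr]   -- function or operator call
    deriving (Eq, Show)

-- Catalog keyed by (lower-cased) function name plus argument types.
type Catalog = M.Map (String, [Type]) Type

-- Identifiers in scope, keyed by lower-cased name.
type Env = M.Map String Type

-- Assumption: names are compared case-insensitively.
normalise :: String -> String
normalise = map toLower

sampleCatalog :: Catalog
sampleCatalog = M.fromList
    [(("+", [TInt, TInt]), TInt)
    ,(("=", [TInt, TInt]), TBool)
    ,(("length", [TText]), TInt)]

-- Return the type of an expression, or the first type error found.
typeCheck :: Catalog -> Env -> ScalarExpr -> Either TypeError Type
typeCheck _ _ (IntLit _) = Right TInt
typeCheck _ _ (StringLit _) = Right TText
typeCheck _ env (Identifier i) =
    maybe (Left (UnknownIdentifier i)) Right (M.lookup (normalise i) env)
typeCheck cat env (App f args) = do
    argTys <- mapM (typeCheck cat env) args
    maybe (Left (UnknownFunction f argTys)) Right
          (M.lookup (normalise f, argTys) cat)
```

With an environment of M.fromList [("a", TInt)], the expression
App "+" [Identifier "A", IntLit 1] should come out as Right TInt, and
App "length" [IntLit 1] as Left (UnknownFunction "length" [TInt]).
Exact-match lookup on argument types ignores implicit casts, which a
real version would have to deal with.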
concrete steps:
2) rewrite the parse tests to be readable
3) kill the typechecking, and start again with better annotation
4) work through the typechecking tests getting the new system working
5) look at typechecking chaos

= Details

The big ugly bit that is bothering me at the moment is the
fixup identifiers/typechecking. There are limitations to the current
approach, plus it's a bit weird.

The IDEnv code used for the fixup identifiers pass shows the way
forward. This will be extended to support types and typechecking in
the same style. Then the separate fixup identifiers pass can be
eliminated. The basic concept is that the IDEnv is a cut-down tree
structure (an abstract ast?) which mirrors how the current environment
has been constructed, and this can then be queried to get the type,
qualifier, and star expansion out, and to flag ambiguities, etc.
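
To make the idea a bit more concrete, here is a minimal sketch of an
environment tree that can be queried for the type and qualifier of an
identifier and that flags ambiguities. This is not the real IDEnv:
EnvTree, EnvError and lookupIdentifier are hypothetical names, and
only plain trefs and joins are covered.

```haskell
import Data.Char (toLower)

-- Hypothetical stand-ins, not the real IDEnv from the code.
data Type = TInt | TText
    deriving (Eq, Show)

-- The tree mirrors how the environment was built up.
data EnvTree = EmptyEnv
             | TrefEnv String [(String, Type)]  -- table (or alias) and its columns
             | JoinEnv EnvTree EnvTree          -- two trefs joined together
    deriving (Eq, Show)

data EnvError = UnrecognisedIdentifier String
              | AmbiguousIdentifier String [String]  -- name, candidate qualifiers
    deriving (Eq, Show)

-- Assumption: names are compared case-insensitively.
sameName :: String -> String -> Bool
sameName a b = map toLower a == map toLower b

-- All (qualifier, column, type) matches for an optionally qualified name.
candidates :: Maybe String -> String -> EnvTree -> [(String, String, Type)]
candidates _ _ EmptyEnv = []
candidates q c (TrefEnv t cols)
    | maybe True (sameName t) q = [(t, n, ty) | (n, ty) <- cols, sameName n c]
    | otherwise = []
candidates q c (JoinEnv l r) = candidates q c l ++ candidates q c r

-- Resolve an identifier to its qualifier and type, or say why not.
lookupIdentifier :: Maybe String -> String -> EnvTree
                 -> Either EnvError (String, String, Type)
lookupIdentifier q c env =
    case candidates q c env of
        []  -> Left (UnrecognisedIdentifier c)
        [m] -> Right m
        ms  -> Left (AmbiguousIdentifier c [t | (t, _, _) <- ms])
```

For the environment of 'a join b', where a has columns x and y and b
has column y, looking up an unqualified y should report an ambiguity
between a and b, while b.y resolves fine.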

The only tree rewrite that fixup identifiers does that will stay in
the new system is expanding * in select lists, and this will not be
optional when typechecking. The new typechecker will rewrite the * in
select lists when typechecking the select list.
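
A small sketch of the * expansion on its own, assuming the columns in
scope are already available as a flat list. SelectItem, Scope and
expandStars are illustrative names only, not the real ast or api.

```haskell
-- Hypothetical, simplified select list; not the real ast.
data SelectItem = Star (Maybe String)            -- * or t.*
                | Column (Maybe String) String   -- c or t.c
    deriving (Eq, Show)

-- Columns in scope as (table, column) pairs, in select order.
type Scope = [(String, String)]

-- Replace each star with the columns it stands for; other items are
-- left alone. Assumption: expanded columns come out qualified, so the
-- result stays unambiguous. An unknown qualifier just expands to
-- nothing here, where a real typechecker would report an error.
expandStars :: Scope -> [SelectItem] -> [SelectItem]
expandStars scope = concatMap expand
  where
    expand (Star q) =
        [Column (Just t) c | (t, c) <- scope, maybe True (== t) q]
    expand item = [item]
```

With scope [("a","x"), ("a","y"), ("b","y")], the list [Star Nothing]
expands to a.x, a.y, b.y, and [Star (Just "b"), Column Nothing "x"]
becomes b.y followed by the untouched x.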

The other rewrites will be optional during typechecking:

Adding full aliases to trefs where possible, and to subqueries - so
you get the column names as well as the 'table' name for the subquery.

Adding the qualifiers to identifier references explicitly.

The current code does something wrong to handle * in aggregate calls:
use a better fix which doesn't rewrite the star.

Adding explicit aliases to select items in a select list,
e.g. 'count(*)' is rewritten to 'count(*) as count'.

Most of these are mainly useful for code which wants to
query/transform the ast.
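
As an example of one of the optional rewrites, this is roughly what
adding explicit aliases to select items could look like. The types
are simplified stand-ins and impliedName is a hypothetical helper;
the "?column?" fallback follows what PostgreSQL shows for unnamed
columns. Operators are not modelled here.

```haskell
-- Simplified stand-ins for the real ast types.
data ScalarExpr = Identifier String
                | QIdentifier String String   -- qualifier.name
                | App String [ScalarExpr]     -- function call
                | StarArg                     -- the * in count(*)
                | IntLit Integer
    deriving (Eq, Show)

data SelectItem = SelExp ScalarExpr             -- no alias
                | SelectItem ScalarExpr String  -- expr as alias
    deriving (Eq, Show)

-- The column name an unaliased expression would end up with.
impliedName :: ScalarExpr -> String
impliedName (Identifier n) = n
impliedName (QIdentifier _ n) = n
impliedName (App f _) = f
impliedName _ = "?column?"

-- Make the alias explicit; items which already have one are kept.
addAlias :: SelectItem -> SelectItem
addAlias (SelExp e) = SelectItem e (impliedName e)
addAlias item = item
```

So addAlias (SelExp (App "count" [StarArg])) gives
SelectItem (App "count" [StarArg]) "count", i.e. 'count(*) as count',
and the star inside the aggregate call is left alone.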