bigplan

The big plan for sorting out a lot of the code.

The motivation is: how can the code be rewritten so that it isn't
embarrassingly bad (hopefully this is about code quality and not
ego...)

Rough overview:

overhaul the typechecking of identifiers

fix all the typechecking/ identifier/ catalog stuff so that it uses
names (not planning on respecting case sensitivity or schema
qualifications yet though).

work through and rewrite the test code, it's all a massive unreadable
unplanned mess. Want to make the tests much more orthogonal so each
test is more unit-y. (And when the syntax is updated, it doesn't
trigger mass rewriting all over the tests.)

get the chaos sql typechecking to the previous standard (back in
version 0.0.5 or something), clean up the use of h7c, and get the
documentation for this generating properly again

other possible ideas for next release:

typesafe wrapper, use for chaos tests, get chaos actually running also

type checking flow, inferred type ish

switch ag to haskell style and bird lag to match the other code
provide some more hacking examples and support in the code:
add a new ctor to existing node
add a new kind of node
include simple instructions to fail gracefully at typechecking, so new
  syntax plus parsing and pretty printing can be easily added
review parsing code and try to make it clearer for the confusing bits,
  also the lexer

= current work: typechecking fixes:

do syntax fix first - into, split plpgsql statement types from regular
sql?

get rid of inftype in annotation
get rid of almost all the typechecking code completely
adjust the catalog to use names (and handle case properly, maybe
  better error checking also?)
start with simple scalar expressions: get types, and type errors
then add fns and fnprototype annotation
and ability to get type of ? via attrs

then start on queryexprs:
simple trefs, simple select lists
* expansion
build up to all trefs

other query expr parts ...
then dml
then have ddl and plpgsql to tackle

get to stage where chaos can successfully type check
will be a lot more functional that current system at this point


concrete steps:
2) rewrite the parse tests to be readable
3) kill the typechecking, and start again with better annotation
4) work through the typechecking tests getting the new system working
5) look at typechecking chaos


Details


The big ugly bit that is bothering me at the moment is the
fixupidentifiers/ typechecking. There are limitations to the current
approach, plus it's a bit weird.

The IDEnv code used for the fixup identifiers shows the way
forward. This will be extended to support types and typechecking in
the same style. Then the separate fixup identifiers pass can be
eliminated. The basic concept is that the IDEnv is a cut down tree
structure (an abstract ast?) which mirrors how the current environment
has been constructed, and then this can be queried to get the type/
qualifier/ star expansion out, and flag ambiguities, etc.

The only tree rewrite that fixup identifiers does that will stay in
the new system is expanding * in select lists and this will not be
optional when typechecking. The new typechecker will rewrite the * in
select lists when typechecking the select list.

The other rewrites will be optional during typechecking:

Adding full aliases to trefs where possible, and to subqueries - so
you get the column names as well as the 'table' name for the subquery.

Adding the qualifiers to identifier references explicitly.

does something wrong to handle * in aggregate calls, use a better fix
which doesn't rewrite the star

adds explicit aliases to select items in a select list
e.g. 'count(*)' is rewritten to 'count(*) as count'

Most of these are mainly useful for code which wants to
query/transform the ast.