Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL library code #18

Closed
ForbesLindesay opened this issue May 27, 2019 · 7 comments
Closed

SQL library code #18

ForbesLindesay opened this issue May 27, 2019 · 7 comments

Comments

@ForbesLindesay
Copy link
Owner

Hi @calebmer - I can't keep track of a discussion in a large commit, so if you want to discuss it, let's talk here.

Regarding the source of the code, the initial version is loosely based on the pg-sql implementation. I'd be happy to provide some sort of attribution if you'd like? I put this implementation into this code base for a few reasons though:

  • I wanted to use TypeScript for it natively, to match the rest of the code base
  • I wanted to also support generating slightly different output for different SQL dialects (not all of them support the explicit $1 style placeholder syntax)
  • The various db libraries here are very tightly integrated with that package, so I need to be able to manage version updates carefully.
  • I felt it was important that sql.raw had a value that was more obviously dangerous.

Regarding the symbols in pg-sql2, this library uses instanceof, which is equivalently secure - you can't create an instance of SQLQuery via JSON.parse.

Let me know if you have any other concerns/if you want specific attribution.

@calebmer
Copy link

I don’t have any concerns and I don’t care about attribution. It brought me joy to randomly stumble upon some code I wrote a while ago in a new context 😀

Using template strings to prevent SQL injection while also providing a good DX is a great idea and I’m happy to see you using it! If you popularize the idea, I’ll be super excited.

I hope my comment didn’t sound salty, my intention was to raise my hand and give you a high five.

@calebmer
Copy link

calebmer commented May 27, 2019

While we’re here, though, since I use pg-sql a lot (well, mostly my unpublished rewrite) I might as well mention some of my stretch goals for a library like this:

  • A Babel plugin which performs query minification at build-time.
  • A Babel plugin which inserts source code location as a comment in the query. This can make queries easier to debug when you look at the database statement logs.
  • Some kind of Babel/TypeScript plugin that introspects my Postgres schema to generate accurate types for the query. I have a typed database (Postgres) and a typed language (TypeScript) so it makes me sad that I have an untyped bridge in-between. I don’t want an ORM, I want the flexibility of writing SQL.

I also only care about Postgres which is a different goal then what you have.

@ForbesLindesay
Copy link
Owner Author

✋ high five to you too 😄

Glad to hear you're also happy, I was just slightly concerned you might have been pissed about me just copy-pasting bits of your code. People differ on what they're happy with in that area.

The babel plugin for minification is interesting, and definitely not on my radar.

I'm definitely interested in what we can do to improve error messages/improve the debugging experience. Most of

const q = query.compile(
process.env.NODE_ENV !== 'production' ? {minify: false} : undefined,
);
try {
return await this.connection.query(q);
} catch (ex) {
if (isSQLError(ex) && ex.position) {
const position = parseInt(ex.position, 10);
const match =
/syntax error at or near \"([^\"\n]+)\"/.exec(ex.message) ||
/relation \"([^\"\n]+)\" does not exist/.exec(ex.message);
let column = 0;
let line = 1;
for (let i = 0; i < position; i++) {
if (q.text[i] === '\n') {
line++;
column = 0;
} else {
column++;
}
}
const start = {line, column};
let end: undefined | {line: number; column: number};
if (match) {
end = {line, column: column + match[1].length};
}
ex.message += `\n\n${codeFrameColumns(q.text, {start, end})}\n`;
}
throw ex;
}
is about improving the error message display. We could probably do even better if a babel plugin added line numbers and file names though.

I've been thinking a lot about trying to generate types from SQL queries. pg-schema contains some of the initial work I've done on making it easy to introspect the postgres schema. The challenges with generating the types from queries are:

  1. queries can be very dynamic, constructed out of lots of different bits of code. It's not that uncommon to see sql`SELECT ${sql.join(fields, ', ')} FROM ${sql.ident(tableName)};`. There's no way we're going to be able to extract proper types from that.
  2. even if you only consider the straightforward static queries (with placeholders for values), you still have to parse the query and understand it at a pretty deep level.

My current thinking is to generate types for each table, and help people use things like Pick<...> to get types that match their queries. I'm also building a super primative "ORM" that lets you call methods for .get, .list, .create, .update and .delete on each table, but people would fall back to SQL for anything more complex. Ideally I'd still prefer to write SQL and have the types just happen by magic, that's what I do with GraphQL, but GraphQL is a much simpler language.

@calebmer
Copy link

For Postgres introspection in https://yarn.pm/postgraphile here’s the ~250 line query I wrote for introspection:

https://github.com/graphile/postgraphile/blob/7c68e8164822dabca5c33076a73b6f8fdc9e19a3/resources/introspection-query.sql#L1-L5

Along with a companion class for easily accessing the catalog in JS:

https://github.com/graphile/postgraphile/blob/7c68e8164822dabca5c33076a73b6f8fdc9e19a3/src/postgres/introspection/PgCatalog.ts#L13

(I’m linking to v3 since that’s when I stopped maintaining PostGraphile, cc @benjie who’s been doing an incredible job ever since! 😀)

I started writing a typed query builder for Postgres with the built-in ability to add privacy constraints https://gist.github.com/calebmer/99127fee6eb645587d8efee69f0b09db

I ended up ditching that for a pg-sql with Row-Level Security solution since building a performant fully typed query builder is a lot of work for the app I was working on.

I think you’re right that building a properly typed pg-sql is too difficult given we don’t control the full language compiler. A typed query builder is totally doable, though! Then it’s a matter of faithfully replicating all the features in the Postgres query language.

Diesel for Rust is really my inspiration for this. The dream is that a sufficiently powerful compiler can remove the abstraction levels of a query builder and leave you with an artifact that’s really close to SQL. Rust is probably a lot closer in that respect.

@benjie
Copy link

benjie commented May 29, 2019

I'm not anywhere near as much of a type ninja as @calebmer; but feel free to ask me any questions I might be able to help with. I've been considering building a query builder on top of pg-sql2 for a while (primarily to give PostGraphile users a more comfortable way to write SQL expressions than using pg-sql2 directly), and type safety is one of the things I'd like to enable. I've not gone very far down this path yet though, due to lack of time.

@ForbesLindesay
Copy link
Owner Author

I think I'll go down the path of being able to generate types for some growing subset of sql statements. I think falling back to un-typed SQL from typed SQL is friendlier than falling back to un-typed SQL from our own DSL style API.

I think a library that understood/validated/added types for basic INSERT, UPDATE, DELETE and simple SELECT ... FROM ... style queries would be hugely useful, even if more complex queries were just typed as unknown. I think a very minimal SQL parser could get something like 60% of the queries in my current code base to be typed. Unfortunately that's still going to be a lot of work to build, so I don't know when I'll get time.

@ForbesLindesay
Copy link
Owner Author

I've got a primitive SQL parser just about working: #20

Next steps:

  1. create a proper AST for it
  2. write something to generate types from the AST + a database schema
  3. use the resulting utility to generate and print out types from sql template literals.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants