A Node.js client library for the distributed query engine "Presto".
var presto = require('presto-client');
var client = new presto.Client({user: 'myname', catalog: 'hive', schema: 'default'});
client.execute('show schemas', function(error, data, columns){
  console.log({databases: data});
});
For queries with long processing time and large output:
var presto = require('presto-client');
var client = new presto.Client({user: 'myname'});
client.execute({
  query:   'SELECT count(*) as cnt FROM tblname WHERE ...',
  catalog: 'hive',
  schema:  'default',
  state:   function(error, query_id, stats){ console.log({message:"status changed", id:query_id, stats:stats}); },
  columns: function(error, data){ console.log({resultColumns: data}); },
  data:    function(error, data, columns, stats){ console.log(data); },
  success: function(error, stats){},
  error:   function(error){}
});
npm install -g presto-client
Or add presto-client to your own package.json, and run npm install.
Instantiate a client object and set its default configuration.
- opts [object]
  - host [string]
    - presto coordinator hostname or address (default: localhost)
  - port [integer]
    - presto coordinator port (default: 8080)
  - user [string]
    - username of query (default: process user name)
  - catalog [string]
    - default catalog name
  - schema [string]
    - default schema name
  - checkInterval [integer]
    - interval in milliseconds of each RPC to check query status (default: 800ms)
  - jsonParser [object]
    - custom json parser if required (default: JSON)
return value: client instance object
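As a minimal sketch of the options above, the following builds an options object that spells out the documented defaults and overrides some of them. `buildClientOptions` and the host name `presto.example.com` are hypothetical and not part of this library; the resulting object is what would be passed to `new presto.Client()`.

```javascript
// Documented defaults from the list above. `user` is omitted on purpose,
// because the library falls back to the process user name.
var DEFAULTS = { host: 'localhost', port: 8080, checkInterval: 800, jsonParser: JSON };

// Hypothetical helper (not part of presto-client): copy the defaults,
// then apply the caller's overrides on top of them.
function buildClientOptions(overrides) {
  var opts = {};
  Object.keys(DEFAULTS).forEach(function (key) { opts[key] = DEFAULTS[key]; });
  Object.keys(overrides || {}).forEach(function (key) { opts[key] = overrides[key]; });
  return opts;
}

var opts = buildClientOptions({
  host: 'presto.example.com', // assumed coordinator address
  user: 'myname',
  catalog: 'hive',
  schema: 'default'
});
// With the package installed:
//   var client = new (require('presto-client').Client)(opts);
```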
If a callback is specified as the 2nd argument, this API is selected.
This API executes queries that return their results immediately, such as show schemas, show tables and others. (Uses the "/v1/execute" HTTP RPC.)
Execute query on Presto cluster, and fetch results.
- arg [Object or string]
  - arg [String]: query string to execute
    - catalog and schema must be specified in new Client() for this argument type
  - arg [Object]
    - query [string]
    - catalog [string]
      - catalog string (default: instance default catalog)
    - schema [string]
      - schema string (default: instance default schema)
    - session [string]
      - set session variables via the X-Presto-Session header; the string should have the form key1=val1,key2=val2
    - timezone [string :optional]
      - set time zone via the X-Presto-Time-Zone header
- callback [function(error, data, columns)]
  - called once when the query has finished
  - data
    - array of arrays, each holding the field values of one row
      [ [ 'field1Value', 'field2Value', 3 ], [ 'field1Value', 'field2Value', 6 ], ... ]
  - columns
    - array of field names and types
      [ { name: 'timestamp', type: 'varchar' }, { name: 'username', type: 'varchar' }, { name: 'cnt', type: 'bigint' } ]
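Since data arrives as arrays of values and columns as a separate array, it is often handy to zip them into one object per row. The sketch below shows one way to do that, plus a way to build the session string in the key1=val1,key2=val2 form. Both `rowsToObjects` and `toSessionString` are hypothetical helpers, not functions of this library.

```javascript
// Hypothetical helper: turn each row array into an object keyed by column name.
function rowsToObjects(data, columns) {
  return data.map(function (row) {
    var record = {};
    columns.forEach(function (column, index) {
      record[column.name] = row[index];
    });
    return record;
  });
}

// Hypothetical helper: build the "key1=val1,key2=val2" string for the session option.
// It does no escaping, so keys and values must not contain "=" or ",".
function toSessionString(variables) {
  return Object.keys(variables).map(function (key) {
    return key + '=' + variables[key];
  }).join(',');
}

// Sample values shaped like the callback arguments described above.
var columns = [ { name: 'username', type: 'varchar' }, { name: 'cnt', type: 'bigint' } ];
var data = [ [ 'tagomoris', 1013 ], [ 'dain', 2056 ] ];

var records = rowsToObjects(data, columns);
var session = toSessionString({ key1: 'val1', key2: 'val2' });
```

Inside a real callback, `rowsToObjects(data, columns)` can be called with the arguments passed to function(error, data, columns).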
This is an API to execute queries that read large amounts of data. (Uses the "/v1/statement" HTTP RPC.)
Execute query on Presto cluster, and fetch results.
Attributes of opts [object] are:
- query [string]
- catalog [string]
- schema [string]
- timezone [string :optional]
- info [boolean :optional]
  - whether to fetch query info (execution statistics) for the success callback (default: false)
- cancel [function() :optional]
  - client stops fetching query results if this callback returns true
- state [function(error, query_id, stats) :optional]
  - called when query stats have changed
    - stats.state: QUEUED, PLANNING, STARTING, RUNNING, FINISHED, CANCELED, or FAILED
  - query_id
    - id string like 20140214_083451_00012_9w6p5
  - stats
    - object which contains running query status
- columns [function(error, data) :optional]
  - called once when columns and their types are found in results
  - data
    - array of field info
      [ { name: "username", type: "varchar" }, { name: "cnt", type: "bigint" } ]
- data [function(error, data, columns, stats) :optional]
  - called per fetch of query results (may be called 2 or more times)
  - data
    - array of arrays, each holding the column values of one row
      [ [ "tagomoris", 1013 ], [ "dain", 2056 ], ... ]
  - columns (optional)
    - same as data of the columns callback
  - stats (optional)
    - runtime statistics object of query
- success [function(error, stats, info) :optional]
  - called once when all results are fetched (default: value of callback)
- error [function(error) :optional]
  - callback for errors of query execution (default: value of callback)
- callback [function(error, stats) :optional]
  - callback for query completion (both success and failure)
  - one of callback or success must be specified
Callbacks order (success query) is: columns -> data (-> data xN) -> success (or callback)
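Because the data callback may fire several times, callers usually accumulate rows until success fires. The following is a minimal sketch of that pattern using only the options listed above. `collectRows` and its `maxRows` limit are hypothetical, and `client` is assumed to be an object returned by `new presto.Client()`.

```javascript
// Hypothetical helper: run a query, gather all fetched rows, and report once.
function collectRows(client, query, maxRows, done) {
  var rows = [];
  var resultColumns = null;

  client.execute({
    query: query,
    // Ask the client to stop fetching once enough rows are buffered.
    // Which callback fires after a cancel is not described above,
    // so this sketch makes no assumption about it.
    cancel: function () { return rows.length >= maxRows; },
    // Called once, before the first data callback.
    columns: function (error, data) { resultColumns = data; },
    // Called once per fetch; append the new rows to the buffer.
    data: function (error, data, columns, stats) { rows = rows.concat(data); },
    // Called once after the last fetch.
    success: function (error, stats) { done(null, { columns: resultColumns, rows: rows }); },
    error: function (error) { done(error); }
  });
}
```

Usage would look like `collectRows(client, 'SELECT ...', 10000, function(error, result){ ... })`. Buffering is only reasonable for bounded results; for very large outputs it is safer to process each chunk inside the data callback.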
Get the current status of a query. (Same as 'Raw' of the Presto Web UI in a browser.)
- query_id [string]
- callback [function(error, data)]
Stop query immediately.
- query_id [string]
- callback [function(error) :optional]
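The status and stop calls above can be combined to stop a query only while it is still active. The sketch below rests on assumptions that this text does not confirm: that the two calls are exposed as `client.query(query_id, callback)` and `client.kill(query_id, callback)`, and that the status object carries a top-level `state` field using the state names listed for the state callback. `isTerminal` and `killIfRunning` are hypothetical names.

```javascript
// States after which a query no longer runs (taken from the stats.state list).
var TERMINAL_STATES = [ 'FINISHED', 'CANCELED', 'FAILED' ];

function isTerminal(state) {
  return TERMINAL_STATES.indexOf(state) !== -1;
}

// Hypothetical helper: look up the query, and stop it unless it has already ended.
// Calls back with (error, killed), where killed is true if a stop request succeeded.
function killIfRunning(client, queryId, callback) {
  // Assumption: the status call is client.query and its result has a `state` field.
  client.query(queryId, function (error, info) {
    if (error) { return callback(error); }
    if (isTerminal(info.state)) { return callback(null, false); }
    // Assumption: the stop call is client.kill.
    client.kill(queryId, function (killError) {
      callback(killError || null, !killError);
    });
  });
}
```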
Get node list of presto cluster and return it.
- opts [object :optional]
  - specify null, undefined or {} (currently)
- callback [function(error,data)]
  - error
  - data
    - array of node objects
JavaScript's standard JSON module cannot handle BIGINT values correctly because of precision problems.
JSON.parse('{"bigint":1139779449103133602}').bigint //=> 1139779449103133600
If your query puts numeric values in its results and precision is important for that query, you can swap the JSON parser with any module that has a parse method.
var JSONbig = require('json-bigint');
JSONbig.parse('{"bigint":1139779449103133602}').bigint.toString() //=> "1139779449103133602"
// set client option
var client = new presto.Client({
// ...
jsonParser: JSONbig,
// ...
});
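To illustrate that jsonParser only needs a parse method, here is a minimal hand-written parser that keeps long integers as strings. `bigintAsStringParser` is hypothetical and much cruder than json-bigint: it rewrites the raw text with a regular expression, so it would also rewrite matching digit runs inside string values, and it turns every bare integer of 16 or more digits into a string even when that integer is still exactly representable.

```javascript
// Hypothetical parser object: anything with a parse(text) method is accepted.
var bigintAsStringParser = {
  parse: function (text) {
    // Wrap bare integers of 16+ digits in quotes before parsing, so they
    // come back as strings instead of rounded numbers. The integer must
    // follow ":", "[" or "," and be followed by ",", "]" or "}".
    var quoted = text.replace(/([:\[,]\s*)(-?\d{16,})(?=\s*[,\]}])/g, '$1"$2"');
    return JSON.parse(quoted);
  }
};

var raw = '{"bigint":1139779449103133602}';
var lossy = JSON.parse(raw).bigint;                 // rounded by the standard parser
var exact = bigintAsStringParser.parse(raw).bigint; // kept as a string

// Short integers are left untouched and still parse as numbers.
var rows = bigintAsStringParser.parse('[["tagomoris",1013],["dain",1139779449103133602]]');
```

Such an object would be passed the same way as JSONbig above, as `jsonParser: bigintAsStringParser`. For production use a tested module is the safer choice.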
- 0.1.3:
  - add X-Presto-Time-Zone if "timezone" specified
- 0.1.2:
  - add X-Presto-Session if "session" specified
- 0.1.1:
  - fix bug not to handle HTTP level errors correctly
- 0.1.0:
  - add option to pass customized json parser to handle BIGINT values
  - add check for required callbacks of query execution
- 0.0.6:
  - add API to get/delete queries
  - add callback state on query execution
- 0.0.5:
  - fix to do error check on query execution
- 0.0.4:
  - send cancel request of canceled query actually
- 0.0.3:
  - simple and immediate query execution support
- 0.0.2: maintenance release
  - add User-Agent header with version
- 0.0.1: initial release
- node: "failed" node list support
- patches welcome!
- tagomoris
- License:
  - MIT (see LICENSE)