Skip to content

API Ideas

kindly edited this page Oct 14, 2020 · 5 revisions

The session

A session is the main object you are working with to connect with the database. It is not global as it should be possible to query multiple databases in a single script.

session = noteql.Session('schema_name')  # using env_var for the db uri
session = noteql.Session('schema_name', db_uri= 'postgresql://sqlachamydatabaseurlstring')
session = noteql.Session('schema_name', search_path= ['other', 'schemas'])
session = noteql.Session('schema_name', sql_file_location='/sql')

session.make_magic() # Allow %(%)sql magics to use same connection parameters as this session.

Basic session usage would be to run a query on the the data.

session.run_sql('select * from table', params=[]) # resulting table will display in notbook
results = session.get_results('select * from table') # results contain JSON style structured data of all data
 
for row in session.async_results('select * from table'):
    do_something_with(row) # get each row in result one at a time

Load local or remote data

session.load_json('example.json', path_to_list='people', table_name='import', field_name='person')
session.load_xml('example.xml', base_tag='people', table_name='import', field_name='person')
session.load_csv('example.csv', base_tag='people', table_name='import', field_name='person')
session.load_dataframe(my_pandas_dataframe, table_name='import';)

### Create intermediate view of the data.  This is done with simple select statements.

```python
session.create_view('view_name', sql='select * from table', params=[]) # resulting view created
session.create_table('table_name', sql='select * from table') # resulting table created
session.create_table('table_name', file='mysqlfile.sql', params=[]) # get sql file form sql directory specified
session.drop_view('view_name') # resulting view created

session.add_table_indexes('table_name', [id, (id, name')]) # list of indexes to be created a tuple is a joint index.
session.add_unique_indexes('table_name', [id, (id, name')]) # list of indexes to be created.

Analyse dependencies of tables/views created with the tool.

session.create_table('t1', sql='select * from table')
session.create_table('t2', sql='select * from t1')
session.create_table('t3', sql='select * from t2')

session.show_dependencies() #  Shows a graph showing all table dependencies  t1 <- t2 <- t3
session.drop_table(t2) #  Error will show saying that 't2 can not be dropped as t3 relies on it.
session.drop_table(t2, force=True) # will drop table t2 anyway

session.show_dependencies() #  Shows graph t1 <- t2 <- t3.   This keeps t2 in graph to show provenance even though the table has dropped.

session.drop_table(t2, cascade =True) # drops table and all dependancies. In this case ignore t2 as it has already been dropped but still drop t2.

session.show_dependencies() #  Shows graph of just t1

session.drop_all() # drop all tables created with this tool in reverse dependency order.