Add SQLite demo (#22)
* add sqlite demo
mgramin authored Oct 17, 2023
1 parent e995a7e commit 3d80e90
Showing 9 changed files with 150 additions and 0 deletions.
38 changes: 38 additions & 0 deletions .github/workflows/test_sqlite.yml
@@ -0,0 +1,38 @@
name: test_sqlite

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

defaults:
  run:
    working-directory: ./sqlite

jobs:

  generation:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Build docker-compose
        run: |
          docker-compose pull
          docker-compose build
      - name: Make the database file writable
        run: |
          sudo chown $USER:$USER ./output/output.db
          chmod -R 777 ./output/output.db
      - name: Run the KEEP mode
        run: |
          export CONFIG_FILE=config_keep.tdk.yaml
          docker-compose run tdk
      - name: Run the GENERATION mode
        run: |
          export CONFIG_FILE=config_generation.tdk.yaml
          docker-compose run tdk
2 changes: 2 additions & 0 deletions .gitignore
@@ -5,3 +5,5 @@ data

postgres/logs
mysql/logs

sqlite/output/output.db
35 changes: 35 additions & 0 deletions sqlite/README.md
@@ -0,0 +1,35 @@
# TDK SQLite Demo

[SQL Murder Mystery](https://mystery.knightlab.com) is an interactive online game that lets users learn and practice SQL while solving a murder mystery. With the help of TDK, we can make the investigation in SQL City more interesting and more demanding.

To start the investigation, we should review the crime scene report, which means writing a SELECT query against the `crime_scene_report` table. The table currently holds 1228 records, and TDK can generate additional ones to make the investigation more complex and engaging.
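
As a rough sketch of that first step, the query below looks up the report for the murder in SQL City on 15 January 2018, using the same filter values as `control_query.sql`. It relies on Python's built-in `sqlite3` module and a tiny in-memory table so that it runs on its own. The column types, the helper name `find_reports`, and the second sample row are assumptions made for illustration, not data from the real database.

```python
import sqlite3

# In-memory stand-in for sql-murder-mystery.db (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE crime_scene_report "
    "(date INTEGER, type TEXT, description TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO crime_scene_report VALUES (?, ?, ?, ?)",
    [
        (20180115, "murder",
         "Security footage shows that there were 2 witnesses...", "SQL City"),
        (20180115, "theft", "Made-up sample row", "SQL City"),  # hypothetical
    ],
)
conn.commit()


def find_reports(conn, date, crime_type, city):
    """Return the descriptions of reports matching date, type, and city."""
    rows = conn.execute(
        "SELECT description FROM crime_scene_report"
        " WHERE date = ? AND type = ? AND city = ?",
        (date, crime_type, city),
    ).fetchall()
    return [row[0] for row in rows]


reports = find_reports(conn, 20180115, "murder", "SQL City")
print(reports)
```

Passing the filter values as parameters rather than formatting them into the SQL string keeps the query reusable for other dates, crime types, and cities.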

First, let's create a copy of the SQL Murder Mystery database using the [KEEP](https://docs.synthesized.io/tdk/latest/user_guide/tutorial/masking) mode of TDK, then check that it still contains 1228 crime scenes, 9 unique crime types, 377 unique cities where crimes were reported, and the report for the murder under investigation:

```shell
export CONFIG_FILE=config_keep.tdk.yaml
docker-compose run tdk

usql -q sqlite3://sql-murder-mystery.db -f control_query.sql

description | all_scenes | all_types | all_cities
-------------------------------------------------------+------------+-----------+------------
Security footage shows that there were 2 witnesses... | 1228 | 9 | 377
```
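
The same check can also be scripted. The sketch below is one possible way to compare the three counts between a source database and its copy with Python's built-in `sqlite3` module. `table_stats` is a hypothetical helper, and the in-memory databases with made-up rows merely stand in for `sql-murder-mystery.db` and the copy written by TDK.

```python
import sqlite3


def table_stats(conn):
    """Return (row count, distinct types, distinct cities) for crime_scene_report."""
    return conn.execute(
        "SELECT count(*), count(DISTINCT type), count(DISTINCT city)"
        " FROM crime_scene_report"
    ).fetchone()


# Stand-in source database with made-up rows (not the real game data).
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE crime_scene_report "
    "(date INTEGER, type TEXT, description TEXT, city TEXT)"
)
source.executemany(
    "INSERT INTO crime_scene_report VALUES (?, ?, ?, ?)",
    [
        (20180115, "murder", "sample", "SQL City"),
        (20180115, "theft", "sample", "SQL City"),
        (20180116, "theft", "sample", "Sample Town"),
    ],
)
source.commit()

# Stand-in for the KEEP-mode copy: Connection.backup copies the data unchanged.
copy = sqlite3.connect(":memory:")
source.backup(copy)

source_stats = table_stats(source)
copy_stats = table_stats(copy)
print(source_stats, copy_stats)
```

If the two tuples differ after a real KEEP run, the copy has lost or gained rows and is worth inspecting before moving on.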

Next, we can grow the `crime_scene_report` table using the [GENERATION](https://docs.synthesized.io/tdk/latest/user_guide/tutorial/generation) mode of TDK, adding ten generated records for every existing one. The generated data introduces a new crime type (`cyber crimes`) and new cities. See the corresponding [config file](config_generation.tdk.yaml) for details:

```shell
export CONFIG_FILE=config_generation.tdk.yaml
docker-compose run tdk

usql -q sqlite3://output/output.db -f control_query.sql

description | all_scenes | all_types | all_cities
-------------------------------------------------------+------------+-----------+------------
Security footage shows that there were 2 witnesses... | 13508 | 10 | 11935
```
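
To illustrate what the `categorical_generator` settings in the config file express, the sketch below draws crime types from the same ten equally weighted categories. It only mimics the idea with Python's `random.choices`; how TDK samples internally is not shown in this commit, and `sample_types` is a hypothetical name.

```python
import random

# Categories and weights as listed in config_generation.tdk.yaml.
CRIME_TYPE_WEIGHTS = {
    "arson": 0.1,
    "assault": 0.1,
    "blackmail": 0.1,
    "bribery": 0.1,
    "fraud": 0.1,
    "murder": 0.1,
    "robbery": 0.1,
    "smuggling": 0.1,
    "theft": 0.1,
    "cyber crimes": 0.1,
}


def sample_types(n, seed=None):
    """Draw n crime types according to the configured weights."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    names = list(CRIME_TYPE_WEIGHTS)
    weights = list(CRIME_TYPE_WEIGHTS.values())
    return rng.choices(names, weights=weights, k=n)


sample = sample_types(12280, seed=42)
print(len(sample), len(set(sample)))
```

Because `cyber crimes` appears in the weights but not among the 9 original types, any sufficiently large sample is expected to contain it, which matches the jump from 9 to 10 types in the output above.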

As a result, the number of crime scenes grows to 13508: the original 1228 plus 12280 generated ones. There are now 10 distinct crime types and 11935 distinct cities, and the `crime_scene_report` table still contains the report for the crime scene under investigation. The extra data adds to the intrigue of our investigation.
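
The totals follow from simple arithmetic, assuming (as the numbers above suggest) that `target_ratio: 10` generates ten new rows per source row and that `table_truncation_mode: IGNORE` leaves the rows already in the output database in place. A minimal check, with `expected_rows` as a hypothetical helper:

```python
def expected_rows(existing_rows, source_rows, target_ratio):
    """Return (total rows after generation, newly generated rows).

    Assumes generated rows are appended to the rows already present
    in the output table, which is inferred from the README's numbers.
    """
    generated = source_rows * target_ratio
    return existing_rows + generated, generated


total, generated = expected_rows(
    existing_rows=1228,  # rows copied earlier in KEEP mode
    source_rows=1228,    # rows in the input crime_scene_report table
    target_ratio=10,     # from config_generation.tdk.yaml
)
print(total, generated)  # 13508 12280
```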

Now that the database holds more data, we can try to find out who committed the murder! One small hint: "The Butler Didn't Do It" :smirk:
32 changes: 32 additions & 0 deletions sqlite/config_generation.tdk.yaml
@@ -0,0 +1,32 @@
default_config:
  mode: GENERATION
  target_ratio: 0

tables:
  - table_name_with_schema: "crime_scene_report"
    target_ratio: 10
    transformations:
      - columns: ["type"]
        params:
          type: "categorical_generator"
          categories:
            type: STRING
            value_source: PROVIDED
            values:
              "arson": 0.1
              "assault": 0.1
              "blackmail": 0.1
              "bribery": 0.1
              "fraud": 0.1
              "murder": 0.1
              "robbery": 0.1
              "smuggling": 0.1
              "theft": 0.1
              "cyber crimes": 0.1
      - columns: [ "city" ]
        params:
          type: address_generator
          column_templates: [ "${city}" ]

table_truncation_mode: IGNORE
schema_creation_mode: DO_NOT_CREATE
5 changes: 5 additions & 0 deletions sqlite/config_keep.tdk.yaml
@@ -0,0 +1,5 @@
default_config:
  mode: KEEP

table_truncation_mode: TRUNCATE
schema_creation_mode: CREATE_IF_NOT_EXISTS
8 changes: 8 additions & 0 deletions sqlite/control_query.sql
@@ -0,0 +1,8 @@
select substr(description, 0, 52) || '..' as description
     , (select count(1) from crime_scene_report) as all_scenes
     , (select count(distinct type) from crime_scene_report) as all_types
     , (select count(distinct city) from crime_scene_report) as all_cities
  from crime_scene_report
 where city = 'SQL City'
   and type = 'murder'
   and date = '20180115';
29 changes: 29 additions & 0 deletions sqlite/docker-compose.yaml
@@ -0,0 +1,29 @@
version: '3.5'

services:

  tdk:
    container_name: tdk
    image: synthesizedio/synthesized-tdk-cli:latest
    environment:
      SYNTHESIZED_INPUT_URL: jdbc:sqlite:/app/input.db
      SYNTHESIZED_OUTPUT_URL: jdbc:sqlite:/app/output.db
      SYNTHESIZED_USERCONFIG_FILE: /app/config.yaml
      TDK_WORKINGDIRECTORY_PATH: /app/data
      TDK_WORKINGDIRECTORY_ENABLED: "true"
      JAVA_TOOL_OPTIONS: >
        -Dlogging.level.io.synthesized.testingsuite.executor.lite.LiteTransformer=INFO
        -Dlogging.level.io.synthesized.testingsuite=WARN
        -Dlogging.level.com.zaxxer.hikari=WARN
        -Dlogging.level.org.reflections=WARN
        -Dlogging.level.org.jooq=WARN
        -XX:+UseContainerSupport
        -XX:MaxRAMPercentage=80.0
        -Dspring.main.banner-mode=CONSOLE
        -Dspring.banner.location=file:/app/banner.txt
    volumes:
      - ./${CONFIG_FILE}:/app/config.yaml
      - ../banner.txt:/app/banner.txt
      - ../logback-lite-executor.xml:/app/logback-lite-executor.xml
      - ./sql-murder-mystery.db:/app/input.db
      - ./output/output.db:/app/output.db
1 change: 1 addition & 0 deletions sqlite/output/output.db
@@ -0,0 +1 @@

Binary file added sqlite/sql-murder-mystery.db
Binary file not shown.
