pg_copycat

pg_copycat is a CLI tool designed to reduce/copy Postgresql database:

Data are exported base on Datasets defined in EDN files, honeysql 2 is used..
Pg_copycat can be easily automated and files can be regularly exported to / imported from Google Cloud Storage.
- Both export and import supports two main actions (see actions) - structure export / import and data export / import.
- Each export zip file is time-stamped, so you can easily import the structure and datasets back at any time.
- You must have GCLOUD tools installed and authenticated if you want to use Google Cloud Storage.
Main goal: help you develop locally so you do not have to connect to remote Postgres DB.
- Pg_copycat is meant to solve the problem where you want to develop an ETL or report on local ‘replica’.
- Just run a Postgres DB in your Docker and import what you need…
Non goal: replace existing tools for complete database migration.
- Pg_copycat purpose is to reduce big Postgres DWH for your temporary needs, not to create a perfect copy.

Getting Started

Clone the repository.
$ cd pg_copycat/build
$ sudo chmod 755 pg_copycat
Create a configuration file like this (you can have many for multiple DBs):
- I recommend you to use =.pgpass= so you do not have to enter password manually during export or import.
```
{:dbtype   "postgresql"
:dbname   "postgres"
:port      5435
:host     "localhost"
:username "postgres"}
    
```
$ ./pg_copycat -c config.edn -d ../dataset/example.edn export-all
That will start the export of structure and data.
- If you have a problem other than not connecting to the database, try checking the installed software via pg_copycat doctor.
- Maybe you are missing psql for import and pg_dump for export - these are required!
$ ./pg_copycat -c config.edn -d base import-all
- Will start import of all files exported in previous step to the different database server.

Examples

Export structure and data defined in base.edn dataset to Google Cloud Storage:
- $ pg_copycat -c config.edn -d ../dataset/base.edn -g "gs://${BUCKET_NAME}" export-all
Import last exported structure from Google Cloud Storage:
- $ pg_copycat -c config.edn -g "gs://${BUCKET_NAME}" import-structure
Import two specific datasets and do not change the database structure in any way:
- $ pg_copycat -c config.edn -d base_235975 -d extra_3245356 import-data
You can load last file without knowing its full name, use just part of its filename you know.
You can load or export multiple datasets by specifying -d multiple times.

Actions

pg_copycat offers you three (or four) import and export <<actions>>:

export-*structure*: all information related to table/schema definition, sequences, constrains and settings. pg_copycat does not export owner information!
export-*data*: table plain-text data defined in EDN datasets.
export-*complete*: runs structure and data export - so you can do both in one step.

same actions applies for import!

Usage

Run at the command line for options:

$ ./pg_copycat --help

+------------------------+
|   /\    ______  ___    |
|  /  \  /      \/   \   |
| ( /  |/ __    _\ /  )  |
|  \ /  (/ x)  ( x) / )  |
|   \_  (_  )   \ )  /   |
|     \  /\_/    \)_/    |
|      \/  //|  |\\      |
|      /   v |  | v      |
|     /     \__/    \    |
|    /               \   |
|   |     (^^\/^^)    |  |
|   |      \^  ^/     |  |
|   |       \^^/      |  |
|   |        \/       |  |
|    \_______________/   |
|    pg_copycat   0.0.1  |
|                        |
+------------------------+

Usage: pg_copycat [options] action

Options:
  -c, --config-path CONFIG-PATH}  config.edn  A path to EDN config file containing a database configuration.
  -d, --dataset DATASET           []          Specifies the dataset to be imported or exported. Can be used more than once.
  -g, --gcs GCS                               The name of GCS you will use for import or export.
  -f, --filename FILENAME         nil         Search for a file during import - use a partion or an exact filename.
  -H, --help                                  Print this help information
      --version                               Print version and exit

Actions:
  doctor              Check installed software.
  export-structure    Export DB structure to a timestamped ZIP file.
  export-data         Export dataset(s)' ZIP file(s).
  export-all          Run export-structure and export-data.
  import-structure    Import structure ZIP file.
  import-data         Import dataset(s)' ZIP file(s).
  import-all          Run import-structure and import-data..

Please refer to the manual page for more information.

Dataset

<<Datasets>> are defined in EDN files - each dataset is a vector containing any number of hash-maps.
Using hash-map {:name xxx} will generate select * from xxx , but you can add additional keywords like :where and so on.
- These extra parameters should be compatible with honeysql v2.
You can see a sample dataset in pg_copycat/dataset/example.edn.
Example of hash-maps and produced SQL queries:

{:name :public.foo
 :where  [:= :foo.a "baz"]}

=> "SELECT * FROM public.foo WHERE foo.a = baz"

{:name :public.foo
 :order-by [:foo.baz :desc]
 :limit 50}
 
=> "SELECT * FROM foo ORDER BY foo.baz DESC LIMIT 50 "

Installation

pg_copycat is distributed for linux and macos as a single executable file.

you must install https://github.com/babashka/babashka:
- if you run $ install.sh from this folder it might install babashka for you.
Copy the build/pg_copycat file to:
- $HOME/bin for local installation
- /usr/local for system installation

Development

dependencies are defined in bb.edn
source code is in /src folder
Once you do your changes you might run compile.sh:
- it will produce babashka’s uberscript in /build folder.
- it will pack all dependencies but babashka itself.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
build		build
dataset		dataset
src/pg_copycat		src/pg_copycat
test/pg_copycat		test/pg_copycat
bb.edn		bb.edn
compile.sh		compile.sh
config.edn		config.edn
deps.edn		deps.edn
install.sh		install.sh
readme.org		readme.org
test-runner.clj		test-runner.clj

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

pg_copycat

Getting Started

Examples

Actions

Usage

Dataset

Installation

Development

About

Releases

Packages

Languages

zikajk/pg_copycat

Folders and files

Latest commit

History

Repository files navigation

pg_copycat

Getting Started

Examples

Actions

Usage

Dataset

Installation

Development

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages