diff --git a/.Dockerignore b/.Dockerignore
new file mode 100644
index 0000000..85ecd3f
--- /dev/null
+++ b/.Dockerignore
@@ -0,0 +1,7 @@
+# Ignore all
+*
+# Except `bin`, `resources`, `src`, and `project.clj`
+!bin
+!resources
+!src
+!project.clj
diff --git a/.gitignore b/.gitignore
index 04f438a..6f09c10 100644
--- a/.gitignore
+++ b/.gitignore
@@ -2,3 +2,9 @@
 .lein-failures
 target/
 todo.org
+.mypy_cache
+.pytest_cache
+
+# Config sample is saved as `config.json.template`
+# Real `config.json` files (with actual creds) should always be ignored.
+config.json
diff --git a/Dockerfile b/Dockerfile
new file mode 100644
index 0000000..90951fe
--- /dev/null
+++ b/Dockerfile
@@ -0,0 +1,22 @@
+# Updates pushed via:
+# > docker build -t dataopstk/tapdance:tap-mssql-raw .
+# > docker push dataopstk/tapdance:tap-mssql-raw
+
+FROM clojure:openjdk-8-lein
+
+RUN mkdir -p /home/tap-mssql
+
+WORKDIR /home/tap-mssql
+
+COPY ./bin /home/tap-mssql/bin
+COPY ./resources /home/tap-mssql/resources
+COPY ./src /home/tap-mssql/src
+COPY ./project.clj /home/tap-mssql/
+
+# Installs files on first run:
+RUN cd /home/tap-mssql && \
+    ./bin/tap-mssql
+
+ENV PATH "/home/tap-mssql/bin:${PATH}"
+
+ENTRYPOINT [ "tap-mssql" ]
diff --git a/README.md b/README.md
index d91d6ec..0303394 100644
--- a/README.md
+++ b/README.md
@@ -6,127 +6,73 @@
 
 ## Requirements
 
-This tap is written in Clojure, and as such, requires the JVM. It has been consistently tested to run using `OpenJDK 8`, which can be installed on Ubuntu using these commands.
+This tap is written in [Clojure](https://clojure.org/), and as such, requires Java and [Leiningen](https://leiningen.org/).
 
-```
-apt-get update && apt-get install -y openjdk-8-jdk
-```
+* For full installation instructions, please see the [installation guide](docs/installation.md).
 
-Associated tooling required to use the scripts in this repository follow. (Running the latest versions)
+## Onboarding for Developers and Testers
 
-- [**Leiningen**](https://leiningen.org/)
-- [**Docker (for integration tests)**](https://www.docker.com/)
-- [**MSSQL CLI (to connect to test database)**](https://docs.microsoft.com/en-us/sql/tools/mssql-cli?view=sql-server-2017)
+To get started as a contributor and/or tester, see the [contribution guidelines](docs/CONTRIBUTING.md) and [developer guide](docs/dev_guide.md).
 
-## Quick Start
+## Configuring `tap-mssql`
 
-```
-$ bin/tap-mssql --config config.json --discover > catalog.json
-$ bin/tap-mssql --config config.json --catalog catalog.json --state state.json | target...
-```
+At minimum, the tap requires the following settings: `host`, `port`, `database`, `user`, and `password`.
 
-## Usage
+* For detailed configuration instructions, including a list of supported settings, please check the [configuration guide](docs/config.md).
 
-In the `bin` folder, there are a few utility scripts to simplify interacting with this tap. Many of these scripts rely on some environment variables being set, see "Testing Infrastructure Design" for more information.
+## Running the tap
 
-**bin/tap-mssql** - This script wraps the `lein` command to run the tap from source code. It is analogous to the command installed by setuptools in Python taps.
+The tap supports several different wrappers and execution patterns for different types of environments.
 
-As this is a Clojure tap, it supports a non-standard mode of operation by passing the `--repl` flag. This will start an NREPL server and log the port that it is running on to connect from an IDE for REPL driven development. It is compatible with all other command-line arguments, or can be used on its own. If the tap is invoked in discovery or sync mode along with `--repl`, the process will be kept alive after the usual Singer process is completed.
+### Running in production
 
-```
-Example:
-# Discovery
-$ bin/tap-mssql --config config.json --discover > catalog.json
+When executing in production, the following patterns are generally recommended:
 
-# Sync
-$ bin/tap-mssql --config config.json --catalog catalog.json --state state.json
+1. Executing using `lein` directly (platform agnostic):
 
-# REPL Mode
-$ bin/tap-mssql --config config.json --repl
-```
+    ```bash
+    # Discover metadata catalog:
+    lein run -m tap-mssql.core --config config.json --discover > catalog.json
 
-**bin/test** - This script wraps `lein test` in order to run the Clojure unit and integration tests against a database running locally.
+    # Execute sync to target-csv (for example):
+    lein run -m tap-mssql.core --config config.json --sync | target-csv > state.json
+    ```
 
-```
-Example:
-$ bin/test
-```
+2. Executing using the `tap-mssql` shell script wrapper (Linux/Mac only):
 
-**bin/test-db** - This script uses docker to run a SQL Server container locally that can be used to run the unit tests against. See the usage text for more information.
+    ```bash
+    # Discover metadata catalog:
+    bin/tap-mssql --config config.json --discover > catalog.json
 
-Note: It also depends on the `mssql-cli` tool being installed in order to use the `connect` option.
+    # Execute sync to target-csv (for example):
+    bin/tap-mssql --config config.json --sync | target-csv > state.json
+    ```
 
-```
-Example:
-$ bin/test-db start
-$ bin/test-db connect
-$ bin/test-db stop
-```
+3. Instructions to execute using docker:
+    * ***TK - TODO: Which image name to use in place of `local/tap-mssql` below?***
+    * Build the docker image:
 
-**bin/circleci-local** - This script wraps the [`circleci` CLI tool](https://circleci.com/docs/2.0/local-cli/) to run the Clojure unit and integration tests in the way CircleCI does, on localhost.
+        ```bash
+        docker build -t local/tap-mssql .
+        ```
 
-```
-Example:
-$ bin/circleci-local
-```
+    * Run using docker:
 
-## Testing Infrastructure Design
+        ```bash
+        # Discover metadata catalog:
+        docker run --rm -it -v .:/home/tap-mssql local/tap-mssql --config config.json --discover > catalog.json
 
-Each actor (developer, CI, etc.) needs their own testing infrastructure so
-that development can proceed and be verified independently of each other.
-In order to provide this isolation, we've migrated towards a Docker-based
-solution.
+        # Execute sync to target-csv (for example):
+        docker run --rm -it -v .:/home/tap-mssql local/tap-mssql --config config.json --sync | target-csv > state.json
+        ```
 
-A script, `bin/test-db` has been provided that will honor several
-environment variables and manage the container required by the development
-and testing.
+### Other ways to run and test
 
-The environment variables are:
+For more ways to execute the tap, including dockerized methods, REPL methods, and various other testing configurations, see the [Developers Guide](docs/dev_guide.md).
 
-| name | description |
-| --- | --- |
-| `STITCH_TAP_MSSQL_TEST_DATABASE_USER` | The admin user that should be used to connect to the test database (for docker, this is SA) |
-| `STITCH_TAP_MSSQL_TEST_DATABASE_PASSWORD` | The password for the user (if docker, the SA user will be configured with this password) |
-| `STITCH_TAP_MSSQL_TEST_DATABASE_PORT` | The port for hosting the server. (Default 1433)|
+## Troubleshooting
 
-To interact with the container, these commands are available:
-
-`bin/test-db start` - Starts the container under the name `sql1`
-
-`bin/test-db connect` - Uses `mssql-cli` to open a shell to the local MSSQL instance
-
-`bin/test-db stop` - Tears down and removes the container
-
-**Note:** There is no volume binding, so all of the data and state in the
-          running container is entirely ephemeral
-
-## Observed error messages:
-
-```
-# Bad Host Message
-
-The TCP/IP connection to the host charnock.org, port 51552 has failed.
-Error: "connect timed out. Verify the connection properties. Make sure
-that an instance of SQL Server is running on the host and accepting
-TCP/IP connections at the port. Make sure that TCP connections to the
-port are not blocked by a firewall.".
-
-# Unspecified azure server error message
-
-Cannot open server "127.0.0.1" requested by the login. The login
-failed. ClientConnectionId:33b6ae38-254a-483b-ba24-04d69828fe0c
-
-
-# Bad dbname error message
-
-Login failed for user 'foo'.
-ClientConnectionId:4c47c255-a330-4bc9-94bd-039c592a8a31
-
-# Database does not exist
-
-Cannot open database "foo" requested by the login. The login
-failed. ClientConnectionId:f6e2df79-1d72-4df3-8c38-2a9e7a349003
-```
+For help with common errors, please see the [troubleshooting guide](docs/troubleshooting.md).
 
 ---
 
diff --git a/config.json.template b/config.json.template
new file mode 100644
index 0000000..37743c6
--- /dev/null
+++ b/config.json.template
@@ -0,0 +1,8 @@
+{
+  "host": "mytestsqlserver.cqaqbfvfo67k.us-east-1.rds.amazonaws.com",
+  "port": "1433",
+  "database": "sales_db",
+  "user": "automation_user",
+  "password": "t0p-sEcr3t!",
+  "ssl": true
+}
diff --git a/docs/CONTRIBUTING.md b/docs/CONTRIBUTING.md
new file mode 100644
index 0000000..6a28a28
--- /dev/null
+++ b/docs/CONTRIBUTING.md
@@ -0,0 +1,3 @@
+# Contributor Guide
+
+TK - TODO: Any specific guidelines on how to contribute, raise bugs, etc.?
diff --git a/docs/config.md b/docs/config.md
new file mode 100644
index 0000000..3519316
--- /dev/null
+++ b/docs/config.md
@@ -0,0 +1,30 @@
+# MS-SQL Tap Config Settings
+
+_This page documents the various settings supported by `tap-mssql`._
+
+## Available Settings
+
+| setting  | description                               |
+| -------- | ----------------------------------------- |
+| host     | The SQL Server IP address or server name. |
+| port     | The port to use when connecting.          |
+| database | The database name.                        |
+| user     | The user name for connection.             |
+| password | The password for connection.              |
+| ssl      | `true` to use SSL, otherwise `false`.     |
+
+* ***[TK - TODO: How is log-based replication configured?]***
+* ***[TK - TODO: Any other configurable settings?]***
+
+## Sample `config.json` file
+
+```json
+{
+  "host": "mytestsqlserver.cqaqbfvfo67k.us-east-1.rds.amazonaws.com",
+  "port": "1433",
+  "database": "sales_db",
+  "user": "automation_user",
+  "password": "t0p-sEcr3t!",
+  "ssl": true
+}
+```
diff --git a/docs/dev_guide.md b/docs/dev_guide.md
new file mode 100644
index 0000000..6119042
--- /dev/null
+++ b/docs/dev_guide.md
@@ -0,0 +1,93 @@
+# Developers Guide
+
+Associated tooling required to use the scripts in this repository follow. (Running the latest versions)
+
+- [**Leiningen**](https://leiningen.org/)
+- [**Docker (for integration tests)**](https://www.docker.com/)
+- [**MSSQL CLI (to connect to test database)**](https://docs.microsoft.com/en-us/sql/tools/mssql-cli?view=sql-server-2017)
+
+## Quick Start
+
+```bash
+$ bin/tap-mssql --config config.json --discover > catalog.json
+$ bin/tap-mssql --config config.json --catalog catalog.json --state state.json | target...
+```
+
+## Usage
+
+In the `bin` folder, there are a few utility scripts to simplify interacting with this tap. Many of these scripts rely on some environment variables being set, see "Testing Infrastructure Design" for more information.
+
+**bin/tap-mssql** - This script wraps the `lein` command to run the tap from source code. It is analogous to the command installed by setuptools in Python taps.
+
+As this is a Clojure tap, it supports a non-standard mode of operation by passing the `--repl` flag. This will start an NREPL server and log the port that it is running on to connect from an IDE for REPL driven development. It is compatible with all other command-line arguments, or can be used on its own. If the tap is invoked in discovery or sync mode along with `--repl`, the process will be kept alive after the usual Singer process is completed.
+
+```bash
+# Example:
+# Discovery
+$ bin/tap-mssql --config config.json --discover > catalog.json
+
+# Sync
+$ bin/tap-mssql --config config.json --catalog catalog.json --state state.json
+
+# REPL Mode
+$ bin/tap-mssql --config config.json --repl
+```
+
+**bin/test** - This script wraps `lein test` in order to run the Clojure unit and integration tests against a database running locally.
+
+```bash
+# Example:
+$ bin/test
+```
+
+**bin/test-db** - This script uses docker to run a SQL Server container locally that can be used to run the unit tests against. See the usage text for more information.
+
+Note: It also depends on the `mssql-cli` tool being installed in order to use the `connect` option.
+
+```bash
+# Example:
+$ bin/test-db start
+$ bin/test-db connect
+$ bin/test-db stop
+```
+
+**bin/circleci-local** - This script wraps the [`circleci` CLI tool](https://circleci.com/docs/2.0/local-cli/) to run the Clojure unit and integration tests in the way CircleCI does, on localhost.
+
+```bash
+# Example:
+$ bin/circleci-local
+```
+
+## Testing Infrastructure Design
+
+Each actor (developer, CI, etc.) needs their own testing infrastructure so
+that development can proceed and be verified independently of each other.
+In order to provide this isolation, we've migrated towards a Docker-based
+solution.
+
+A script, `bin/test-db` has been provided that will honor several
+environment variables and manage the container required by the development
+and testing.
+
+The environment variables are:
+
+| name | description |
+| --- | --- |
+| `STITCH_TAP_MSSQL_TEST_DATABASE_USER` | The admin user that should be used to connect to the test database (for docker, this is SA) |
+| `STITCH_TAP_MSSQL_TEST_DATABASE_PASSWORD` | The password for the user (if docker, the SA user will be configured with this password) |
+| `STITCH_TAP_MSSQL_TEST_DATABASE_PORT` | The port for hosting the server. (Default 1433)|
+
+To interact with the container, these commands are available:
+
+`bin/test-db start` - Starts the container under the name `sql1`
+
+`bin/test-db connect` - Uses `mssql-cli` to open a shell to the local MSSQL instance
+
+`bin/test-db stop` - Tears down and removes the container
+
+**Note:** There is no volume binding, so all of the data and state in the
+          running container is entirely ephemeral
+
+## Troubleshooting
+
+For help with common errors, please see the [troubleshooting guide](troubleshooting.md).
diff --git a/docs/installation.md b/docs/installation.md
new file mode 100644
index 0000000..5db07da
--- /dev/null
+++ b/docs/installation.md
@@ -0,0 +1,122 @@
+# Installation Guide
+
+## Minimal Install (Run Only)
+
+At minimum, the tap needs the following components working in order to run locally:
+
+1. Java 8 ([JRE or JDK](https://stackoverflow.com/a/1906455/4298208) both okay [TK - TODO: Confirm this works with JRE])
+2. [Leiningen](https://leiningen.org/) ([Clojure](https://clojure.org/) execution framework and installer)
+
+### Windows
+
+_These instructions require the [Chocolatey](https://chocolatey.org) package manager to automate the install process._
+
+1. Install Java:
+
+    ```cmd
+    # Install the runtime only:
+    choco install javaruntime
+
+    # OR install the full JDK:
+    choco install jdk8
+    ```
+
+2. Install Leiningen
+
+    ```cmd
+    choco install lein
+    ```
+
+3. Install git (if not already installed)
+
+    ```cmd
+    choco install -y git.install --params "/GitOnlyOnPath /SChannel /NoAutoCrlf /WindowsTerminal"
+    ```
+
+4. Clone this repo
+
+    ```cmd
+    # Optionally, make a new directory:
+    mkdir c:\Files\Source
+    cd c:\Files\Source
+
+    # Download the tap:
+    git clone https://github.com/singer-io/tap-mssql.git
+    ```
+
+5. Test `tap-mssql` using `lein`:
+
+    ```cmd
+    # Change to the tap-mssql root directory:
+    cd tap-mssql
+
+    # Test using lein:
+    lein run -m tap-mssql.core --config config.json --discover
+
+    # NOTE: If working properly, at this point you should receive an error that config.json is not found.
+    ```
+
+* If you've gotten this far, you have successfully installed tap-mssql. You are ready to start [running the tap](../README.md#running-the-tap).
+
+### Mac
+
+_These instructions require the [Homebrew](https://brew.sh) package manager to automate the install process._
+
+1. Install Java:
+
+    ```bash
+    # Install the runtime only:
+    brew install javaruntime
+
+    # OR install the full JDK:
+    brew install jdk8
+    ```
+
+2. Install Leiningen
+
+    ```bash
+    brew install leiningen
+    ```
+
+3. Install git (if not already installed)
+
+    ```bash
+    brew install git
+    ```
+
+4. Clone this repo
+
+    ```bash
+    # Optionally, make a new directory:
+    mkdir -p ~/Source
+    cd ~/Source
+
+    # Download the tap:
+    git clone https://github.com/singer-io/tap-mssql.git
+    ```
+
+5. Test `tap-mssql` using `lein`:
+
+    ```bash
+    # Change to the tap-mssql root directory:
+    cd tap-mssql
+
+    # Test using lein:
+    lein run -m tap-mssql.core --config config.json --discover
+
+    # NOTE: If working properly, at this point you should receive an error that config.json is not found.
+    ```
+
+* If you've gotten this far, you have successfully installed tap-mssql. You are ready to start [running the tap](../README.md#running-the-tap).
+
+### Linux (Ubuntu)
+
+This tap has been consistently tested to run using `OpenJDK 8`, which can be installed on Ubuntu using these commands.
+
+```bash
+apt-get update && apt-get install -y openjdk-8-jdk
+```
+
+## Dev Environment Setup
+
+- TK - TODO: What additional installation or config needed for dev users?
diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
new file mode 100644
index 0000000..7f86fc5
--- /dev/null
+++ b/docs/troubleshooting.md
@@ -0,0 +1,30 @@
+# Troubleshooting Guide
+
+_Check below for solutions to common error messages._
+
+## Observed error messages
+
+```
+# Bad Host Message
+
+The TCP/IP connection to the host charnock.org, port 51552 has failed.
+Error: "connect timed out. Verify the connection properties. Make sure
+that an instance of SQL Server is running on the host and accepting
+TCP/IP connections at the port. Make sure that TCP connections to the
+port are not blocked by a firewall.".
+
+# Unspecified azure server error message
+
+Cannot open server "127.0.0.1" requested by the login. The login
+failed. ClientConnectionId:33b6ae38-254a-483b-ba24-04d69828fe0c
+
+# Bad dbname error message
+
+Login failed for user 'foo'.
+ClientConnectionId:4c47c255-a330-4bc9-94bd-039c592a8a31
+
+# Database does not exist
+
+Cannot open database "foo" requested by the login. The login
+failed. ClientConnectionId:f6e2df79-1d72-4df3-8c38-2a9e7a349003
+```
diff --git a/docs/usage.md b/docs/usage.md
new file mode 100644
index 0000000..e69de29
diff --git a/project.clj b/project.clj
index 7fe6912..ba28cb4 100644
--- a/project.clj
+++ b/project.clj
@@ -28,4 +28,4 @@
   :plugins [[lein-pprint "1.2.0"]]
   :main tap-mssql.core
   :profiles {:uberjar {:aot [tap-mssql.core]}
-             :system {:java-cmd "/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java"}})
+             :system {:java-cmd "java"}})