Update connect-redshift-postgresql-alloydb.md
mirnawong1 authored Nov 7, 2023
1 parent 5c1fb7c commit d02c71f
Showing 1 changed file with 9 additions and 13 deletions.
@@ -61,38 +61,34 @@ chmod 600 ~/.ssh/authorized_keys

The Bastion server should now be ready for dbt Cloud to use as a tunnel into the Redshift environment.

#### Intermittent Connection Issues
#### Intermittent connection issues

<details>
<summary>Database Error - could not connect to server: Connection timed out</summary>
<div>
<div>When you configure a connection to a database via an SSH tunnel -- typically you have the following components in play:
When you configure a connection to a database via an SSH tunnel -- typically you have the following components in play:
- An Elastic Load Balancer (ELB) or Network Load Balancing (NLB) instance.
- A bastion host (aka jump server) running the `sshd` process
- A Database (ex. Redshift cluster)
dbt Cloud establishes an SSH tunnel by connecting through the ELB/NLB to the `sshd` process which then is responsible for passing traffic to the database.
When dbt initiates a job run, it establishes an SSH tunnel at the beginning of the job run and if at any point the SSH tunnel fails, the job will fail.
- A bastion host (aka jump server) running the <code>sshd</code> process
- A Database (such as Redshift cluster)
dbt Cloud establishes an SSH tunnel by connecting through the ELB/NLB to the <code>sshd</code> process which then is responsible for passing traffic to the database.
When dbt initiates a job run, it establishes an SSH tunnel at the beginning of the job run, and if at any point the SSH tunnel fails, the job will fail.

The most common causes of tunnel failures are:
- The SSH daemon terminates the session due to an idle timeout
- The connection is terminated by ELB or NLB due to an idle timeout

dbt Cloud sets a value for its SSH tunnel called `ServerAliveInterval` and `ServerAliveCountMax` that polls the connection every 30 seconds and the underlying OS in our run "pods" will terminate the connection if the `sshd` process fails to respond after 300s. This will, in many cases, prevent an idle timeout entirely so longer as the customer is not using ELB with a firewall-level idle timeout of less than 30 seconds. However, if the customer is using ELB and is using an Idle Connection Timeout of less than 30s, this will be insufficient to prevent tunnels from being terminated.
dbt Cloud sets a value for its SSH tunnel called `ServerAliveInterval` and `ServerAliveCountMax` that polls the connection every 30 seconds and the underlying OS in our run "pods" will terminate the connection if the `sshd` process fails to respond after 300s. This will, in many cases, prevent an idle timeout entirely so long as the customer is not using ELB with a firewall-level idle timeout of less than 30 seconds. However, if the customer is using ELB and is using an Idle Connection Timeout of less than 30s, this will be insufficient to prevent tunnels from being terminated.
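
The relationship between the keepalive interval and a load balancer idle timeout can be sketched as follows. This is only an illustration of the reasoning above, not dbt Cloud's implementation; the function name `keepalive_prevents_idle_timeout` and the sample timeout values are hypothetical, and treating equal values as unsafe is a conservative assumption.

```python
def keepalive_prevents_idle_timeout(keepalive_interval_s: int, lb_idle_timeout_s: int) -> bool:
    """Return True if a keepalive probe is sent before the idle timeout elapses.

    Assumption: a probe sent exactly at the timeout boundary is treated as too late.
    """
    return keepalive_interval_s < lb_idle_timeout_s


# A 30-second keepalive against two hypothetical load balancer idle timeouts.
print(keepalive_prevents_idle_timeout(30, 60))  # True: probes arrive in time
print(keepalive_prevents_idle_timeout(30, 20))  # False: the tunnel may be dropped first
```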

Some versions of Linux used on bastion hosts use a verison of `sshd` with additional idle timeout settings:
Some versions of Linux used on bastion hosts use a version of `sshd` with additional idle timeout settings:
`ClientAliveCountMax`
This value sets the number of client alive messages which may be sent without `sshd` receiving any messages back from the client. If this threshold is reached while client alive messages are being sent, sshd will disconnect the client, terminating the session. The client alive mechanism is helpful when the client or server needs to know when a connection has become inactive. The default value is 3.
This value sets the number of client alive messages that may be sent without `sshd` receiving any messages back from the client. If this threshold is reached while client alive messages are being sent, `sshd` will disconnect the client, terminating the session. The client-alive mechanism is helpful when the client or server needs to know when a connection has become inactive. The default value is 3.
`ClientAliveInterval`
This value sets a timeout interval in seconds after which if no data has been received from the client, `sshd` will send a message through the encrypted channel to request a response from the client. The default is 0, indicating that these messages will not be sent to the client.

Using default values, tunnels could be terminated prematurely by `sshd`. To solve this problem, the `/etc/ssh/sshd_config` file on the bastion host can be configured with the following values:
`ClientAliveCountMax` 10
`ClientAliveInterval` 30
where `ClientAliveCountMax` should be set to a non-zero value and `ClientAliveInterval` should be a value less than the ELB or NLB idle timeout value. Using the suggested values, unresponsive SSH clients will be disconnected after approximately 300 seconds.
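
The "approximately 300 seconds" figure is the product of the two settings. A minimal sketch of that arithmetic, assuming a simplified reading of `sshd_config` (the helper `client_alive_timeout` is hypothetical, keeps the first value seen for each keyword, and ignores `Match` blocks and `Include` directives):

```python
def client_alive_timeout(sshd_config_text: str) -> int:
    """Approximate seconds before sshd disconnects an unresponsive client.

    Returns 0 when ClientAliveInterval is 0, meaning client-alive messages are disabled.
    """
    # Defaults as described above: interval 0 (disabled), count max 3.
    defaults = {"clientaliveinterval": 0, "clientalivecountmax": 3}
    found = {}
    for raw_line in sshd_config_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key = parts[0].lower()  # keywords are case-insensitive
        if key in defaults and key not in found:
            found[key] = int(parts[1])
    settings = {**defaults, **found}
    return settings["clientaliveinterval"] * settings["clientalivecountmax"]


suggested = """
ClientAliveCountMax 10
ClientAliveInterval 30
"""
print(client_alive_timeout(suggested))  # 300
```

With the default values the function returns 0, reflecting that `sshd` sends no client-alive messages unless `ClientAliveInterval` is set.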
</div>
</div>
</details>
could not connect to server: Connection timed out


## Configuration