Connect events #2511

Merged
merged 4 commits into from
Nov 1, 2024
233 changes: 229 additions & 4 deletions CHANGELOG.md
@@ -1,10 +1,235 @@
-# Release 1.1.16
+# Release 1.2.0

## What's New

-* [Issue #2468](https://github.com/openziti/ziti/issues/2468) - Controller configuration `edge.api.address` selects the
-enrollment token signer by matching the DNS SAN of a server certificate. If more than one server certificate
-matches then the wrong key may sign the tokens causing enrollments to fail with a signature verification error.
* New Router Metrics
* Changes to identity connect status
* HA Bootstrap Changes
* Connect Events
* SDK Events
* Bug fixes and other HA work

## New Router Metrics

The following new metrics are available for edge routers:

1. edge.connect.failures - meter tracking failed connect attempts from sdks
   This tracks failures due to not having a valid token. Other failures which
   happen earlier in the connection process may not be tracked here.
2. edge.connect.successes - meter tracking successful connect attempts from sdks
3. edge.disconnects - meter tracking disconnects of previously successfully connected sdks
4. edge.connections - gauge tracking count of currently connected sdks

## Identity Connect Status

Ziti tracks whether an identity is currently connected to an edge router.
This is the `hasEdgeRouterConnection` field on Identity.

Identity connection status used to be driven off of heartbeats from the edge router.
This feature doesn't work correctly when running with controller HA.

To address this, while also providing more operational insight, connect events were added
(see below for more details on the events themselves).

The controller can be configured to use status from heartbeats, connect events or both.
If both are used as sources, the identity will show as connected if either source
reports it as connected. This is intended for when you have a mix of routers and they
don't all support connect events yet.

The controller now also aims to be more precise about identity state. There is a new
field on Identity: `edgeRouterConnectionStatus`. This field can have one of three
values:

* offline
* online
* unknown

If the identity is reported as connected to any edge router (ER), it will be marked as `online`.
If the identity has been reported as connected, but the reporting ER is now
offline, the identity may still be connected to the ER. While in this state
it will be marked as `unknown`. After a configurable interval, it will be marked
as `offline`.
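
The rules above can be sketched roughly as follows. This is an illustration only, not the controller's implementation: the `routerReport` type, the `resolveStatus` function and the idea of measuring the timeout from the moment the reporting router went offline are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// routerReport is a hypothetical record of what one edge router last
// reported about an identity. It is not a type from the Ziti code base.
type routerReport struct {
	connected    bool          // router last reported the identity as connected
	routerOnline bool          // router is currently connected to the controller
	offlineFor   time.Duration // how long the router has been offline (assumed clock)
}

// resolveStatus returns "online" if any online router reports the identity
// as connected, "unknown" if only offline routers reported it as connected
// and the timeout has not yet elapsed, and "offline" otherwise.
func resolveStatus(reports []routerReport, unknownTimeout time.Duration) string {
	unknown := false
	for _, r := range reports {
		if !r.connected {
			continue
		}
		if r.routerOnline {
			return "online"
		}
		if r.offlineFor < unknownTimeout {
			unknown = true
		}
	}
	if unknown {
		return "unknown"
	}
	return "offline"
}

// demo evaluates four sample situations with a 5 minute timeout.
func demo() []string {
	timeout := 5 * time.Minute
	return []string{
		resolveStatus([]routerReport{{connected: true, routerOnline: true}}, timeout),
		resolveStatus([]routerReport{{connected: true, offlineFor: 2 * time.Minute}}, timeout),
		resolveStatus([]routerReport{{connected: true, offlineFor: 10 * time.Minute}}, timeout),
		resolveStatus(nil, timeout),
	}
}

func main() {
	fmt.Println(demo()) // prints "[online unknown offline offline]"
}
```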

New controller config options:

```
identityStatusConfig:
  # valid values ['heartbeats', 'connect-events', 'hybrid']
  # defaults to 'hybrid' for now
  source: connect-events

  # determines how often we scan for disconnected routers
  # defaults to 1 minute
  scanInterval: 1m

  # determines how long an identity will stay in unknown status before it's marked as offline
  # defaults to 5m
  unknownTimeout: 5m
```
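
As a rough illustration of how these options and their documented defaults fit together, here is a minimal sketch of validating the three values. The struct and function names are hypothetical and the controller's actual config loading may differ. It assumes the interval strings follow Go's `time.ParseDuration` syntax, which `1m` and `5m` do.

```go
package main

import (
	"fmt"
	"time"
)

// identityStatusConfig mirrors the three documented options. The name is
// illustrative and not taken from the controller source.
type identityStatusConfig struct {
	Source         string
	ScanInterval   time.Duration
	UnknownTimeout time.Duration
}

// parseIdentityStatusConfig applies the documented defaults to empty values
// and rejects unknown sources or malformed durations.
func parseIdentityStatusConfig(source, scanInterval, unknownTimeout string) (identityStatusConfig, error) {
	cfg := identityStatusConfig{
		Source:         "hybrid",
		ScanInterval:   time.Minute,
		UnknownTimeout: 5 * time.Minute,
	}
	switch source {
	case "":
		// keep the default
	case "heartbeats", "connect-events", "hybrid":
		cfg.Source = source
	default:
		return cfg, fmt.Errorf("invalid identity status source %q", source)
	}
	if scanInterval != "" {
		d, err := time.ParseDuration(scanInterval)
		if err != nil {
			return cfg, fmt.Errorf("invalid scanInterval: %w", err)
		}
		cfg.ScanInterval = d
	}
	if unknownTimeout != "" {
		d, err := time.ParseDuration(unknownTimeout)
		if err != nil {
			return cfg, fmt.Errorf("invalid unknownTimeout: %w", err)
		}
		cfg.UnknownTimeout = d
	}
	return cfg, nil
}

func main() {
	cfg, err := parseIdentityStatusConfig("connect-events", "1m", "")
	fmt.Println(cfg.Source, cfg.ScanInterval, cfg.UnknownTimeout, err)
}
```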

## HA Bootstrapping Changes

Previously, bootstrapping the Raft cluster and initializing the controller with a
default administrator were separate operations.
Now, the Raft cluster will be bootstrapped whenever the controller is initialized.

The controller can be initialized as follows:

1. Using `ziti agent controller init`
2. Using `ziti agent controller init-from-db`
3. Specifying a `db:` entry in the config file. This is equivalent to using `ziti agent controller init-from-db`.

Additionally:

1. `minClusterSize` has been removed. The cluster will always be initialized with a size of 1.
2. `bootstrapMembers` has been renamed to `initialMembers`. If `initialMembers` are specified,
   the bootstrapping controller will attempt to add them after bootstrap is complete. If
   they are invalid they will be ignored. If they can't be reached (because they're not running
   yet), the controller will continue to retry until they are reached, or it is restarted.
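
The `initialMembers` behavior described above (ignore invalid entries, keep retrying unreachable ones) could look something like the following sketch. Everything here is hypothetical: the `tryAdd` callback, the `errInvalid` sentinel, the member addresses and the use of a context to stand in for a restart are assumptions for illustration, not the controller's code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errInvalid is a hypothetical sentinel meaning a member entry can never succeed.
var errInvalid = errors.New("invalid member address")

// addInitialMembers tries to add each member. Invalid members are dropped,
// unreachable members are retried after retryDelay until they succeed or
// ctx is cancelled (standing in for a controller restart).
func addInitialMembers(ctx context.Context, members []string, tryAdd func(addr string) error, retryDelay time.Duration) []string {
	var added []string
	pending := append([]string(nil), members...)
	for len(pending) > 0 {
		var retry []string
		for _, addr := range pending {
			err := tryAdd(addr)
			switch {
			case err == nil:
				added = append(added, addr)
			case errors.Is(err, errInvalid):
				// invalid entries are ignored and never retried
			default:
				retry = append(retry, addr)
			}
		}
		pending = retry
		if len(pending) > 0 {
			select {
			case <-ctx.Done():
				return added
			case <-time.After(retryDelay):
			}
		}
	}
	return added
}

// demo simulates one healthy member, one invalid entry and one member that
// only becomes reachable on the third attempt.
func demo() []string {
	attempts := map[string]int{}
	tryAdd := func(addr string) error {
		attempts[addr]++
		switch addr {
		case "bad":
			return errInvalid
		case "tls:ctrl3:6262":
			if attempts[addr] < 3 {
				return errors.New("connection refused")
			}
		}
		return nil
	}
	members := []string{"tls:ctrl2:6262", "bad", "tls:ctrl3:6262"}
	return addInitialMembers(context.Background(), members, tryAdd, time.Millisecond)
}

func main() {
	fmt.Println(demo()) // prints "[tls:ctrl2:6262 tls:ctrl3:6262]"
}
```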


## Connect Events

These are events generated when a successful connection is made to a controller, from any of:

1. Identity, using the REST API
2. Router
3. Controller (peer in an HA cluster)

They are also generated when an SDK connects to a router.

**Controller Configuration**

```yml
events:
  jsonLogger:
    subscriptions:
      - type: connect
    handler:
      type: file
      format: json
      path: /tmp/ziti-events.log
```

**Router Configuration**
```yml
connectEvents:
  # defaults to true.
  # If set to false, minimal information about which identities are connected will still be
  # sent to the controller, so the `edgeRouterConnectionStatus` field can be populated,
  # but connect events will not be generated.
  enabled: true

  # The interval at which connect information will be batched up and sent to the controller.
  # Shorter intervals will improve data resolution on the controller. Longer intervals could
  # be more efficient.
  batchInterval: 3s

  # The router will also periodically send the full state to the controller, to ensure that
  # it's in sync. It will do this automatically if the router gets disconnected from the
  # controller, or if the router is unable to send a connect events message to the controller.
  # This controls how often the full state will be sent under ordinary conditions.
  fullSyncInterval: 5m

  # If enabled is set to true, the router will collect connect events and send them out
  # at the configured batch interval. If there are a huge number of connecting identities
  # or if the router is disconnected from the controller for a time, it may be unable to
  # send events. In order to prevent queued events from exhausting memory, a maximum
  # queue size is configured.
  # Default value 100,000
  maxQueuedEvents: 100000

```
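
To make the interaction between `batchInterval` and `maxQueuedEvents` more concrete, here is a minimal sketch of a bounded event queue. The type and method names are hypothetical, and flagging a full sync when an event is dropped is an assumption for illustration. The changelog only states that the queue size is capped and that a full sync follows a failed send.

```go
package main

import "fmt"

// connectEvent is a hypothetical, simplified event record.
type connectEvent struct {
	identityId string
	connected  bool
}

// eventQueue holds events until the next batch is sent, up to max entries.
type eventQueue struct {
	events       []connectEvent
	max          int
	needFullSync bool // assumed: set when events had to be dropped
}

// push queues an event, or drops it and flags a full sync if the queue is full.
func (q *eventQueue) push(e connectEvent) bool {
	if len(q.events) >= q.max {
		q.needFullSync = true
		return false
	}
	q.events = append(q.events, e)
	return true
}

// drain returns the queued events for sending and empties the queue.
// It would be called once per batch interval.
func (q *eventQueue) drain() []connectEvent {
	batch := q.events
	q.events = nil
	return batch
}

func main() {
	q := &eventQueue{max: 2}
	for _, id := range []string{"a", "b", "c"} {
		accepted := q.push(connectEvent{identityId: id, connected: true})
		fmt.Println(id, "accepted:", accepted)
	}
	batch := q.drain()
	fmt.Println("batch size:", len(batch), "need full sync:", q.needFullSync)
}
```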

**Example Events**

```json
{
  "namespace": "connect",
  "src_type": "identity",
  "src_id": "ji2Rt8KJ4",
  "src_addr": "127.0.0.1:59336",
  "dst_id": "ctrl_client",
  "dst_addr": "localhost:1280/edge/management/v1/edge-routers/2L7NeVuGBU",
  "timestamp": "2024-10-02T12:17:39.501821249-04:00"
}
{
  "namespace": "connect",
  "src_type": "router",
  "src_id": "2L7NeVuGBU",
  "src_addr": "127.0.0.1:42702",
  "dst_id": "ctrl_client",
  "dst_addr": "127.0.0.1:6262",
  "timestamp": "2024-10-02T12:17:40.529865849-04:00"
}
{
  "namespace": "connect",
  "src_type": "peer",
  "src_id": "ctrl2",
  "src_addr": "127.0.0.1:40056",
  "dst_id": "ctrl1",
  "dst_addr": "127.0.0.1:6262",
  "timestamp": "2024-10-02T12:37:04.490859197-04:00"
}
```
```
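
A consumer of the event log could decode these events along the following lines. This is a sketch, not part of Ziti: the `connectEvent` struct and helper functions are hypothetical, with field names taken from the example events above. It relies on `json.Decoder` reading a stream of consecutive JSON objects, which matches how the events appear in the file.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
	"time"
)

// connectEvent is a hypothetical struct matching the example events' fields.
type connectEvent struct {
	Namespace string    `json:"namespace"`
	SrcType   string    `json:"src_type"`
	SrcId     string    `json:"src_id"`
	SrcAddr   string    `json:"src_addr"`
	DstId     string    `json:"dst_id"`
	DstAddr   string    `json:"dst_addr"`
	Timestamp time.Time `json:"timestamp"`
}

// readConnectEvents decodes consecutive JSON objects from r and keeps those
// in the "connect" namespace.
func readConnectEvents(r io.Reader) ([]connectEvent, error) {
	dec := json.NewDecoder(r)
	var events []connectEvent
	for {
		var e connectEvent
		err := dec.Decode(&e)
		if errors.Is(err, io.EOF) {
			return events, nil
		}
		if err != nil {
			return events, err
		}
		if e.Namespace == "connect" {
			events = append(events, e)
		}
	}
}

// countBySrcType tallies connect events per source type.
func countBySrcType(input string) (map[string]int, error) {
	events, err := readConnectEvents(strings.NewReader(input))
	if err != nil {
		return nil, err
	}
	counts := map[string]int{}
	for _, e := range events {
		counts[e.SrcType]++
	}
	return counts, nil
}

func main() {
	const sample = `
{"namespace": "connect", "src_type": "router", "src_id": "2L7NeVuGBU",
 "src_addr": "127.0.0.1:42702", "dst_id": "ctrl_client",
 "dst_addr": "127.0.0.1:6262", "timestamp": "2024-10-02T12:17:40.529865849-04:00"}
{"namespace": "connect", "src_type": "peer", "src_id": "ctrl2",
 "src_addr": "127.0.0.1:40056", "dst_id": "ctrl1",
 "dst_addr": "127.0.0.1:6262", "timestamp": "2024-10-02T12:37:04.490859197-04:00"}`
	counts, err := countBySrcType(sample)
	fmt.Println(counts, err)
}
```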

## SDK Events

Building off of the connect events, there are events generated when an identity/sdk comes online or goes offline.

```yml
events:
  jsonLogger:
    subscriptions:
      - type: sdk
    handler:
      type: file
      format: json
      path: /tmp/ziti-events.log
```

```json
{
  "namespace": "sdk",
  "event_type": "sdk-online",
  "identity_id": "ji2Rt8KJ4",
  "timestamp": "2024-10-02T12:17:39.501821249-04:00"
}

{
  "namespace": "sdk",
  "event_type": "sdk-status-unknown",
  "identity_id": "ji2Rt8KJ4",
  "timestamp": "2024-10-02T12:17:40.501821249-04:00"
}

{
  "namespace": "sdk",
  "event_type": "sdk-offline",
  "identity_id": "ji2Rt8KJ4",
  "timestamp": "2024-10-02T12:17:41.501821249-04:00"
}
```
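
One way a consumer might use these events is to keep a per-identity status map. The sketch below is hypothetical and not part of Ziti. In particular, mapping the three event types onto the `online`, `unknown` and `offline` values of `edgeRouterConnectionStatus` is an assumption inferred from the event names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sdkEvent is a hypothetical struct matching the example events' fields.
type sdkEvent struct {
	Namespace  string `json:"namespace"`
	EventType  string `json:"event_type"`
	IdentityId string `json:"identity_id"`
	Timestamp  string `json:"timestamp"`
}

// statusFromEvent maps an event type to a status value (assumed mapping).
func statusFromEvent(eventType string) (string, bool) {
	switch eventType {
	case "sdk-online":
		return "online", true
	case "sdk-status-unknown":
		return "unknown", true
	case "sdk-offline":
		return "offline", true
	}
	return "", false
}

// applyEvent decodes one raw event and records the identity's latest status.
// Events from other namespaces are ignored.
func applyEvent(statuses map[string]string, raw []byte) error {
	var e sdkEvent
	if err := json.Unmarshal(raw, &e); err != nil {
		return err
	}
	if e.Namespace != "sdk" {
		return nil
	}
	status, ok := statusFromEvent(e.EventType)
	if !ok {
		return fmt.Errorf("unknown sdk event type %q", e.EventType)
	}
	statuses[e.IdentityId] = status
	return nil
}

func main() {
	statuses := map[string]string{}
	events := []string{
		`{"namespace": "sdk", "event_type": "sdk-online", "identity_id": "ji2Rt8KJ4"}`,
		`{"namespace": "sdk", "event_type": "sdk-status-unknown", "identity_id": "ji2Rt8KJ4"}`,
	}
	for _, raw := range events {
		if err := applyEvent(statuses, []byte(raw)); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println(statuses["ji2Rt8KJ4"]) // prints "unknown"
}
```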

## Component Updates and Bug Fixes

* github.com/openziti/channel/v3: [v3.0.5 -> v3.0.7](https://github.com/openziti/channel/compare/v3.0.5...v3.0.7)
* github.com/openziti/edge-api: [v0.26.32 -> v0.26.35](https://github.com/openziti/edge-api/compare/v0.26.32...v0.26.35)
* github.com/openziti/identity: [v1.0.85 -> v1.0.87](https://github.com/openziti/identity/compare/v1.0.85...v1.0.87)

* github.com/openziti/sdk-golang: [v0.23.43 -> v0.23.44](https://github.com/openziti/sdk-golang/compare/v0.23.43...v0.23.44)
* github.com/openziti/transport/v2: [v2.0.146 -> v2.0.148](https://github.com/openziti/transport/compare/v2.0.146...v2.0.148)
* github.com/openziti/ziti: [v1.1.15 -> v1.2.0](https://github.com/openziti/ziti/compare/v1.1.15...v1.2.0)
    * [Issue #2212](https://github.com/openziti/ziti/issues/2212) - Rework distributed control bootstrap mechanism
    * [Issue #1835](https://github.com/openziti/ziti/issues/1835) - Add access log for rest and router connections
    * [Issue #2234](https://github.com/openziti/ziti/issues/2234) - Emit an event when hasEdgeRouterConnection state changes for an Identity
    * [Issue #2491](https://github.com/openziti/ziti/issues/2491) - fix router CSR loading
    * [Issue #2478](https://github.com/openziti/ziti/issues/2478) - JWT signer secondary auth: not enough information to continue
    * [Issue #2482](https://github.com/openziti/ziti/issues/2482) - router run command - improperly binds 127.0.0.1:53/udp when tunnel mode is not tproxy
    * [Issue #2474](https://github.com/openziti/ziti/issues/2474) - Enable Ext JWT Enrollment/Generic Trust Bootstrapping
    * [Issue #2471](https://github.com/openziti/ziti/issues/2471) - Service Access for Legacy SDKs in HA does not work
    * [Issue #2468](https://github.com/openziti/ziti/issues/2468) - enrollment signing cert is not properly identified


# Release 1.1.15

62 changes: 62 additions & 0 deletions common/inspect/connect_inspections.go
@@ -0,0 +1,62 @@
/*
Copyright NetFoundry Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package inspect

const (
	RouterIdentityConnectionStatusesKey = "identity-connection-statuses"
)

type RouterIdentityConnections struct {
	IdentityConnections map[string]*RouterIdentityConnectionDetail `json:"identity_connections"`
	LastFullSync        string                                     `json:"last_full_sync"`
	QueuedEventCount    int64                                      `json:"queued_event_count"`
	MaxQueuedEvents     int64                                      `json:"max_queued_events"`
	NeedFullSync        []string                                   `json:"need_full_sync"`
	BatchInterval       string                                     `json:"batch_interval"`
	FullSyncInterval    string                                     `json:"full_sync_interval"`
}

type RouterIdentityConnectionDetail struct {
	UnreportedCount           uint64                    `json:"unreported_count"`
	UnreportedStateChanged    bool                      `json:"unreported_state_changed"`
	BeingReportedCount        uint64                    `json:"being_reported_count"`
	BeingReportedStateChanged bool                      `json:"being_reported_state_changed"`
	Connections               []*RouterConnectionDetail `json:"connections"`
}

type RouterConnectionDetail struct {
	Id      string `json:"id"`
	Closed  bool   `json:"closed"`
	SrcAddr string `json:"srcAddr"`
	DstAddr string `json:"dstAddr"`
}

type CtrlIdentityConnections struct {
	Connections  map[string]*CtrlIdentityConnectionDetail `json:"connections"`
	ScanInterval string                                   `json:"scanInterval"`
}

type CtrlIdentityConnectionDetail struct {
	ConnectedRouters  map[string]*CtrlRouterConnection `json:"connected_routers"`
	LastReportedState string                           `json:"last_reported_state"`
}

type CtrlRouterConnection struct {
	RouterId           string `json:"router_id"`
	Closed             bool   `json:"closed"`
	TimeSinceLastWrite string `json:"time_since_last_write"`
}
6 changes: 3 additions & 3 deletions common/oidc_tokens.go
@@ -110,7 +110,7 @@ func (r *RefreshClaims) GetIssuer() (string, error) {
}

func (r *RefreshClaims) GetSubject() (string, error) {
-	return r.TokenClaims.Issuer, nil
+	return r.TokenClaims.Subject, nil
}

func (r *RefreshClaims) GetAudience() (jwt.ClaimStrings, error) {
@@ -211,7 +211,7 @@ func (r *AccessClaims) GetIssuer() (string, error) {
}

func (r *AccessClaims) GetSubject() (string, error) {
-	return r.TokenClaims.Issuer, nil
+	return r.TokenClaims.Subject, nil
}

func (r *AccessClaims) GetAudience() (jwt.ClaimStrings, error) {
@@ -260,7 +260,7 @@ func (r *IdTokenClaims) GetIssuer() (string, error) {
}

func (r *IdTokenClaims) GetSubject() (string, error) {
-	return r.TokenClaims.Issuer, nil
+	return r.TokenClaims.Subject, nil
}

func (r *IdTokenClaims) GetAudience() (jwt.ClaimStrings, error) {
2 changes: 1 addition & 1 deletion common/pb/edge_cmd_pb/edge_cmd.pb.go
