fix(zetaclient): distinguish between known and connected peers #3208

Merged: 1 commit, Nov 26, 2024

Conversation

gartnera
Member

@gartnera gartnera commented Nov 23, 2024

"Known peers" should not be exposed as "connected peers". They are two completely separate things which should not be mixed to avoid confusion.

Add a new /knownpeers endpoint that exposes known peers, and expose connected peers at /connectedpeers.

Summary by CodeRabbit

  • New Features

    • Introduced a new HTTP endpoint /knownpeers to retrieve and manage known peers.
    • Enhanced health check functionality to differentiate between known and connected peers.
    • Added methods to manage known peers within the telemetry service.
  • Bug Fixes

    • Improved error handling for JSON marshaling in the known peers response.
  • Documentation

    • Updated documentation to reflect new metrics and functionalities related to known peers.
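
The split described above can be sketched as two routes backed by two separate lists. This is a minimal illustration, not the zetaclient code: `AddrInfo` is a simplified stand-in for libp2p's `peer.AddrInfo`, and `peerTelemetry`, `serveList`, `newMux`, and `get` are hypothetical names introduced here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// AddrInfo is a simplified stand-in for libp2p's peer.AddrInfo (assumption).
type AddrInfo struct {
	ID string
}

// peerTelemetry is a hypothetical, reduced telemetry server that keeps the
// two peer lists in separate fields so they cannot be confused.
type peerTelemetry struct {
	mu        sync.Mutex
	known     []AddrInfo // peers we hold addresses for
	connected []AddrInfo // peers with a live connection right now
}

func (t *peerTelemetry) SetKnownPeers(p []AddrInfo) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.known = p
}

func (t *peerTelemetry) SetConnectedPeers(p []AddrInfo) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.connected = p
}

// serveList builds a handler that writes the list selected by pick as JSON.
func (t *peerTelemetry) serveList(pick func() []AddrInfo) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		t.mu.Lock()
		peers := pick()
		t.mu.Unlock()
		w.Header().Set("Content-Type", "application/json")
		if err := json.NewEncoder(w).Encode(peers); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	}
}

// newMux registers one route per list.
func newMux(t *peerTelemetry) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/knownpeers", t.serveList(func() []AddrInfo { return t.known }))
	mux.HandleFunc("/connectedpeers", t.serveList(func() []AddrInfo { return t.connected }))
	return mux
}

// get performs an in-memory request against mux and returns the trimmed body.
func get(mux *http.ServeMux, path string) string {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return strings.TrimSpace(rec.Body.String())
}

func main() {
	tel := &peerTelemetry{}
	tel.SetKnownPeers([]AddrInfo{{ID: "peerA"}, {ID: "peerB"}})
	tel.SetConnectedPeers([]AddrInfo{{ID: "peerA"}})
	mux := newMux(tel)
	fmt.Println("known:    ", get(mux, "/knownpeers"))
	fmt.Println("connected:", get(mux, "/connectedpeers"))
}
```

With two known peers and one connected peer, the two routes return different lists, which is the distinction this PR is after.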

@gartnera gartnera added the no-changelog Skip changelog CI check label Nov 23, 2024
Contributor

coderabbitai bot commented Nov 23, 2024

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

This pull request modifies TelemetryServer and HealthcheckWorker so that known peers and connected peers are tracked separately. TelemetryServer gains a knownPeers field, methods to set and get it, and an HTTP endpoint that exposes it. HealthcheckWorker now counts known and connected peers separately, with matching changes to its metrics and error handling. The Telemetry interface gains a SetKnownPeers method so the healthcheck can pass the known-peer list to the telemetry server.

Changes

zetaclient/metrics/telemetry.go
  • Added field knownPeers []peer.AddrInfo to TelemetryServer.
  • Added methods SetKnownPeers and GetKnownPeers.
  • Registered new HTTP route /knownpeers with knownPeersHandler.
  • Updated connectedPeersHandler for correct logging.

zetaclient/tss/healthcheck.go
  • Added field NumKnownPeersMetric prometheus.Gauge to HealthcheckProps.
  • Renamed peersCounter to knownPeersCounter and added connectedPeersCounter.
  • Updated background ticker to call both counters.

zetaclient/tss/service.go
  • Added method SetKnownPeers(peers []peer.AddrInfo) to Telemetry interface.

Possibly related PRs

  • fix: replace DHT with private peer discovery #3041: This PR introduces a new method for handling known peers in the TSS server setup, which is directly related to the SetKnownPeers method added in the main PR's TelemetryServer struct, enhancing peer management functionality.
  • fix(zetaclient): infinite discovery address leak #3171: This PR involves updates to dependencies that may indirectly relate to peer management and discovery, but it does not directly modify the same functions or structures as the main PR. However, it is part of the broader context of peer management improvements in the codebase.

Suggested labels

UPGRADE_LIGHT_TESTS

Suggested reviewers

  • fbac
  • skosito
  • ws4charlie
  • lumtis
  • brewmaster012


codecov bot commented Nov 23, 2024

Codecov Report

Attention: Patch coverage is 0% with 37 lines in your changes missing coverage. Please review.

Project coverage is 62.37%. Comparing base (cfcf706) to head (69ca4a8).
Report is 4 commits behind head on develop.

Files with missing lines | Patch % | Lines
zetaclient/metrics/telemetry.go | 0.00% | 21 Missing ⚠️
zetaclient/tss/healthcheck.go | 0.00% | 16 Missing ⚠️
Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #3208      +/-   ##
===========================================
- Coverage    62.43%   62.37%   -0.07%     
===========================================
  Files          428      428              
  Lines        30404    30434      +30     
===========================================
  Hits         18982    18982              
- Misses       10576    10606      +30     
  Partials       846      846              
Files with missing lines | Coverage Δ
zetaclient/tss/service.go | 42.23% <ø> (ø)
zetaclient/tss/healthcheck.go | 0.00% <0.00%> (ø)
zetaclient/metrics/telemetry.go | 0.00% <0.00%> (ø)

@gartnera gartnera force-pushed the fix-connectedpeer-metrics branch from 42c28ca to c7bbd4a Compare November 25, 2024 16:53
@gartnera gartnera force-pushed the fix-connectedpeer-metrics branch from c7bbd4a to 3d164e6 Compare November 25, 2024 17:04
@gartnera gartnera marked this pull request as ready for review November 25, 2024 17:17
Review thread on zetaclient/tss/healthcheck.go (resolved)
Review thread on zetaclient/tss/healthcheck.go (outdated, resolved)
@gartnera gartnera force-pushed the fix-connectedpeer-metrics branch from 3d164e6 to 69ca4a8 Compare November 25, 2024 17:29
Contributor

@coderabbitai coderabbitai bot left a comment

Caution

Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments.

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (4)
zetaclient/metrics/telemetry.go (2)

46-46: Add test coverage for the constructor

The initialization of knownPeers needs test coverage to ensure proper initialization of the TelemetryServer.

Would you like me to help generate unit tests for the constructor?

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 46-46: zetaclient/metrics/telemetry.go#L46
Added line #L46 was not covered by tests


72-82: Consider defensive copying in GetKnownPeers

While the implementation is thread-safe, returning a direct slice reference could lead to potential data races if the caller modifies the slice. Consider returning a copy.

func (t *TelemetryServer) GetKnownPeers() []peer.AddrInfo {
	t.mu.Lock()
	defer t.mu.Unlock()
-	return t.knownPeers
+	result := make([]peer.AddrInfo, len(t.knownPeers))
+	copy(result, t.knownPeers)
+	return result
}
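
The aliasing concern behind this suggestion can be shown with a small standalone sketch. `AddrInfo` is a simplified stand-in for libp2p's `peer.AddrInfo`, and `store`, `sharedPeers`, and `copiedPeers` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// AddrInfo is a simplified stand-in for libp2p's peer.AddrInfo (assumption).
type AddrInfo struct{ ID string }

// store is a hypothetical holder for the known-peer list.
type store struct {
	mu    sync.Mutex
	peers []AddrInfo
}

// sharedPeers returns the internal slice itself, so the caller shares its
// backing array and can modify it after the lock has been released.
func (s *store) sharedPeers() []AddrInfo {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.peers
}

// copiedPeers returns a fresh slice, so element writes by the caller cannot
// reach s.peers.
func (s *store) copiedPeers() []AddrInfo {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make([]AddrInfo, len(s.peers))
	copy(out, s.peers)
	return out
}

func main() {
	s := &store{peers: []AddrInfo{{ID: "peerA"}}}

	s.copiedPeers()[0].ID = "changed"
	fmt.Println("after writing to the copy:", s.peers[0].ID)

	s.sharedPeers()[0].ID = "changed"
	fmt.Println("after writing to the shared slice:", s.peers[0].ID)
}
```

Note that `copy` is shallow. The real `peer.AddrInfo` carries an `Addrs` slice, so a copy made this way would still share each element's address slice with the original.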
🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 72-75: zetaclient/metrics/telemetry.go#L72-L75
Added lines #L72 - L75 were not covered by tests


[warning] 78-81: zetaclient/metrics/telemetry.go#L78-L81
Added lines #L78 - L81 were not covered by tests

zetaclient/tss/healthcheck.go (2)

107-118: Add Unit Tests for connectedPeersCounter Function

The connectedPeersCounter function is crucial for tracking connected peers accurately. However, it currently lacks unit tests, as indicated by the static analysis tools. Implementing tests will enhance code reliability and ensure the function behaves as expected.

Would you like assistance in creating unit tests for this function?

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 107-117: zetaclient/tss/healthcheck.go#L107-L117
Added lines #L107 - L117 were not covered by tests


99-102: Add Unit Tests for knownPeersCounter Function

Similarly, the knownPeersCounter function is not covered by unit tests. To maintain high code quality and confidence in metric tracking, consider adding unit tests for this function.

Would you like assistance in creating unit tests for this function?

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 99-99: zetaclient/tss/healthcheck.go#L99
Added line #L99 was not covered by tests


[warning] 102-102: zetaclient/tss/healthcheck.go#L102
Added line #L102 was not covered by tests

🛑 Comments failed to post (4)
zetaclient/tss/service.go (1)

51-51: 💡 Codebase verification

Missing implementation of SetKnownPeers in interface implementations

Based on the verification results:

  • The Telemetry interface requires SetKnownPeers method
  • Only TelemetryServer in zetaclient/metrics/telemetry.go implements this method
  • Other potential implementations in zetaclient/tss/service.go and zetaclient/tss/healthcheck.go lack this required method

Please implement the SetKnownPeers method in all structs that implement the Telemetry interface to maintain interface compliance.
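
For context, Go already rejects at compile time any assignment of a type that lacks a required method to an interface variable. A common way to make that check explicit is a blank-identifier assertion next to each implementation. The sketch below uses stand-in types: the exact method set of `Telemetry` (in particular the `SetP2PID` signature) is an assumption, and `noopTelemetry` is hypothetical.

```go
package main

import "fmt"

// AddrInfo is a simplified stand-in for libp2p's peer.AddrInfo (assumption).
type AddrInfo struct{ ID string }

// Telemetry approximates the interface discussed here; the SetP2PID
// signature is assumed for illustration.
type Telemetry interface {
	SetP2PID(id string)
	SetConnectedPeers(peers []AddrInfo)
	SetKnownPeers(peers []AddrInfo)
}

// noopTelemetry is a hypothetical implementation that only counts peers.
type noopTelemetry struct{ known, connected int }

func (n *noopTelemetry) SetP2PID(string)                {}
func (n *noopTelemetry) SetConnectedPeers(p []AddrInfo) { n.connected = len(p) }
func (n *noopTelemetry) SetKnownPeers(p []AddrInfo)     { n.known = len(p) }

// Compile-time check: the build fails if *noopTelemetry ever stops
// satisfying Telemetry, for example when a method is added to the interface.
var _ Telemetry = (*noopTelemetry)(nil)

func main() {
	n := &noopTelemetry{}
	var tele Telemetry = n
	tele.SetKnownPeers([]AddrInfo{{ID: "peerA"}, {ID: "peerB"}})
	fmt.Println("known:", n.known, "connected:", n.connected)
}
```

Removing `SetKnownPeers` from `noopTelemetry` would turn the `var _` line into a compile error, so a missing implementation cannot go unnoticed.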

🔗 Analysis chain

LGTM! Verify implementation consistency.

The addition of SetKnownPeers method to the Telemetry interface aligns well with the PR objective of distinguishing between known and connected peers. The method signature is consistent with the existing SetConnectedPeers method.

Let's verify that all implementations of the Telemetry interface have been updated:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all implementations of the Telemetry interface and verify they implement SetKnownPeers

# Find all Go files that might implement Telemetry interface
ast-grep --pattern 'type $_ struct {
  $$$
}

func ($_ $_) SetP2PID($_) {
  $$$
}

func ($_ $_) SetConnectedPeers($_ []peer.AddrInfo) {
  $$$
}'

# Find all implementations of SetKnownPeers
ast-grep --pattern 'func ($_ $_) SetKnownPeers(peers []peer.AddrInfo) {
  $$$
}'

Length of output: 527


Script:

#!/bin/bash
# Let's find all files that might contain Telemetry interface implementations
rg -l "type.*struct.*SetP2PID|SetConnectedPeers" --type go

# Let's also check the interface definition to understand all required methods
rg "type Telemetry interface" -A 10 --type go

Length of output: 744


Script:

#!/bin/bash
# Let's check the implementation of SetKnownPeers in all potential implementation files
rg "func.*SetKnownPeers" -B 2 -A 5 zetaclient/tss/service.go zetaclient/tss/healthcheck.go

# Also check for any mock implementations
rg "type.*Mock.*Telemetry" -A 20 --type go

Length of output: 137

zetaclient/metrics/telemetry.go (2)

301-306: ⚠️ Potential issue

Fix incorrect error message in connectedPeersHandler

The error message incorrectly refers to "known peers" in the connected peers handler.

-		t.logger.Error().Err(err).Msg("Failed to marshal known peers")
+		t.logger.Error().Err(err).Msg("Failed to marshal connected peers")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

		t.logger.Error().Err(err).Msg("Failed to marshal connected peers")
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	fmt.Fprintf(w, "%s", string(data))
}
🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 301-305: zetaclient/metrics/telemetry.go#L301-L305
Added lines #L301 - L305 were not covered by tests


308-313: 🛠️ Refactor suggestion

Set Content-Type header for JSON response

The handler returns JSON data but doesn't set the appropriate Content-Type header.

func (t *TelemetryServer) knownPeersHandler(w http.ResponseWriter, _ *http.Request) {
+	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusOK)
	peers := t.GetKnownPeers()
	data, err := json.Marshal(peers)
	if err != nil {
		t.logger.Error().Err(err).Msg("Failed to marshal known peers")
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	fmt.Fprintf(w, "%s", string(data))
}
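
One caveat about the handler above: once `WriteHeader(http.StatusOK)` has been called, net/http ignores any later status code, so the `http.Error` call on the marshal-failure path can no longer produce a 500. A possible ordering, sketched here with a hypothetical `jsonHandler` helper rather than the actual zetaclient handler, is to marshal first and only then commit headers and status.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// jsonHandler is a hypothetical helper: it marshals the value returned by
// load first, and only then commits the headers and status code.
func jsonHandler(load func() any) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		data, err := json.Marshal(load())
		if err != nil {
			// Nothing has been written yet, so the 500 reaches the client.
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusOK)
		_, _ = w.Write(data)
	}
}

// call runs h against an in-memory recorder and returns the recorder.
func call(h http.Handler) *httptest.ResponseRecorder {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/knownpeers", nil))
	return rec
}

func main() {
	ok := call(jsonHandler(func() any { return []string{"peerA"} }))
	fmt.Println(ok.Code, ok.Header().Get("Content-Type"), ok.Body.String())

	// Channels cannot be marshaled to JSON, which forces the error path.
	bad := call(jsonHandler(func() any { return make(chan int) }))
	fmt.Println(bad.Code)
}
```

The success path still sets `Content-Type: application/json` as the review suggests; the only change is that the status line is written after marshaling has succeeded.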
🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 308-313: zetaclient/metrics/telemetry.go#L308-L313
Added lines #L308 - L313 were not covered by tests

zetaclient/tss/healthcheck.go (1)

99-102: ⚠️ Potential issue

Incorrect Metric Updated in knownPeersCounter

In the knownPeersCounter function, the metric NumConnectedPeersMetric is being updated instead of NumKnownPeersMetric. This misalignment could lead to inaccurate tracking of known peers versus connected peers.

Please apply the following fix to ensure the correct metric is updated:

knownPeersCounter := func(_ context.Context, _ *ticker.Ticker) error {
	peers := server.GetKnownPeers()
-	p.NumConnectedPeersMetric.Set(float64(len(peers)))
+	p.NumKnownPeersMetric.Set(float64(len(peers)))
	p.Telemetry.SetKnownPeers(peers)
	return nil
}
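
A mix-up like this is straightforward to catch with a test that feeds the two counters lists of different lengths. The sketch below is an assumption-laden illustration, not the zetaclient code: `gauge` stands in for `prometheus.Gauge` (only `Set` is modeled), and `peerSource`, `healthMetrics`, `fakeSource`, and the `GetConnectedPeers` method name are hypothetical.

```go
package main

import "fmt"

// gauge is a minimal stand-in for prometheus.Gauge; only Set is modeled.
type gauge struct{ value float64 }

func (g *gauge) Set(v float64) { g.value = v }

// peerSource is a hypothetical view of the server's two peer queries.
// GetConnectedPeers is an assumed name, chosen to mirror GetKnownPeers.
type peerSource interface {
	GetKnownPeers() []string
	GetConnectedPeers() []string
}

// healthMetrics mirrors the two gauge fields named in the review.
type healthMetrics struct {
	NumKnownPeersMetric     *gauge
	NumConnectedPeersMetric *gauge
}

// update records each list length on its own gauge.
func (m healthMetrics) update(src peerSource) {
	m.NumKnownPeersMetric.Set(float64(len(src.GetKnownPeers())))
	m.NumConnectedPeersMetric.Set(float64(len(src.GetConnectedPeers())))
}

// fakeSource returns fixed lists so the gauges can be checked deterministically.
type fakeSource struct{ known, connected []string }

func (f fakeSource) GetKnownPeers() []string     { return f.known }
func (f fakeSource) GetConnectedPeers() []string { return f.connected }

func main() {
	m := healthMetrics{
		NumKnownPeersMetric:     &gauge{},
		NumConnectedPeersMetric: &gauge{},
	}
	m.update(fakeSource{
		known:     []string{"peerA", "peerB", "peerC"},
		connected: []string{"peerA"},
	})
	fmt.Println("known:", m.NumKnownPeersMetric.value)
	fmt.Println("connected:", m.NumConnectedPeersMetric.value)
}
```

Because the fake lists have different lengths, writing the known-peer count to the connected-peer gauge would show up immediately as a wrong value on one of the two gauges.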
🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 99-99: zetaclient/tss/healthcheck.go#L99
Added line #L99 was not covered by tests


[warning] 102-102: zetaclient/tss/healthcheck.go#L102
Added line #L102 was not covered by tests

@gartnera gartnera added this pull request to the merge queue Nov 26, 2024
Merged via the queue into develop with commit ed2352f Nov 26, 2024
40 of 41 checks passed
@gartnera gartnera deleted the fix-connectedpeer-metrics branch November 26, 2024 17:07