Pode Watchdog Feature (work in progress) #1416

Draft: wants to merge 45 commits into base branch develop

@mdaneri (Contributor) commented Oct 12, 2024

Summary

This pull request introduces the Pode Watchdog feature, which allows users to monitor and manage processes or scripts running within their Pode server. The Watchdog automatically tracks process status and uptime, monitors file changes, and provides API endpoints for interacting with the monitored processes. Key functionalities include process monitoring, automatic restarts, session management during restarts/shutdowns, and remote control through REST APIs.

Key Features

  • Process Monitoring: Tracks the status, uptime, and performance of processes running within Pode using a NamedPipeStream.
  • File Monitoring: Automatically restarts processes when monitored files (such as configuration files) are modified.
  • Logging Support: Logs important events and errors for debugging and auditing purposes.
  • Automatic Restarts: Ensures that monitored processes are automatically restarted if they crash or exit unexpectedly.
  • Session Management During Restarts/Shutdowns: When the service is restarting or shutting down, it waits for any open sessions to terminate before proceeding. During this time, the default HTTP response code is set to 503 (Service Unavailable).
  • Configurable Service Recovery: After a failure, the service will wait for a specified amount of time ($RestartServiceAfter) before attempting to restart the process. The process can be restarted a maximum of $MaxNumberOfRestarts times. The restart counter is reset after a successful run of $ResetFailCountAfter minutes.
  • 503 Service Management: The Watchdog can enable or disable 503 status responses for the monitored process, providing more control over service availability during critical operations.
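The recovery rules in the Configurable Service Recovery item can be read as a small decision function. The following is a minimal sketch of that reading, not the PR's implementation: the function name, the default values, and the assumption that $RestartServiceAfter is expressed in seconds are all hypothetical.

```powershell
# Hypothetical helper (not part of Pode): decides what to do after the
# monitored process has exited unexpectedly.
function Get-RestartDecisionSketch {
    param(
        [int] $FailCount,                 # restarts already performed since the last reset
        [datetime] $LastStart,            # when the failed run was started
        [datetime] $Now,                  # when the failure was detected
        [int] $MaxNumberOfRestarts = 5,   # assumed default
        [int] $ResetFailCountAfter = 5,   # minutes, as described in the feature list
        [int] $RestartServiceAfter = 60   # ASSUMPTION: unit is seconds
    )

    # A run that lasted at least $ResetFailCountAfter minutes counts as
    # successful, so the restart counter starts again from zero.
    if (($Now - $LastStart).TotalMinutes -ge $ResetFailCountAfter) {
        $FailCount = 0
    }

    # Give up once the maximum number of restarts has been used.
    if ($FailCount -ge $MaxNumberOfRestarts) {
        return [pscustomobject]@{ Restart = $false; FailCount = $FailCount; WaitSeconds = 0 }
    }

    # Otherwise wait, then restart and count this attempt.
    return [pscustomobject]@{
        Restart     = $true
        FailCount   = $FailCount + 1
        WaitSeconds = $RestartServiceAfter
    }
}
```

Keeping the decision separate from the code that actually waits and starts the process makes the policy easy to check with fixed timestamps.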

Usage Example

This feature allows users to set up and monitor a process with a Pode server and interact with the process via REST API routes. Below is a sample setup:

Start-PodeServer {
    # Define an HTTP endpoint
    Add-PodeEndpoint -Address localhost -Port 8082 -Protocol Http

    # Set up Watchdog logging
    New-PodeLoggingMethod -File -Name 'watchdog' -MaxDays 4 | Enable-PodeErrorLogging

    # Enable Watchdog monitoring for a script process
    Enable-PodeWatchdog -FilePath './scripts/myProcess.ps1' -FileMonitoring -FileExclude '*.log' -Name 'myProcessWatchdog'

    # Route to check process status
    Add-PodeRoute -Method Get -Path '/monitor/status' -ScriptBlock {
        Write-PodeJsonResponse -Value (Get-PodeWatchdogProcessMetric -Name 'myProcessWatchdog' -Type Status)
    }

    # Route to restart the process
    Add-PodeRoute -Method Post -Path '/cmd/restart' -ScriptBlock {
        Write-PodeJsonResponse -Value @{success = (Set-PodeWatchdogProcessState -Name 'myProcessWatchdog' -State Restart)}
    }
}
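On the client side, the two routes above could be called roughly as follows. This is a sketch that assumes the server from the example is running on localhost:8082; the helper names are hypothetical, and only the paths, methods, and the success field come from the example.

```powershell
# Hypothetical helper: builds the Invoke-RestMethod arguments for one of the
# two routes defined in the example above.
function Get-WatchdogRequestSketch {
    param(
        [Parameter(Mandatory = $true)]
        [ValidateSet('Status', 'Restart')]
        [string] $Action,

        [string] $BaseUrl = 'http://localhost:8082'
    )

    switch ($Action) {
        'Status'  { return @{ Uri = "$BaseUrl/monitor/status"; Method = 'Get' } }
        'Restart' { return @{ Uri = "$BaseUrl/cmd/restart"; Method = 'Post' } }
    }
}

# Hypothetical helper: sends the request and returns the parsed JSON response.
function Invoke-WatchdogActionSketch {
    param(
        [Parameter(Mandatory = $true)]
        [string] $Action,

        [string] $BaseUrl = 'http://localhost:8082'
    )

    $request = Get-WatchdogRequestSketch -Action $Action -BaseUrl $BaseUrl
    Invoke-RestMethod @request
}
```

With the server running, `Invoke-WatchdogActionSketch -Action Status` returns the status metric, and `(Invoke-WatchdogActionSketch -Action Restart).success` reports whether the restart request was accepted.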

Documentation

Full documentation for the Pode Watchdog feature has been included, covering:

  • Feature Overview
  • Key Features
  • Setup Instructions
  • Process Control and Monitoring Commands
  • REST API Integration

This feature depends on #1387 and is not complete without it.

@mdaneri changed the title from "Pode Watchdog Feature" to "Pode Watchdog Feature (work in progress)" Oct 12, 2024
@@ -40,7 +40,6 @@
function Add-PodeRunspace {
param(
[Parameter(Mandatory = $true)]
[ValidateSet('Main', 'Signals', 'Schedules', 'Gui', 'Web', 'Smtp', 'Tcp', 'Tasks', 'WebSockets', 'Files', 'Timers')]

Should we remove this type of validation? This feels like a good case for Enums.

Contributor Author

This line will go away with the async route changes in #1349.

src/Public/Core.ps1 (outdated)
Comment on lines +3790 to +3791
# If the object is neither a hashtable, ordered dictionary, nor array, return it as-is
return $InputObject

Does this mean we should always pass values through ConvertTo-PodeConcurrentStructure? Always converting would reduce the cognitive overhead. Maybe add another check at the top that returns the input unchanged if it has already been converted, to keep the cost low.

Contributor Author

I created this function primarily for #1391 (thread-safe PodeState).
Its purpose is to convert any hashtable or array that has to be thread-safe into a thread-safe object:
System.Collections.Concurrent.ConcurrentDictionary for a hashtable and a System.Collections type for an array.

It's not meant to be used for every hashtable or array.
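To make the discussion concrete, a conversion of this kind could look roughly like the sketch below. It is not the function from the PR: the name is hypothetical, the synchronized ArrayList used for arrays is an assumption (the comment above does not name the exact collection type), and the early return illustrates the reviewer's suggestion.

```powershell
# Hypothetical sketch (not the PR's function): recursively converts
# hashtables and arrays into thread-safe collections.
function ConvertTo-ThreadSafeSketch {
    param(
        [Parameter()]
        [AllowNull()]
        [object] $InputObject
    )

    # Already converted: return as-is to keep repeated calls cheap.
    if ($InputObject -is [System.Collections.Concurrent.ConcurrentDictionary[string, object]]) {
        return $InputObject
    }

    # Hashtables and ordered dictionaries become a ConcurrentDictionary.
    if ($InputObject -is [System.Collections.IDictionary]) {
        $dict = [System.Collections.Concurrent.ConcurrentDictionary[string, object]]::new()
        foreach ($key in $InputObject.Keys) {
            $dict[[string]$key] = ConvertTo-ThreadSafeSketch -InputObject $InputObject[$key]
        }
        return $dict
    }

    # ASSUMPTION: arrays become a synchronized ArrayList.
    if ($InputObject -is [array]) {
        $list = [System.Collections.ArrayList]::new()
        foreach ($item in $InputObject) {
            [void] $list.Add((ConvertTo-ThreadSafeSketch -InputObject $item))
        }
        # The unary comma stops PowerShell from unrolling the list on output.
        return , [System.Collections.ArrayList]::Synchronized($list)
    }

    # Neither a dictionary nor an array: return it as-is.
    return $InputObject
}
```

Note that a synchronized ArrayList only protects individual operations; enumerating it while another thread modifies it still needs an explicit lock.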

function Set-PodeWatchdogHearthbeatStatus {
param(
[Parameter(Mandatory = $true)]
[ValidateSet('Starting', 'Restarting', 'Running', 'Undefined', 'Stopping', 'Stopped', 'Offline')]

Another place I think it would be helpful to have enums and maybe a watchdog class.

Contributor Author

I’ll try with an enum, but they are normally a pain in PowerShell.

I usually have an Enum folder with a .ps1 file, which I dot-source or place at the top when the module is compiled. If it's a function that's only ever used by internal developers, it may not be worth the hassle.
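As an illustration of the enum approach discussed in this thread, the ValidateSet on the heartbeat status could be replaced along these lines. This is a sketch only: the enum and function names are hypothetical, and the values are copied from the ValidateSet shown in the diff.

```powershell
# Hypothetical enum holding the values from the ValidateSet in the diff.
enum WatchdogHeartbeatStatusSketch {
    Starting
    Restarting
    Running
    Undefined
    Stopping
    Stopped
    Offline
}

# Hypothetical function: the enum-typed parameter replaces [ValidateSet(...)].
function Set-HeartbeatStatusSketch {
    param(
        [Parameter(Mandatory = $true)]
        [WatchdogHeartbeatStatusSketch] $Status
    )

    # Strings such as 'Running' are converted to the enum during parameter
    # binding; values that are not enum members fail before this line runs.
    return "Heartbeat status set to $Status"
}
```

Calling `Set-HeartbeatStatusSketch -Status 'Running'` works with a plain string, and tab completion still offers the enum values. In a module, the enum has to be defined before any function that references it is loaded, which is what the dot-sourcing arrangement described above provides.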

@mdaneri mdaneri marked this pull request as draft October 22, 2024 01:48
@mdaneri mdaneri marked this pull request as ready for review November 2, 2024 17:26
@mdaneri mdaneri marked this pull request as draft November 2, 2024 17:26
2 participants