This project sets up a monitoring infrastructure for a multi-machine AWS environment. The infrastructure consists of five machines with different configurations, each serving a specific purpose. Wazuh, Splunk, Uptime Robot, New Relic, and PagerDuty together provide monitoring and incident management.
-
CONTROL Machine:
- OS Image: Ubuntu Server 22.04 LTS
- AWS Instance Type: c5.large
- Public IP Address: Elastic IP
- Private IP Address: 172.31.0.100
- Volume: 30 GiB
- Purpose: Central control machine for managing monitoring infrastructure.
-
LUX1 Machine:
- OS Image: Ubuntu Server 22.04 LTS
- AWS Instance Type: t3.small
- Public IP Address: Elastic IP
- Private IP Address: 172.31.0.101
- Volume: 8 GiB
- Purpose: Hosting Apache 2 HTTP server and being monitored by Wazuh.
-
LUX2 Machine:
- OS Image: Red Hat Enterprise Linux 9
- AWS Instance Type: t3.small
- Public IP Address: Elastic IP
- Private IP Address: 172.31.0.102
- Volume: 10 GiB
- Purpose: Hosting NGINX with HTTPS and being monitored by Wazuh.
-
WIN22 Machine:
- OS Image: Microsoft Windows Server 2022
- AWS Instance Type: c5.large
- Public IP Address: Elastic IP
- Private IP Address: 172.31.0.110
- Volume: 30 GiB
- Purpose: Hosting IIS with HTTPS and being monitored by Wazuh.
-
WIN11 Machine:
- OS Image: Windows 11
- AWS Instance Type: c5.large
- Public IP Address: Elastic IP
- Private IP Address: 172.31.0.111
- Volume: 64 GiB
- Purpose: Being monitored by Wazuh.
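
The machine list above can also be kept as data, so that later steps (launching instances, enrolling agents, creating monitors) read from one place. The following is only a sketch: the `Machine` class, the `validate` helper, and the `172.31.0.0/24` subnet are assumptions made for illustration and are not part of the project.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Machine:
    name: str
    os_image: str
    instance_type: str
    private_ip: str
    volume_gib: int


# Values copied from the machine list above.
MACHINES = [
    Machine("CONTROL", "Ubuntu Server 22.04 LTS", "c5.large", "172.31.0.100", 30),
    Machine("LUX1", "Ubuntu Server 22.04 LTS", "t3.small", "172.31.0.101", 8),
    Machine("LUX2", "Red Hat Enterprise Linux 9", "t3.small", "172.31.0.102", 10),
    Machine("WIN22", "Microsoft Windows Server 2022", "c5.large", "172.31.0.110", 30),
    Machine("WIN11", "Windows 11", "c5.large", "172.31.0.111", 64),
]

# Assumption: every machine sits in one subnet. The real CIDR block is not
# stated in this document, so 172.31.0.0/24 is a placeholder.
SUBNET = ip_network("172.31.0.0/24")


def validate(machines, subnet=SUBNET):
    """Return a list of problems; an empty list means the inventory is consistent."""
    problems = []
    seen = {}  # private IP -> name of the first machine that used it
    for m in machines:
        if ip_address(m.private_ip) not in subnet:
            problems.append(f"{m.name}: {m.private_ip} is outside {subnet}")
        if m.private_ip in seen:
            problems.append(f"{m.name}: {m.private_ip} already used by {seen[m.private_ip]}")
        seen.setdefault(m.private_ip, m.name)
    return problems


if __name__ == "__main__":
    for line in validate(MACHINES) or ["inventory looks consistent"]:
        print(line)
```

Running the check before launching anything catches duplicated or mistyped private addresses early.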
-
Uptime Robot:
- Purpose: Monitoring the availability of all machines and services.
-
New Relic:
- Purpose: Comprehensive monitoring of all machines to gather performance metrics and insights.
-
Wazuh:
- Purpose: Intrusion detection and monitoring solution, integrated with various machines and services for real-time threat detection.
-
Splunk:
- Purpose: Log management and analysis tool, used to monitor the machines and analyze logs for security and performance issues.
-
PagerDuty:
- Purpose: Incident management and alerting platform, integrated with Wazuh to receive and manage alerts triggered by security events.
-
Setting up Machines:
- Launch the AWS instances for each machine according to the specified configurations.
- Assign Elastic IP addresses to ensure static public IP addresses for easy access and management.
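
The two steps above could be scripted with boto3 instead of the AWS console. This is a hedged sketch, not the project's actual procedure: the AMI ID, subnet ID, and root device name are placeholders that depend on the region and image, and the function names are made up for this example.

```python
def build_run_instances_params(name, instance_type, private_ip, volume_gib,
                               ami_id, subnet_id, root_device="/dev/sda1"):
    """Build keyword arguments for the EC2 run_instances call.

    root_device is an assumption: the correct device name depends on the AMI.
    """
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "SubnetId": subnet_id,
        "PrivateIpAddress": private_ip,
        "BlockDeviceMappings": [
            {"DeviceName": root_device, "Ebs": {"VolumeSize": volume_gib}}
        ],
        "TagSpecifications": [
            {"ResourceType": "instance", "Tags": [{"Key": "Name", "Value": name}]}
        ],
    }


def launch_with_elastic_ip(params):
    """Launch one instance, wait until it runs, then attach a new Elastic IP."""
    import boto3  # third-party; imported here so the builder works without it

    ec2 = boto3.client("ec2")
    instance_id = ec2.run_instances(**params)["Instances"][0]["InstanceId"]
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    allocation = ec2.allocate_address(Domain="vpc")
    ec2.associate_address(InstanceId=instance_id,
                          AllocationId=allocation["AllocationId"])
    return instance_id, allocation["PublicIp"]


if __name__ == "__main__":
    # Placeholder IDs: replace with a real AMI and subnet before launching.
    params = build_run_instances_params(
        "LUX1", "t3.small", "172.31.0.101", 8,
        ami_id="ami-EXAMPLE", subnet_id="subnet-EXAMPLE",
    )
    print(params)
```

Keeping the parameter builder separate from the API call makes it possible to review what would be launched without touching an AWS account.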
-
Installing Monitoring Tools:
- On the CONTROL Machine:
  - Install and configure Wazuh following the provided documentation.
  - Set up Splunk for log management and analysis, ensuring it's properly integrated with Wazuh.
- On each monitored machine (LUX1 on Ubuntu, LUX2 on Red Hat, WIN22 on Windows Server, WIN11 on Windows 11):
  - Install and configure the Wazuh agent for that operating system.
  - Install and set up any additional software required (e.g., Apache, NGINX, IIS) according to the machine's purpose.
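
Before enrolling an agent, it can help to confirm that the monitored machine can reach the Wazuh manager. The sketch below assumes the manager runs on the CONTROL machine (172.31.0.100) and listens on Wazuh's default ports, 1514/TCP for agent traffic and 1515/TCP for enrollment. Adjust both if the deployment differs. The function names are illustrative.

```python
import socket

# Assumption: the Wazuh manager runs on the CONTROL machine.
MANAGER_IP = "172.31.0.100"
# Wazuh default ports; a customized deployment may use others.
WAZUH_PORTS = {1514: "agent connection", 1515: "agent enrollment"}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_manager(host=MANAGER_IP, ports=WAZUH_PORTS):
    """Map each expected manager port to whether it is reachable."""
    return {port: port_is_open(host, port) for port in ports}


if __name__ == "__main__":
    for port, reachable in check_manager().items():
        state = "reachable" if reachable else "NOT reachable"
        print(f"{MANAGER_IP}:{port} ({WAZUH_PORTS[port]}): {state}")
```

If a port is reported as not reachable, the usual suspects are the AWS security group rules and the host firewall on the CONTROL machine.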
-
Integrating Services:
- PagerDuty Integration:
  - Configure PagerDuty to receive alerts from Wazuh for incident management.
- Splunk Integration:
  - Ensure Splunk is configured to ingest logs from all machines for centralized log analysis.
- New Relic Integration:
  - Set up New Relic to monitor all machines and gather performance metrics. Configure alerts for critical thresholds.
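
To illustrate what the Wazuh to PagerDuty hand-off involves, the sketch below turns a Wazuh-style alert into a PagerDuty Events API v2 trigger event. Wazuh ships its own PagerDuty integration, configured in `ossec.conf`, which is likely the simpler route. This code only shows the shape of the data. The severity thresholds and function names are assumptions, and the routing key is a placeholder.

```python
import json
import urllib.request

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def severity_for(level):
    """Map a Wazuh rule level to a PagerDuty severity.

    Assumption: these thresholds are illustrative, not an official mapping.
    """
    if level >= 12:
        return "critical"
    if level >= 7:
        return "error"
    if level >= 4:
        return "warning"
    return "info"


def build_event(alert, routing_key):
    """Build an Events API v2 'trigger' body from a Wazuh alert dict."""
    rule = alert.get("rule", {})
    agent = alert.get("agent", {})
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": rule.get("description", "Wazuh alert"),
            "source": agent.get("name", "unknown"),
            "severity": severity_for(int(rule.get("level", 0))),
        },
    }


def send_event(event, url=PAGERDUTY_EVENTS_URL, timeout=10):
    """POST the event as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    sample = {"rule": {"level": 10, "description": "Multiple failed logins"},
              "agent": {"name": "LUX1"}}
    # Placeholder key: printing only, nothing is sent.
    print(json.dumps(build_event(sample, "ROUTING_KEY_PLACEHOLDER"), indent=2))
```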
-
Configuring Uptime Robot:
- Log in to Uptime Robot and create monitors for each machine and service to track their availability.
- Configure alert notifications to promptly address any downtime incidents.
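
Monitors can also be created through the Uptime Robot API rather than the web interface. The sketch below builds form bodies for the v2 `newMonitor` endpoint as I understand it (type 1 for HTTP(S), type 3 for ping). Verify the parameters against the current API documentation before relying on it. The public addresses are placeholders from the documentation range, because the real Elastic IPs are not listed here, and the choice of ping monitors for CONTROL and WIN11 is an assumption.

```python
from urllib.parse import urlencode

NEW_MONITOR_URL = "https://api.uptimerobot.com/v2/newMonitor"
HTTP_MONITOR = 1  # HTTP(S) check
PING_MONITOR = 3  # ICMP ping check

# Placeholder targets (203.0.113.0/24 is reserved for documentation).
# Replace them with the machines' real Elastic IPs or hostnames.
TARGETS = [
    ("LUX1 Apache", "http://203.0.113.101", HTTP_MONITOR),
    ("LUX2 NGINX", "https://203.0.113.102", HTTP_MONITOR),
    ("WIN22 IIS", "https://203.0.113.110", HTTP_MONITOR),
    ("CONTROL", "203.0.113.100", PING_MONITOR),
    ("WIN11", "203.0.113.111", PING_MONITOR),
]


def build_monitor_form(api_key, friendly_name, url, monitor_type):
    """Return the URL-encoded form body for one newMonitor request."""
    return urlencode({
        "api_key": api_key,
        "format": "json",
        "type": monitor_type,
        "friendly_name": friendly_name,
        "url": url,
    })


if __name__ == "__main__":
    # Printing only; POST each body to NEW_MONITOR_URL with a real API key.
    for name, url, kind in TARGETS:
        print(build_monitor_form("API_KEY_PLACEHOLDER", name, url, kind))
```

Ping monitors only work if the instance's security group allows inbound ICMP.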
-
Maintenance and Monitoring:
- Regularly review the dashboards provided by New Relic, Splunk, and Wazuh for anomalies or security threats.
- Respond promptly to alerts delivered through PagerDuty, investigating each incident and taking the necessary action.
- Perform routine maintenance tasks such as updating software, reviewing logs, and optimizing configurations to ensure optimal performance and security.
- Conduct periodic reviews of the monitoring setup to identify areas for improvement and optimization.
- [Gonçalo Resendes]