feat(system): create a package to monitor component containers (autowarefoundation#7094)

Signed-off-by: Mehmet Emin BAŞOĞLU <[email protected]>
mebasoglu authored and Ariiees committed Jul 22, 2024
1 parent 2dfa8ff commit 479cdda
Showing 10 changed files with 586 additions and 0 deletions.
31 changes: 31 additions & 0 deletions system/autoware_component_monitor/CMakeLists.txt
@@ -0,0 +1,31 @@
cmake_minimum_required(VERSION 3.8)
project(autoware_component_monitor)

find_package(autoware_cmake REQUIRED)
autoware_package()

find_package(Boost REQUIRED COMPONENTS
filesystem
)

ament_auto_add_library(${PROJECT_NAME} SHARED
src/component_monitor_node.cpp
)
target_link_libraries(${PROJECT_NAME} ${Boost_LIBRARIES})

rclcpp_components_register_node(${PROJECT_NAME}
PLUGIN "autoware::component_monitor::ComponentMonitor"
EXECUTABLE ${PROJECT_NAME}_node
)

if(BUILD_TESTING)
ament_add_ros_isolated_gtest(test_unit_conversions test/test_unit_conversions.cpp)
target_link_libraries(test_unit_conversions ${PROJECT_NAME})
target_include_directories(test_unit_conversions PRIVATE src)
endif()

ament_auto_package(
INSTALL_TO_SHARE
config
launch
)
84 changes: 84 additions & 0 deletions system/autoware_component_monitor/README.md
@@ -0,0 +1,84 @@
# autoware_component_monitor

The `autoware_component_monitor` package monitors the system resource usage of component containers.
The composable node in the package is loaded into a component container and publishes the CPU and memory usage of
that container.

## Inputs / Outputs

### Input

None.

### Output

| Name | Type | Description |
| -------------------------- | -------------------------------------------------- | ---------------------- |
| `~/component_system_usage` | `autoware_internal_msgs::msg::ResourceUsageReport` | CPU, Memory usage etc. |

## Parameters

### Core Parameters

{{ json_to_markdown("system/autoware_component_monitor/schema/component_monitor.schema.json") }}

## How to use

Add it as a composable node in your launch file:

```xml

<launch>
<group>
<push-ros-namespace namespace="your_namespace"/>
...

<load_composable_node target="$(var container_name)">
<composable_node pkg="autoware_component_monitor"
plugin="autoware::component_monitor::ComponentMonitor"
name="component_monitor">
<param from="$(find-pkg-share autoware_component_monitor)/config/component_monitor.param.yaml"/>
</composable_node>
</load_composable_node>

...
</group>
</launch>
```

### Quick testing

You can test the package by running the following command:

```bash
ros2 component load <container_name> autoware_component_monitor autoware::component_monitor::ComponentMonitor -p publish_rate:=10.0 --node-namespace <namespace>

# Example usage
ros2 component load /pointcloud_container autoware_component_monitor autoware::component_monitor::ComponentMonitor -p publish_rate:=10.0 --node-namespace /pointcloud_container
```

## How it works

The package uses the `top` command under the hood.
The command `top -b -n 1 -E k -p PID` is run at the configured `publish_rate` (5 Hz in the default parameter file) to get the system usage of the process.

- `-b` activates batch mode. By default, `top` doesn't exit and prints to stdout periodically. Batch mode allows
the program to exit.
- `-n` sets the number of times `top` prints the system usage in batch mode.
- `-p` specifies the PID of the process to monitor.
- `-E k` changes the memory unit in the summary section to KiB.

Here is a sample output:

```text
top - 13:57:26 up 3:14, 1 user, load average: 1,09, 1,10, 1,04
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0,0 us, 0,8 sy, 0,0 ni, 99,2 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
KiB Mem : 65532208 total, 35117428 free, 17669824 used, 12744956 buff/cache
KiB Swap: 39062524 total, 39062524 free, 0 used. 45520816 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3352 meb 20 0 2905940 1,2g 39292 S 0,0 2,0 23:24.01 awesome
```

From the last line, we take the fields at zero-based indices 5 and 8, which are RES and %CPU respectively.
The total and free memory values are read from the `KiB Mem` line.
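
The field extraction described above can be sketched in standalone C++. This is a minimal illustration rather than the node's actual code: `split_into_words`, `RawUsage`, `extract_raw_usage`, and `sample_top_output` are hypothetical names, and the sample text is the output shown above rewritten with `.` as the decimal separator, which is what `top` is expected to print once `LC_NUMERIC` is set to `en_US.UTF-8`.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

using Words = std::vector<std::string>;

// Split text into lines, then split each line into whitespace-separated words.
std::vector<Words> split_into_words(const std::string & text)
{
  std::vector<Words> fields;
  std::istringstream input{text};
  std::string line;
  while (std::getline(input, line)) {
    std::istringstream iss_line{line};
    Words words;
    std::string word;
    while (iss_line >> word) {
      words.push_back(word);
    }
    fields.push_back(words);
  }
  return fields;
}

// Raw values picked out of the parsed output (illustrative struct, not a ROS message).
struct RawUsage
{
  std::string res;          // RES column of the process line, e.g. "1.2g"
  float cpu_percent;        // %CPU column of the process line
  std::uint64_t total_kib;  // "total" value of the "KiB Mem" line
  std::uint64_t free_kib;   // "free" value of the "KiB Mem" line
};

// Assumes the layout of the sample: memory summary on the 4th line, process on the last.
// std::vector::at throws std::out_of_range if the layout is different.
RawUsage extract_raw_usage(const std::vector<Words> & fields)
{
  const Words & mem_line = fields.at(3);
  const Words & process_line = fields.back();

  RawUsage usage;
  usage.res = process_line.at(5);
  usage.cpu_percent = std::stof(process_line.at(8));
  usage.total_kib = std::stoull(mem_line.at(3));
  usage.free_kib = std::stoull(mem_line.at(5));
  return usage;
}

// The sample output from above, with '.' as the decimal separator.
std::string sample_top_output()
{
  return
    "top - 13:57:26 up 3:14, 1 user, load average: 1.09, 1.10, 1.04\n"
    "Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie\n"
    "%Cpu(s): 0.0 us, 0.8 sy, 0.0 ni, 99.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st\n"
    "KiB Mem : 65532208 total, 35117428 free, 17669824 used, 12744956 buff/cache\n"
    "KiB Swap: 39062524 total, 39062524 free, 0 used. 45520816 avail Mem\n"
    "PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND\n"
    "3352 meb 20 0 2905940 1.2g 39292 S 0.0 2.0 23:24.01 awesome\n";
}
```

Calling `extract_raw_usage(split_into_words(sample_top_output()))` yields the RES string `1.2g` and the memory totals in KiB. The RES string still carries its unit suffix and needs a separate conversion to bytes.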
@@ -0,0 +1,3 @@
/**:
ros__parameters:
publish_rate: 5.0 # Hz
@@ -0,0 +1,9 @@
<launch>
<arg name="param_file" default="$(find-pkg-share autoware_component_monitor)/config/component_monitor.param.yaml"/>

<node_container pkg="rclcpp_components" exec="component_container_mt" name="component_monitor_container" namespace="/">
<composable_node pkg="autoware_component_monitor" plugin="autoware::component_monitor::ComponentMonitor" name="component_monitor">
<param from="$(var param_file)"/>
</composable_node>
</node_container>
</launch>
25 changes: 25 additions & 0 deletions system/autoware_component_monitor/package.xml
@@ -0,0 +1,25 @@
<?xml version="1.0"?>
<?xml-model href="http://download.ros.org/schema/package_format3.xsd" schematypens="http://www.w3.org/2001/XMLSchema"?>
<package format="3">
<name>autoware_component_monitor</name>
<version>0.0.0</version>
<description>A ROS 2 package to monitor system usage of component containers.</description>
<maintainer email="[email protected]">Mehmet Emin Başoğlu</maintainer>
<license>Apache-2.0</license>

<buildtool_depend>ament_cmake_auto</buildtool_depend>
<buildtool_depend>autoware_cmake</buildtool_depend>

<depend>autoware_internal_msgs</depend>
<depend>libboost-filesystem-dev</depend>
<depend>rclcpp</depend>
<depend>rclcpp_components</depend>

<test_depend>ament_cmake_ros</test_depend>
<test_depend>ament_lint_auto</test_depend>
<test_depend>autoware_lint_common</test_depend>

<export>
<build_type>ament_cmake</build_type>
</export>
</package>
@@ -0,0 +1,30 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "Parameters for the Component Monitor node",
"type": "object",
"definitions": {
"component_monitor": {
"type": "object",
"properties": {
"publish_rate": {
"type": "number",
"default": 5.0,
"description": "Publish rate in Hz"
}
},
"required": ["publish_rate"]
}
},
"properties": {
"/**": {
"type": "object",
"properties": {
"ros__parameters": {
"$ref": "#/definitions/component_monitor"
}
},
"required": ["ros__parameters"]
}
},
"required": ["/**"]
}
177 changes: 177 additions & 0 deletions system/autoware_component_monitor/src/component_monitor_node.cpp
@@ -0,0 +1,177 @@
// Copyright 2024 The Autoware Foundation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "component_monitor_node.hpp"

#include "unit_conversions.hpp"

#include <rclcpp/rclcpp.hpp>

#include <autoware_internal_msgs/msg/resource_usage_report.hpp>

#include <boost/process.hpp>

#include <cctype>
#include <cstdint>
#include <exception>
#include <functional>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

namespace autoware::component_monitor
{
ComponentMonitor::ComponentMonitor(const rclcpp::NodeOptions & node_options)
: Node("component_monitor", node_options), publish_rate_(declare_parameter<double>("publish_rate"))
{
usage_pub_ =
create_publisher<ResourceUsageReport>("~/component_system_usage", rclcpp::SensorDataQoS());

// Make sure top is installed and is in path
const auto path_top = boost::process::search_path("top");
if (path_top.empty()) {
RCLCPP_ERROR_STREAM(get_logger(), "Couldn't find 'top' in path.");
rclcpp::shutdown();
}

// Get the PID of the current process
int pid = getpid();

environment_ = boost::this_process::environment();
environment_["LC_NUMERIC"] = "en_US.UTF-8";

on_timer_tick_wrapped_ = std::bind(&ComponentMonitor::on_timer_tick, this, pid);

timer_ = rclcpp::create_timer(
this, get_clock(), rclcpp::Rate(publish_rate_).period(), on_timer_tick_wrapped_);
}

void ComponentMonitor::on_timer_tick(const int pid) const
{
if (usage_pub_->get_subscription_count() == 0) return;

try {
auto usage_msg = pid_to_report(pid);
usage_msg.header.stamp = this->now();
usage_msg.pid = pid;
usage_pub_->publish(usage_msg);
} catch (std::exception & e) {
RCLCPP_ERROR(get_logger(), "%s", e.what());
} catch (...) {
RCLCPP_ERROR(get_logger(), "An unknown error occurred.");
}
}

ComponentMonitor::ResourceUsageReport ComponentMonitor::pid_to_report(const pid_t & pid) const
{
const auto std_out = run_system_command("top -b -n 1 -E k -p " + std::to_string(pid));

const auto fields = parse_lines_into_words(std_out);

ResourceUsageReport report;
report.cpu_cores_utilized = std::stof(fields.back().at(8)) / 100.0f;
report.total_memory_bytes = unit_conversions::kib_to_bytes(std::stoul(fields.at(3).at(3)));
report.free_memory_bytes = unit_conversions::kib_to_bytes(std::stoul(fields.at(3).at(5)));
report.process_memory_bytes = parse_memory_res(fields.back().at(5));

return report;
}

std::stringstream ComponentMonitor::run_system_command(const std::string & cmd) const
{
int out_fd[2];
if (pipe2(out_fd, O_CLOEXEC) != 0) {
RCLCPP_ERROR_STREAM(get_logger(), "Error setting flags on out_fd");
}
boost::process::pipe out_pipe{out_fd[0], out_fd[1]};
boost::process::ipstream is_out{std::move(out_pipe)};

int err_fd[2];
if (pipe2(err_fd, O_CLOEXEC) != 0) {
RCLCPP_ERROR_STREAM(get_logger(), "Error setting flags on err_fd");
}
boost::process::pipe err_pipe{err_fd[0], err_fd[1]};
boost::process::ipstream is_err{std::move(err_pipe)};

boost::process::child c(
cmd, environment_, boost::process::std_out > is_out, boost::process::std_err > is_err);
c.wait();

if (c.exit_code() != 0) {
std::ostringstream os;
is_err >> os.rdbuf();
RCLCPP_ERROR_STREAM(get_logger(), "Error running command: " << os.str());
}

std::stringstream sstream;
sstream << is_out.rdbuf();
return sstream;
}

ComponentMonitor::VecVecStr ComponentMonitor::parse_lines_into_words(
const std::stringstream & std_out)
{
VecVecStr fields;
std::string line;
std::istringstream input{std_out.str()};

while (std::getline(input, line)) {
std::istringstream iss_line{line};
std::string word;
std::vector<std::string> words;

while (iss_line >> word) {
words.push_back(word);
}

fields.push_back(words);
}

return fields;
}

std::uint64_t ComponentMonitor::parse_memory_res(const std::string & mem_res)
{
// example 1: 12.3g
// example 2: 123 (without suffix, just bytes)
static const std::unordered_map<char, std::function<std::uint64_t(double)>> unit_map{
{'k', unit_conversions::kib_to_bytes<double>}, {'m', unit_conversions::mib_to_bytes<double>},
{'g', unit_conversions::gib_to_bytes<double>}, {'t', unit_conversions::tib_to_bytes<double>},
{'p', unit_conversions::pib_to_bytes<double>}, {'e', unit_conversions::eib_to_bytes<double>}};

if (std::isdigit(mem_res.back())) {
return std::stoull(mem_res); // Handle plain bytes without any suffix
}

// Extract the numeric part and the unit suffix
double value = std::stod(mem_res.substr(0, mem_res.size() - 1));
char suffix = mem_res.back();

// Find the appropriate function from the map
auto it = unit_map.find(suffix);
if (it != unit_map.end()) {
const auto & conversion_function = it->second;
return conversion_function(value);
}

// Throw an exception or handle the error as needed if the suffix is not recognized
throw std::runtime_error("Unsupported unit suffix: " + std::string(1, suffix));
}

} // namespace autoware::component_monitor

#include <rclcpp_components/register_node_macro.hpp>
RCLCPP_COMPONENTS_REGISTER_NODE(autoware::component_monitor::ComponentMonitor)
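
The suffix handling in `parse_memory_res` depends on `unit_conversions.hpp`, which is not shown on this page. As a rough standalone sketch of the same idea, assuming binary multiples (1 KiB = 1024 bytes) and the suffix set `k`, `m`, `g`, `t`, `p`, `e` listed in the map above, the conversion could look like the following. The function name `scaled_memory_to_bytes` is hypothetical, and the sketch covers only suffixed values.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Convert a suffixed value such as "39292k" or "1.5m" to bytes.
// Assumption: suffixes denote binary multiples (KiB, MiB, GiB, TiB, PiB, EiB).
std::uint64_t scaled_memory_to_bytes(const std::string & text)
{
  if (text.empty()) {
    throw std::invalid_argument("Empty memory string");
  }

  // The position in this string gives the power of 1024: 'k' -> 1, 'm' -> 2, ...
  static const std::string units = "kmgtpe";
  const char suffix = text.back();
  const std::size_t pos = units.find(suffix);
  if (pos == std::string::npos) {
    throw std::runtime_error("Unsupported unit suffix: " + std::string(1, suffix));
  }

  // Numeric part without the suffix; std::stod throws if it is not a number.
  const double value = std::stod(text.substr(0, text.size() - 1));

  double factor = 1.0;
  for (std::size_t i = 0; i <= pos; ++i) {
    factor *= 1024.0;
  }

  // Fractional bytes are truncated.
  return static_cast<std::uint64_t>(value * factor);
}
```

For example, `scaled_memory_to_bytes("1.5m")` evaluates to 1572864. A lookup by suffix position keeps the sketch short. The map of conversion functions in the node does the same job while reusing the helpers that the unit tests cover.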