feat(system): create a package to monitor component containers #7094

Merged (54 commits) on Jul 12, 2024

@mebasoglu mebasoglu commented May 22, 2024

Description

This PR adds a new package, autoware_component_monitor, under the system directory. It attaches to existing component containers and monitors their CPU and memory usage.

Related links

Issue:

Corresponding message type PR:

Tests performed

The logging simulator was run with the sample data from the Autoware tutorials. The composable node was attached to pointcloud_container using the launch file inside the package, and it published the system usage of the container process. Here is a sample topic output:

---
header:
  stamp:
    sec: 1718094543
    nanosec: 106255881
  frame_id: component_monitor
pid: 225916
cpu_usage_percentage: 53.29999923706055
total_memory_bytes: 67104985088
free_memory_bytes: 7425822720
used_memory_bytes: 759532
memory_usage_percentage: 1.2000000476837158

Here is a sample visualization of pointcloud_container with the sample bag on Foxglove:

[Image: component_monitor]

Notes for reviewers

The CPU monitor inside the system_monitor package uses boost::process to run the mpstat command, captures its stdout, and parses it to extract the necessary information.
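The run-and-capture step can be sketched as below. This is a minimal illustration only: it swaps boost::process for POSIX popen() to stay dependency-free, and the function name run_and_capture is hypothetical, not something taken from system_monitor or this package.

```cpp
#include <array>
#include <cstdio>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical helper: runs `command` through the shell and returns
// everything it wrote to stdout. POSIX popen() is used here as a stand-in
// for the boost::process call described in the PR.
std::string run_and_capture(const std::string & command)
{
  // pclose() is called automatically when `pipe` goes out of scope.
  std::unique_ptr<FILE, int (*)(FILE *)> pipe(popen(command.c_str(), "r"), pclose);
  if (!pipe) {
    throw std::runtime_error("popen() failed for: " + command);
  }

  std::string output;
  std::array<char, 256> buffer{};
  // Read the child's stdout chunk by chunk until EOF.
  while (fgets(buffer.data(), static_cast<int>(buffer.size()), pipe.get()) != nullptr) {
    output += buffer.data();
  }
  return output;
}
```

With this helper, a call such as run_and_capture("echo hello") returns the text the command printed, which can then be handed to a parser.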

This package uses the same mechanism, but runs the top command instead to get both CPU and memory usage.
Running top -b -d 0.1 -n 1 -p PID produces output like the following:

top - 15:48:04 up  4:50,  1 user,  load average: 6,59, 4,29, 3,46
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3,8 us,  0,9 sy,  0,0 ni, 94,5 id,  0,4 wa,  0,0 hi,  0,4 si,  0,0 st
MiB Mem :  63996,4 total,   5108,0 free,  40659,5 used,  18228,9 buff/cache
MiB Swap:   2048,0 total,    906,5 free,   1141,5 used.  21457,1 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 573147 meb       20   0   16,6g   2,0g 549812 S   6,7   3,2   0:05.55 component_conta
  • %MEM is calculated from the RES field, which is the resident (physical) memory used by the process. The node parses both fields to report memory usage.

  • The %CPU field is parsed to get the CPU usage rate.

  • The LC_NUMERIC environment variable determines the output format of floating-point numbers. On my computer it is LC_NUMERIC=tr_TR.UTF-8, so top uses , as the decimal separator. To ensure that the separator is always ., this environment variable is overridden.

env["LC_NUMERIC"] = "en_US.UTF-8"; // To make sure that decimal separator is a dot.

  • A while (rclcpp::ok()) loop with an rclcpp::Rate performs the monitoring periodically. The loop sits inside a try-catch block so that an exception in the node does not crash the whole component container.
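The field parsing described in the bullets above could look roughly like the following. This is a sketch under assumptions, not the code in the PR: the names ProcessUsage, parse_decimal, top_memory_to_bytes and parse_top_process_line are hypothetical, the column positions assume top's default layout shown above, and the comma-to-dot replacement is an extra safeguard added here (the PR itself relies on overriding LC_NUMERIC).

```cpp
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <locale>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for the values taken from one process line.
struct ProcessUsage
{
  std::uint64_t resident_bytes;  // from RES
  float cpu_percent;             // from %CPU
  float mem_percent;             // from %MEM
};

// Parses "6.7" or "6,7" regardless of the program's own locale.
double parse_decimal(std::string text)
{
  std::replace(text.begin(), text.end(), ',', '.');
  std::istringstream stream(text);
  stream.imbue(std::locale::classic());
  double value = 0.0;
  if (!(stream >> value)) {
    throw std::invalid_argument("not a number: " + text);
  }
  return value;
}

// Converts a top memory field to bytes. Without a suffix top reports KiB;
// the suffixes m, g and t scale by further factors of 1024.
std::uint64_t top_memory_to_bytes(std::string field)
{
  if (field.empty()) {
    throw std::invalid_argument("empty memory field");
  }
  double scale = 1024.0;  // KiB by default
  const char last = field.back();
  if (std::isalpha(static_cast<unsigned char>(last))) {
    switch (last) {
      case 'm': scale *= 1024.0; break;
      case 'g': scale *= 1024.0 * 1024.0; break;
      case 't': scale *= 1024.0 * 1024.0 * 1024.0; break;
      default: throw std::invalid_argument("unknown unit suffix in: " + field);
    }
    field.pop_back();
  }
  return static_cast<std::uint64_t>(parse_decimal(field) * scale);
}

// Splits the process line on whitespace and picks RES, %CPU and %MEM.
// Assumed column order: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
ProcessUsage parse_top_process_line(const std::string & line)
{
  std::istringstream stream(line);
  std::vector<std::string> fields;
  for (std::string token; stream >> token;) {
    fields.push_back(token);
  }
  if (fields.size() < 12) {
    throw std::invalid_argument("unexpected top process line: " + line);
  }
  return ProcessUsage{
    top_memory_to_bytes(fields[5]),
    static_cast<float>(parse_decimal(fields[8])),
    static_cast<float>(parse_decimal(fields[9]))};
}
```

Applied to the sample process line above, the RES field 2,0g is read as 2 GiB and the %CPU and %MEM fields as 6.7 and 3.2.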

Interface changes

ROS Topic Changes

| Topic Name | Type | Direction | Update Description |
| --- | --- | --- | --- |
| component_system_usage | autoware_internal_msgs/SystemUsage | Publish | The system usage for component container |

A message type was created as follows (autowarefoundation/autoware_internal_msgs#12):

std_msgs/Header header

# Process identifier
uint32 pid

# CPU usage metrics
float32 cpu_usage_percentage

# Memory usage metrics for the whole system
uint64 total_memory_bytes
uint64 free_memory_bytes

# Memory usage metrics for the process
uint64 used_memory_bytes
float32 memory_usage_percentage

The current plan is to keep this message in autoware_internal_msgs.

ROS Parameter Changes

Currently, there aren't any parameters.

Effects on system behavior

This PR does not affect existing system behavior. The node is attached to an existing container and publishes a message only if there is a subscriber.

Pre-review checklist for the PR author

The PR author must check the checkboxes below when creating the PR.

In-review checklist for the PR reviewers

The PR reviewers must check the checkboxes below before approval.

  • The PR follows the pull request guidelines.
  • The PR has been properly tested.
  • The PR has been reviewed by the code owners.

Post-review checklist for the PR author

The PR author must check the checkboxes below before merging.

  • There are no open discussions, or they are tracked via tickets.
  • The PR is ready for merge.

After all checkboxes are checked, anyone who has write access can merge the PR.

@mebasoglu mebasoglu self-assigned this May 22, 2024
@mebasoglu mebasoglu force-pushed the memin/dev/system-usage branch from e53eb7a to feae4bf Compare May 22, 2024 09:07
@github-actions github-actions bot added the component:system System design and integration. (auto-assigned) label May 22, 2024
@mebasoglu mebasoglu requested a review from xmfcx May 22, 2024 09:28
@mebasoglu mebasoglu force-pushed the memin/dev/system-usage branch 2 times, most recently from c7dd825 to f53557d Compare May 23, 2024 12:46
@mebasoglu mebasoglu marked this pull request as ready for review May 23, 2024 12:58
@mebasoglu mebasoglu force-pushed the memin/dev/system-usage branch 2 times, most recently from 4cf87ab to d9e35bb Compare May 28, 2024 07:15
@mebasoglu mebasoglu marked this pull request as draft June 5, 2024 07:32
@mebasoglu mebasoglu force-pushed the memin/dev/system-usage branch 2 times, most recently from afa00f3 to 1f40ad3 Compare June 5, 2024 11:02
@mebasoglu mebasoglu marked this pull request as ready for review June 5, 2024 11:04
@xmfcx xmfcx force-pushed the memin/dev/system-usage branch from 1f40ad3 to 612c786 Compare June 5, 2024 11:22
@xmfcx xmfcx added run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) run:deploy-docs Mark for deploy-docs action generation. (used-by-ci) labels Jun 5, 2024
@xmfcx xmfcx closed this Jun 5, 2024
@xmfcx xmfcx reopened this Jun 5, 2024

xmfcx commented Jun 5, 2024

Could you add a readme file?

@xmfcx xmfcx added run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) and removed run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) labels Jun 5, 2024
@mebasoglu mebasoglu force-pushed the memin/dev/system-usage branch from 612c786 to a991b3e Compare June 5, 2024 13:48
@github-actions github-actions bot added the type:documentation Creating or refining documentation. (auto-assigned) label Jun 5, 2024
mebasoglu and others added 24 commits July 12, 2024 19:14
Signed-off-by: Mehmet Emin BAŞOĞLU <[email protected]>
Signed-off-by: M. Fatih Cırıt <[email protected]>
@xmfcx xmfcx force-pushed the memin/dev/system-usage branch from 7f920fc to 39dc9c5 Compare July 12, 2024 16:14
M. Fatih Cırıt added 2 commits July 12, 2024 19:19
Signed-off-by: M. Fatih Cırıt <[email protected]>

@xmfcx xmfcx left a comment

Thanks, I've tested it with the rosbag replay simulation and loaded with

ros2 launch autoware_launch logging_simulator.launch.xml map_path:=$HOME/autoware_map/sample-map-rosbag vehicle_model:=sample_vehicle sensor_model:=sample_sensor_kit

ros2 component load /pointcloud_container autoware_component_monitor autoware::component_monitor::ComponentMonitor -p publish_rate:=10.0 --node-namespace /pointcloud_container

ros2 topic echo /pointcloud_container/component_monitor/component_system_usage

and it works!

@xmfcx xmfcx merged commit 0a3e6d8 into autowarefoundation:main Jul 12, 2024
36 checks passed
Ariiees pushed a commit to Ariiees/autoware.universe that referenced this pull request Jul 22, 2024