Huge DirectMemory usage (Non Heap) and Threads #77

Open
Pill30 opened this issue Oct 23, 2024 · 4 comments

@Pill30

Pill30 commented Oct 23, 2024

Test

  • JMETER 5.6.2
  • JDK 21.0.4
  • http/2 (400) – No Async Controller
  • 20Mins
  • 400 Threads
  • 400 Ramp (6 Mins 40 Seconds)
  • Note: CMD LINE
  • 64GB Server
  • set HEAP=-Xms1g -Xmx30g -XX:MaxMetaspaceSize=256m
  • 10 (+-3) Second ThinkTime
  • Plugin: Latest (v2.0.5)

[screenshot]

Results:
[screenshot]

Heap Size (Extended to): ~8.9GB (Note: keeps increasing even though stable load)
Total Java Memory Used: ~28.6GB (Note: keeps increasing even though stable load)
Threads: 5317!!

This occurs with/without Async Controller
Issue discussed here by Anton Serputko (serputko): https://www.youtube.com/watch?v=SCrnKbeVXUg&t=7500s

Note: The box eventually runs out of memory and you receive the message:
[screenshot]
So it looks like there might be a memory leak as well.

########################################################################

Note: Repeating the same test with http1.1 Samplers works as expected:
[screenshot]

[screenshot]

Heap Size (Extended to): ~9.7GB (Remains consistent with stable load)
Total Java Memory Used: ~9.7GB (Remains consistent with stable load)
Threads: 423 (aligns to JMeter Threads)

@3dgiordano
Contributor

Hi @Pill30
Testing and execution over HTTP/2 is not the same as JMeter's single-threaded HTTP client execution over HTTP/1.1.

HTTP/2 allocates memory on each JMeter thread and creates additional threads to manage the connection pools and the parallel workers on each of those threads (this is what enables asynchronous, concurrent execution and connection multiplexing).

Think of each JMeter thread as an instance of a browser. Like a browser, it needs multiple threads to process the concurrent networking for each asset on a page, and if you launch 400 browsers on that machine (400 JMeter threads), that takes a lot of memory.

To generate load with HTTP/2, keep each JMeter thread alive for roughly as long as a browser would stay open, and use parallel execution with the HTTP2 Async Controller. Launch fewer JMeter threads and "tune" the parallel controller and the concurrency to simulate the parallel execution of a browser with HTTP/2 support.
HTTP/1.x runs on one thread, but HTTP/2, because of multiplexing and concurrent execution, needs a pool of threads for each simulated "user" (each JMeter thread).

If your idea is to simulate 400 concurrent HTTP executions, use fewer threads and more HTTP2 samplers inside an HTTP2 Async Controller; see the sketch below.
You can even simulate that with only 1 JMeter thread (400 concurrent HTTP/2 requests using multiplexing), and you can handle much more than that.
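
A rough sketch of that approach, with hypothetical numbers (the exact split is an assumption here; adjust it to your scenario and the server's limits):

Thread Group: 4 JMeter threads ("browsers"), each kept alive for the duration of the test
HTTP2 Async Controller inside each thread, running ~100 HTTP2 samplers concurrently
httpJettyClient.maxConcurrentAsyncInController=100

That gives roughly 4 x 100 ≈ 400 concurrent HTTP/2 requests, multiplexed over a handful of connections instead of 400 separate JMeter threads.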

As you can see in Anton's video, he can handle 1200 rps (with an old version of the plugin, an internal alpha version).

The problem with handling 1200 rps is that, in Anton's case for example, each request was for content of up to 20MB, so multiplying by 1200 rps gives 24GB/s of memory allocation that Java must handle (yes, per second). That was one of the main reasons for Anton's OOM (a large HTTP response from every request): the test allocated memory faster than the JVM could reclaim it. The GC takes some time, so you need to add some "pauses" or, at certain moments, force the GC to free memory, because in an extreme load test the JVM doesn't have time to do it on its own and an OOM can happen.

We have incorporated some changes in the most recent versions to try to avoid OOM problems caused by allocating memory too quickly, before the JVM has time to release it. It is possible to mitigate this by adding some pauses to let the GC act, but it can still happen to you if you don't press the brake.

Ideally, always calculate how much the response of an HTTP request "weighs" in memory and how long it stays alive. Also, an active listener can cause OOM problems, because JMeter keeps a lot of information in memory for you to consult later, as in a View Results Tree. The default request buffer limit on the View Results Tree tries to handle that, but those requests take memory too (and the View Results Tree listener "takes and frees" a lot of memory under heavy load).

The default heap, plus the native fast-access memory that Jetty uses, means the JVM needs to be tuned better if you want to run heavy load simulations. You need to do that, take into consideration the advice mentioned in this response, and build a test plan that is better suited to HTTP/2 execution.

Try my recommendations: use fewer threads and more async controllers. Anton's video shows you how, and also gives some tips on how to tune the JVM and the BlazeMeter JVM to run better (the vast majority of the tuning recommendations Anton presents were provided by me).

Feel free to tell us about your experience trying to set up a test plan oriented towards fewer threads and more use of the http2 async controller.

@Pill30
Author

Pill30 commented Nov 4, 2024

Thanks for the response.
FYI: As mentioned above, I get similar behavior with/without using the Async Controller.

The issue (if I understand correctly) is that it's using direct memory that is outside of the heap and as such is not subject to GC. The memory never gets released and keeps growing until you get the error "Cannot reserve 8192 bytes of direct buffer memory".

@3dgiordano
Contributor

Hi @Pill30
The plugin uses the native heap and, if that memory is not enough, falls back to the Java heap (I think since v2.0.3).

By default, JMeter ships with a memory configuration that is enough to use its UI, but it is not a configuration suitable for load testing, much less adapted to Jetty's requirements.

Jetty's implementation makes extensive use of Java NIO (the Java New I/O API), and for this reason it is necessary to adapt the JVM configuration for better memory usage. For more information about the Jetty I/O architecture, see the related documentation here.

The important part to know is that, since Java NIO is used, the direct memory must be correctly configured using the -XX:MaxDirectMemorySize=size JVM argument. For more information about the argument, see the java command documentation here.

It may also be necessary to increase the value of the -XX:MaxMetaspaceSize=size argument. Keep in mind to make the necessary adjustments if your test requires it. More information about the argument here.
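
As a minimal sketch of what that can look like when launching from the Windows command line (the sizes below are illustrative assumptions, not recommendations; size them to your machine and test, and note that jmeter.bat picks up extra arguments from the JVM_ARGS environment variable):

set HEAP=-Xms8g -Xmx8g -XX:MaxMetaspaceSize=512m
set JVM_ARGS=-XX:MaxDirectMemorySize=8g
jmeter -n -t test.jmx -l results.jtl

Here test.jmx and results.jtl are placeholder file names.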

I'll also share some fine-tuning tricks here.

By default the plugin is configured for general-purpose behavior, which should be adjusted to the requirements of your load test.
All available properties and their default values are listed here:

https://github.com/Blazemeter/jmeter-http2-plugin?tab=readme-ov-file#properties

An example of the most common JMeter properties and some possible reference values (these are the properties I use as a starting point and then fine-tune):

HTTPSampler.response_timeout=240000
httpJettyClient.idleTimeout=60000
httpJettyClient.maxBufferSize=22214621
httpJettyClient.byteBufferPoolFactor=4
httpJettyClient.maxConnectionsPerDestination=100
httpJettyClient.maxRequestsPerConnection=200
httpJettyClient.maxConcurrentAsyncInController=1000

The reasons for each assignment will be explained below.

HTTPSampler.response_timeout: maximum wait time, in milliseconds, for a request that has no timeout defined

By default there is no timeout, so if the server hangs without providing a response, the connections are never released and neither is the test.
It is advisable to set a prudent timeout so that anything anomalous is reported.
For example, if a response should never take more than a minute, set the timeout to 60000; that way you will know when this situation occurs and, importantly, you will not block the execution while a response takes minutes to arrive.
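
For instance, assuming nothing in your system should legitimately take more than a minute to respond:

HTTPSampler.response_timeout=60000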

httpJettyClient.idleTimeout: the maximum time, in milliseconds, a connection can be idle

It is always advisable not to have connections waiting for too long.
The connection pool keeps them available until another request on the same channel requires them, but keeping connections open also has resource costs.
It is recommended to assign a prudent value for the idle connection time. If you know the times when a connection should be reused, assign an appropriate timeout to allow automatic release of resources.

httpJettyClient.maxBufferSize: Maximum size of the downloaded resources in bytes

The buffer is filled dynamically, but it has a maximum limit that must be assigned.
By default, it is 20MB, but it is recommended to increase it if it is known that there may be responses that exceed this default value.

httpJettyClient.byteBufferPoolFactor: factor used when allocating memory for the HTTP client's buffer

The Jetty client uses this factor to decide how the buffer allocation will "grow".
The buffer is reused between requests, so if response sizes do not vary much, the buffer size will remain constant.
If the buffer needs to grow, Jetty divides the maximum total buffer size into parts and grows the buffer by as many parts as the request needs.
A value of 1 means the buffer is created directly at its maximum capacity, so only one reservation operation is performed.
By default the factor is 4, so Jetty takes the maximum buffer size, divides it by 4, and increases the memory reservation in steps of that resulting value.
Reserving more memory than needed also has its drawbacks, so it is advisable to test. If the maximum buffer size is known from the average requests, and the maximum is set very close to that average, a factor of 1 can minimize the number of memory reservations.
Note that a factor of 1 reserves the maximum up front, so it can cause excessive memory usage if a very high maximum buffer value was assigned because the real maximum is unknown.
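
A worked example using the reference values above (the figures are just the arithmetic implied by those values):

httpJettyClient.maxBufferSize=22214621
httpJettyClient.byteBufferPoolFactor=4

The growth step is 22214621 / 4 ≈ 5.5 MB, so the buffer grows in ~5.5 MB increments up to the ~21 MB ceiling; with a factor of 1 the full ~21 MB would be reserved in a single allocation.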

httpJettyClient.maxConnectionsPerDestination: sets the maximum number of connections to open to each destination

Each destination is a particular host, so this limit indicates, for the current thread and that destination, how many connections it will establish at most.
By default the plugin opens 1 connection, which is suitable when it is not clear what the maximum number of connections per destination should be.
Typically each host has a connection limit per source, as well as other limits per user session for example.
Try to set a maximum according to how the server under test is configured.
If it is not known, try establishing 100 connections and analyze the behavior.

httpJettyClient.maxRequestsPerConnection: sets the maximum number of requests per connection

Many web servers close a connection once a certain number of HTTP requests has been made on it.
In such cases, when this limit is reached, the server disconnects the client by sending a GOAWAY, generating a cascade of reconnections, renegotiations and re-sends of queued requests.
If you do not want to test those limits, it is recommended to set this value to the maximum the web server supports before it disconnects with a GOAWAY.
By default it is set to 100, since many old nginx servers (pre-1.19.7) are still running with 100 as their limit. Newer servers support 1000 as their maximum.
To be more certain, check your server configuration and directives. Here you can find the ones related to nginx: https://nginx.org/en/docs/http/ngx_http_v2_module.html#directives

This property applies to HTTP/1.1 as well as HTTP/2: HTTP/1.1 uses pipelining, and HTTP/2 uses multiplexing.
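
For example, assuming the server under test allows 1000 requests per connection (verify this against your server's own configuration and directives first):

httpJettyClient.maxRequestsPerConnection=1000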

httpJettyClient.maxConcurrentAsyncInController: maximum number of concurrent HTTP2 samplers inside an HTTP2 Async Controller

The recommendation here is to try to find a balance between simulation reality and limits allowed by the server.

Also take into consideration the httpJettyClient.maxRequestsPerConnection property: if there are limits on the number of requests on the backend side, you should align with them.
If, for example, a controller only makes requests to a certain host, it doesn't make sense to assign a limit above what that host allows; otherwise what will happen is a large queue of requests while reconnections occur.

A web page usually does not make more than 100 simultaneous requests for resources on the same host; that is why 100 is defined as the maximum limit, which also in some way mitigates the problems that can occur with server-side limits on the number of requests per connection.
You can try increasing or decreasing the number of "live" concurrent requests. If, for example, there are memory limitations for the thread model you are building, you can lower the concurrency of the asynchronous mechanism so that fewer requests run in parallel and consume memory.
When the limit is reached, elements are simply queued and requests are dispatched as the active ones finish.

httpJettyClient.minThreads and httpJettyClient.maxThreads: minimum and maximum number of threads per HTTP client

These control the number of threads Jetty creates in its pool when our HTTP client is instantiated, as well as the maximum it can grow to.
Increasing the number of initial threads can help in building more complex high-load test cases, and defining a maximum limit can help contain excessive resource usage.
For more information on the Jetty HTTP client thread pool model, refer to its documentation:
https://jetty.org/docs/jetty/11/programming-guide/arch/threads.html#thread-pool
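
A minimal illustration with hypothetical values (the right numbers depend entirely on your test plan, load level and hardware):

httpJettyClient.minThreads=6
httpJettyClient.maxThreads=20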


Try the example properties provided, use JVM args like -XX:MaxDirectMemorySize=size, and take into consideration the other things mentioned above.

Tell us what you can observe about this.

@Pill30
Author

Pill30 commented Nov 11, 2024

Thanks for your response...
I did some tests with changes to MaxDirectMemorySize, but as far as I understand, the default is 0, which means the JVM chooses the size automatically (with no upper limit).
For my actual load test, when I picked a value (e.g. 20GB MaxDirectMemorySize and just 8GB heap), once total memory passed 28GB (i.e. the sum of the heap and MaxDirectMemorySize), a huge number of GC cycles started occurring, consuming 100% CPU and impacting the test.

Looks like I'll have to do a lot more investigating and testing if we are to persevere.
Thanks again.

Sample jmx for reference:

Mem-Test2.jmx.txt
