
Java EMA/ETA ReactorFactory possible memory leak #290

Open
xanddol opened this issue Oct 23, 2024 · 3 comments

Comments


xanddol commented Oct 23, 2024

Hello everyone!

We are using the EMA client library, rtsdk.version = 3.8.0.0, in our application together with the Spring Boot framework, and the application is packaged in Docker.
During operation we discovered that over time the application consumes more and more memory, which eventually leads to OutOfMemoryError and application restarts in Kubernetes.
The basic scenario for working with the application is as follows:

1) A user logs in to the application
2) The user subscribes to several hundred instruments
3) The user receives streaming updates on these instruments for a certain period, for example several minutes or hours
4) The user then logs out
5) Several hundred such users work simultaneously
6) For each user, an OmmConsumer is created on login, and consumer.uninitialize() is called after the work is finished
7) After the run, we performed a full GC and took a heap dump. It showed that some objects are still in the heap. See picture

Analysis and debugging show that the class com.refinitiv.eta.valueadd.reactor.ReactorFactory contains many static fields, such as static VaPool _wlStreamPool = new VaPool(true) and static VaPool _wlRequestPool = new VaPool(true), which easily survive Java GC. In the heap dump (see picture) we also see that a Watchlist object was created for each user; when consumer.uninitialize() is called, it ends up in _watchlistPool here: com.refinitiv.eta.valueadd.reactor.Watchlist#returnToPool. There can be several hundred such Watchlist objects, each occupying about 2 MB of heap. So after the end of the trading day there are no connected users any more, but the objects remain in memory. Over time, the pools in com.refinitiv.eta.valueadd.reactor.ReactorFactory grow to occupy more than 1 GB of heap, which eventually leads to an OutOfMemoryError in the application.
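The retention pattern described above can be reproduced in isolation. The sketch below is a minimal stand-in, not the actual RTSDK classes: a static pool to which objects are returned stays strongly reachable via the static field, so it grows to peak concurrent demand and survives every GC cycle (the payload here is scaled down from the ~2 MB Watchlist for a safe demo).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal model of the pattern described above (NOT the real RTSDK classes):
// a static pool keeps every returned object strongly reachable, so memory
// grows to peak demand and is never released, even after a full GC.
public class UnboundedPoolDemo {

    // Scaled-down stand-in for a ~2 MB Watchlist object.
    static class Watchlist {
        final byte[] payload = new byte[64 * 1024];
    }

    // Analogue of ReactorFactory's static VaPool fields.
    static final Deque<Watchlist> POOL = new ArrayDeque<>();

    static Watchlist checkout() {
        Watchlist w = POOL.poll();
        return (w != null) ? w : new Watchlist();
    }

    static void returnToPool(Watchlist w) {
        POOL.push(w); // kept forever: strongly reachable via the static field
    }

    public static void main(String[] args) {
        // Simulate 300 concurrent users, then all of them logging out.
        Watchlist[] active = new Watchlist[300];
        for (int i = 0; i < active.length; i++) active[i] = checkout();
        for (Watchlist w : active) returnToPool(w);

        System.gc(); // pooled objects are still reachable and survive the GC
        System.out.println("pooled objects after logout: " + POOL.size());
    }
}
```

With the real ~2 MB objects, the same 300-user peak would pin roughly 600 MB of heap after everyone has logged out.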

The problem is reproducible in the latest version rtsdk.version = 3.8.2.0 as well.
(pic2: heap dump screenshot)

@ViktorYelizarov
Contributor

@xanddol Thank you for bringing this issue to our attention! We created an internal Jira to investigate it.

@ViktorYelizarov
Contributor

@xanddol
After investigation we found that the described scenario may result in increased memory usage over time, because memory grows to meet peak demand from many concurrent users and/or large watchlists per user. The root cause is that we store all created watchlist objects in a global pool for possible future reuse, to minimize the influence of GC on latency. The problem you are encountering is that the pool grows to meet peak demand and has no mechanism to shrink again.

Proposal: We can provide a configurable and dynamically alterable limit beyond which objects will not be pooled.
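To illustrate the idea (this is a hypothetical sketch, not the actual RTSDK API or planned implementation): a pool with a runtime-alterable limit simply drops returned objects once it is full, so the excess becomes eligible for GC instead of being retained forever.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a pool with a configurable, dynamically alterable
// limit beyond which objects are not pooled (not the actual RTSDK code).
public class BoundedPool<T> {

    private final Deque<T> pool = new ArrayDeque<>();
    private final AtomicInteger limit;

    public BoundedPool(int initialLimit) {
        this.limit = new AtomicInteger(initialLimit);
    }

    // Dynamically alterable limit; shrinking it releases excess objects.
    public void setLimit(int newLimit) {
        limit.set(newLimit);
        synchronized (pool) {
            while (pool.size() > newLimit) {
                pool.pop(); // excess objects become garbage-collectable
            }
        }
    }

    public T poll() {
        synchronized (pool) {
            return pool.poll();
        }
    }

    public void returnToPool(T obj) {
        synchronized (pool) {
            if (pool.size() < limit.get()) {
                pool.push(obj);
            }
            // else: not pooled; the object can be garbage-collected
        }
    }

    public int size() {
        synchronized (pool) {
            return pool.size();
        }
    }

    public static void main(String[] args) {
        BoundedPool<byte[]> pool = new BoundedPool<>(10);
        for (int i = 0; i < 300; i++) {
            pool.returnToPool(new byte[1024]); // peak of 300 returns
        }
        System.out.println("pooled after peak: " + pool.size());

        pool.setLimit(2); // shrink at runtime, e.g. after trading hours
        System.out.println("pooled after shrink: " + pool.size());
    }
}
```

The trade-off is the usual one: a lower limit reduces retained heap but means more allocations (and GC pressure) at the next peak.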

As a workaround (which you may already be doing), we suggest increasing the memory heap size to account for your peak activity. Note that the 'out of memory' issue must be addressed by allocating adequate memory to handle your expected peak usage.
Or is there a different concern to explore? Under what circumstances are you running out of memory?
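For reference, the heap ceiling is controlled with standard JVM flags; the values below are placeholders to be sized from your own heap dumps, and `app.jar` stands in for your application.

```shell
# Illustrative only: raise the JVM heap ceiling to cover peak pooling.
java -Xmx4g -jar app.jar

# In a container (as in the Docker/Kubernetes setup above), sizing the heap
# relative to the container memory limit is often more convenient:
java -XX:MaxRAMPercentage=75.0 -jar app.jar
```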

@xanddol
Author

xanddol commented Oct 29, 2024

@ViktorYelizarov

> Proposal: We can provide a configurable and dynamically alterable limit beyond which objects will not be pooled.

Your proposal of a configurable, dynamically changeable limit beyond which objects are not pooled works for us. It is also the more flexible option, since each user of the library can then decide for themselves how large the pool should be.

> As a workaround (which you may already be doing), we suggest increasing the memory heap size to account for your peak activity. Note that the 'out of memory' issue must be addressed by allocating adequate memory to handle your expected peak usage. Or is there a different concern to explore? Under what circumstances are you running out of memory?

Right, we have already increased the heap size limits as a workaround. Our only concern is that we may have hundreds or thousands of users every day (potentially tens of thousands), so the maximum allocated memory will go unused much of the time, which is costly from a business point of view. We look forward to the permanent solution described in your proposal.
