Describe the bug
I need to write a massive amount of data every day (2B rows per 24 hours).
The application makes 1M requests per city (4 cities).
Every request returns 500 rows of info.
Every 4 requests I write the rows to the ClickHouse DB using clickhouse-connect -> 2k products per write.
I have 8 Celery processes; each handles its 125k-request share of the total.
Each process runs its 4 requests with asyncio.gather -> gets 4 lists of 500 rows and combines them into one 2k-row list.
Then client.insert is called.
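The batching described above can be sketched roughly like this. This is an illustrative reconstruction, not the OP's actual code: fetch_rows stands in for the real aiohttp request, and all names are invented.

```python
import asyncio

# Rough sketch of the batching pipeline described above; fetch_rows is a
# placeholder for the real HTTP request that returns 500 rows.

ROWS_PER_REQUEST = 500
REQUESTS_PER_BATCH = 4

async def fetch_rows(request_id: int) -> list:
    # Placeholder for the real aiohttp call; returns 500 fake rows
    return [(request_id, i) for i in range(ROWS_PER_REQUEST)]

async def build_batch(start_id: int) -> list:
    # 4 concurrent requests -> 4 lists of 500 rows, flattened into one 2k list
    chunks = await asyncio.gather(
        *(fetch_rows(start_id + n) for n in range(REQUESTS_PER_BATCH))
    )
    return [row for chunk in chunks for row in chunk]

batch = asyncio.run(build_batch(0))
print(len(batch))  # 2000
```

The combined 2k-row list is then what gets passed to client.insert.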
Every 10 minutes swap usage increases by 50 MB, BUT the database's swap usage always stays at the same level (70 MB).
smem shows that the Celery processes are the ones growing in swap.
I was worried that aiohttp leaks, but removing the insert and doing just the requests does not make memory grow at all,
so the problem is the insert.
Steps to reproduce
Run a process that writes something with AsyncClient.insert in a loop
Wait
See swap memory growth
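A minimal reproduction along these lines might look like the following sketch. StubClient stands in for clickhouse-connect's AsyncClient so the loop shape is runnable without a server; in the real repro the client would come from await clickhouse_connect.get_async_client(...) and the process's swap usage would be watched with smem while this runs.

```python
import asyncio

# Reproduction sketch: call insert in a loop and watch process memory.
# StubClient is a stand-in for clickhouse_connect's AsyncClient.

class StubClient:
    def __init__(self):
        self.inserted = 0

    async def insert(self, table, data, column_names=None):
        # The real AsyncClient.insert sends the batch to ClickHouse
        self.inserted += len(data)

async def repro(client, iterations: int):
    rows = [(i, 'x') for i in range(2000)]
    for _ in range(iterations):
        await client.insert('test_table', rows, column_names=['id', 'value'])

client = StubClient()
asyncio.run(repro(client, 10))
print(client.inserted)  # 20000
```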
Expected behaviour
No swap memory growth
It seems to me that the client does not clear QueryResult or QueryResultSummary properly, but I am not smart enough to figure it out.
Code example
I made a custom async context manager for getting a connection:
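The snippet itself did not survive, but such a helper typically looks something like this. This is a guess at the shape, not the OP's actual code; make_client stands in for clickhouse_connect.get_async_client (assumed API) so the sketch runs without a server.

```python
import asyncio
import contextlib

# Hypothetical reconstruction of an async context manager for getting a
# client; FakeAsyncClient substitutes for the real AsyncClient.

class FakeAsyncClient:
    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True

async def make_client():
    # Real helper: return await clickhouse_connect.get_async_client(host=...)
    return FakeAsyncClient()

@contextlib.asynccontextmanager
async def get_connect():
    client = await make_client()
    try:
        yield client
    finally:
        # Close on exit so every batch releases its connection
        await client.close()

async def demo():
    async with get_connect() as client:
        assert not client.closed  # still open inside the block
        return client

client = asyncio.run(demo())
print(client.closed)  # True
```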
I've not seen evidence of memory leaks in any of our long-running tests, so it's difficult to debug this outside of your environment. Accordingly, I'd ask you to try a couple of things to help narrow down the issue.
First, could you try a version with just the regular client? I assume it would have lower throughput, but it would help eliminate the async handling as the issue.
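The suggested synchronous variant might be sketched like this. FakeClient stands in for the regular clickhouse-connect Client so the loop is runnable here; the real version would create the client with clickhouse_connect.get_client(...) (assumed API) and keep the same 2k-row batches.

```python
# Sketch of the blocking-client variant the maintainer suggests trying.

class FakeClient:
    def __init__(self):
        self.inserted = 0

    def insert(self, table, data, column_names=None):
        self.inserted += len(data)

def run_sync_batches(batches):
    client = FakeClient()
    for rows in batches:
        # Blocking insert instead of await client.insert(...)
        client.insert('products', rows, column_names=['id', 'value'])
    return client.inserted

total = run_sync_batches([[(i, 'x') for i in range(2000)] for _ in range(3)])
print(total)  # 6000
```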
This is the request function:
And this is the saving-to-db function:
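The actual snippet did not survive extraction; this is a hedged sketch of what a save-to-db function like the one described might look like. StubClient stands in for clickhouse-connect's AsyncClient, and the table and column names are invented placeholders.

```python
import asyncio

# Hypothetical save-to-db function: one insert per combined 2k-row batch,
# as described in the issue. StubClient substitutes for AsyncClient.

class StubClient:
    def __init__(self):
        self.calls = []

    async def insert(self, table, data, column_names=None):
        self.calls.append((table, len(data)))

async def save_rows(client, rows):
    # In the real code this would be the AsyncClient from the context manager
    await client.insert('products', rows, column_names=['id', 'price'])

client = StubClient()
asyncio.run(save_rows(client, [(i, 1.0) for i in range(2000)]))
print(client.calls)  # [('products', 2000)]
```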
Yes, I tried that, and it made it worse.
Configuration
Environment
ClickHouse server
CREATE TABLE statements for tables involved: