
Buffer Size

JHaller27 edited this page Apr 18, 2019 · 1 revision

What buffer size should I choose?

After a not-so-careful analysis (three tests), the rate of record insertion, measured end to end from reading the file to the call to database.collection.insert_many(list_of_records), was found to fall off roughly as the buffer size raised to the power 0.7, i.e. somewhat faster than the inverse square root of the buffer size.
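As a rough illustration of what "buffer size" means here, the sketch below collects records into a list and flushes it with insert_many whenever it reaches the chosen size. The function name insert_buffered and the stand-in class RecordingCollection are hypothetical and not taken from this project; the only thing assumed about the real collection is that it has an insert_many method that accepts a list of records.

```python
from typing import Any, Dict, Iterable, List


class RecordingCollection:
    """Hypothetical stand-in for a MongoDB collection; it only records batches."""

    def __init__(self) -> None:
        self.batches: List[List[Dict]] = []

    def insert_many(self, documents: List[Dict]) -> None:
        # Store a copy of each batch so the caller can inspect batch sizes later.
        self.batches.append(list(documents))


def insert_buffered(collection: Any, records: Iterable[Dict], buffer_size: int) -> int:
    """Insert records in batches of buffer_size; return the number of insert_many calls."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    buffer: List[Dict] = []
    calls = 0
    for record in records:
        buffer.append(record)
        if len(buffer) >= buffer_size:
            collection.insert_many(buffer)  # flush a full buffer
            calls += 1
            buffer = []
    if buffer:
        collection.insert_many(buffer)  # flush the final, partially filled buffer
        calls += 1
    return calls


if __name__ == "__main__":
    demo = RecordingCollection()
    n_calls = insert_buffered(demo, ({"n": i} for i in range(25)), buffer_size=10)
    print(n_calls, [len(batch) for batch in demo.batches])
```

With PyMongo, a real collection object (for example one obtained from a MongoClient) could be passed in place of the stand-in, since it also provides insert_many.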

Plotting the data in the Data table below (x = buffer size, y = insertion rate in rec/s) and fitting a power-law trend line gives y = 134.4 x^(-0.693), with r^2 = 0.9889.
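The trend line can be reproduced with an ordinary least-squares fit on the logarithms of both columns. The sketch below is one way to do that with only the standard library; it assumes the fit was done in log-log space (as spreadsheet power trend lines usually are), and the function name fit_power_law is made up for this example.

```python
import math

# (buffer size, estimated insertion rate in rec/s) from the Data table
DATA = [(1, 121.923), (10, 33.0886), (100, 5.00152)]


def fit_power_law(points):
    """Fit y = a * x**b by least squares on (ln x, ln y); return (a, b, r_squared)."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope in log-log space = exponent
    ln_a = mean_y - b * mean_x         # intercept in log-log space = ln(a)
    ss_res = sum((y - (ln_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.exp(ln_a), b, 1.0 - ss_res / ss_tot


a, b, r2 = fit_power_law(DATA)
print(f"y = {a:.1f} * x^({b:.3f}), r^2 = {r2:.4f}")
```

With only three data points the fit is indicative at best, so the exponent should be read as a rough estimate.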

Thus, the insertion rate increases as the buffer size shrinks. The smallest possible buffer size is 1, so a buffer size of 1 is recommended. Note that these tests were run against a localhost MongoDB instance: the additional cost of talking to an external MongoDB instance (for example, network latency) may reduce the effect of the buffer size.

Data

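| Buffer size | Records inserted | Elapsed time [s] | Est. insertion rate [rec/s] |
|-------------|------------------|------------------|-----------------------------|
| 1           | 1984             | 16.274           | 121.923                     |
| 10          | 1100             | 33.244           | 33.0886                     |
| 100         | 1000             | 199.939          | 5.00152                     |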
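The last column appears to be the number of records inserted divided by the elapsed time. A minimal check of that arithmetic, assuming this is how the rate was estimated (the names ROWS and insertion_rate are illustrative only):

```python
# (buffer size, records inserted, elapsed time in seconds) from the table above
ROWS = [(1, 1984, 16.274), (10, 1100, 33.244), (100, 1000, 199.939)]


def insertion_rate(records: int, seconds: float) -> float:
    """Estimated insertion rate in records per second."""
    return records / seconds


rates = {size: insertion_rate(records, seconds) for size, records, seconds in ROWS}
for size, rate in rates.items():
    print(f"buffer size {size}: {rate:.3f} rec/s")
```

The recomputed values agree with the table to within a few hundredths of a record per second.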