Fix FileSystem closed after pr 23 #54

carrey-feng · 2021-12-23T12:48:00Z

Some queries fail with a FileSystem closed exception when I run a load test with JMeter.
My environment:
presto: 0.265.1
presto-hadoop-apache2: 2.7.4-9
hadoop (with RBF and Kerberos): 3.2.1
hive: 1.2.1
jmeter (running 100 threads): 5.3

Query error information:

I suspect this problem is caused by how the PrestoFileSystemCache class handles private credentials, so I added some logging to the fileSystemRefresh code block of PrestoFileSystemCache.getInternal(), like this:

I also added a log statement just before the FileSystem is closed.

Running the JMeter script again produced these logs:

As the logs above show, three FileSystem instances were created within one second, two of them by FileSystemRefresh. Two of those FileSystems were then closed within the following second, which leads to the FileSystem closed exception: one second is too short, so a FileSystem is still in use by running queries at the moment it is closed.

Multiple FileSystem refreshes occur because, at the beginning, the newly acquired private credentials always contain more entries than those stored in the cached FileSystemHolder. So when deciding whether a FileSystem refresh is needed, we should replace equals() with containsAll(), like this:
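To illustrate the difference between the two checks, here is a minimal, self-contained sketch. It is not the actual PrestoFileSystemCache code: the class name, the method names, and the use of plain strings in place of credential objects are all assumptions for illustration. It also assumes the relaxed check asks whether the newly acquired set contains every cached credential.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the refresh decision; strings replace real credential objects.
public class CredentialRefreshCheck {
    // Strict check: any difference between the two sets triggers a refresh.
    static boolean needsRefreshWithEquals(Set<String> cached, Set<String> current) {
        return !cached.equals(current);
    }

    // Relaxed check: refresh only if some cached credential is missing from the current set.
    static boolean needsRefreshWithContainsAll(Set<String> cached, Set<String> current) {
        return !current.containsAll(cached);
    }

    public static void main(String[] args) {
        // The cached holder has one credential; the newly acquired set has one more.
        Set<String> cached = new HashSet<>(Arrays.asList("kerberos-ticket"));
        Set<String> current = new HashSet<>(Arrays.asList("kerberos-ticket", "delegation-token"));

        System.out.println(needsRefreshWithEquals(cached, current));      // true: sets differ
        System.out.println(needsRefreshWithContainsAll(cached, current)); // false: superset is accepted
    }
}
```

With the strict check, a superset of the cached credentials still counts as a change and forces a refresh. With the relaxed check, it does not.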

After replacing equals() with containsAll(), I ran the JMeter script again: only one FileSystem was created and no FileSystem closed exception occurred. However, I don't think the problem is completely solved, because FinalizerService makes the time at which a FileSystem is closed uncontrollable. After a FileSystemHolder is removed from the cache map, JVM GC may cause the FileSystem to be closed immediately, and the FileSystem closed exception would occur again.

So I think closing the FileSystem should be delayed by a configurable amount of time. I added a config, presto.hdfs.expired.fs.delay.close.time, to control how long to wait before closing a FileSystem after it is removed from PrestoFileSystemCache.map. The default value is 300000 ms (5 minutes). The logs show that 2 of the 3 FileSystems were closed after 5 minutes.
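The delayed close could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: DelayedCloser and closeLater are hypothetical names, and the sketch works on java.io.Closeable (which Hadoop's FileSystem implements) so that it stays self-contained.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: closes an evicted resource only after a fixed delay.
public class DelayedCloser {
    private final ScheduledExecutorService executor;
    private final long delayMillis;

    public DelayedCloser(long delayMillis) {
        this.delayMillis = delayMillis;
        // Daemon thread so that pending closes never block JVM shutdown.
        this.executor = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "delayed-fs-closer");
            thread.setDaemon(true);
            return thread;
        });
    }

    // Schedule close() to run once, delayMillis after this call.
    public ScheduledFuture<?> closeLater(Closeable resource) {
        return executor.schedule(() -> {
            try {
                resource.close();
            } catch (IOException e) {
                System.err.println("Failed to close evicted resource: " + e);
            }
        }, delayMillis, TimeUnit.MILLISECONDS);
    }

    public void shutdown() {
        executor.shutdownNow();
    }
}
```

In this sketch, the cache would call closeLater(fileSystem) at the point where the holder is removed from the map, with the delay taken from the configured value (300000 ms by default). Queries that already hold a reference then get a grace period before the FileSystem is closed.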

linux-foundation-easycla · 2021-12-23T12:48:03Z

❌ The commit (e05073d) is missing the User's ID, preventing the EasyCLA check. Consult GitHub Help to resolve. For further assistance with EasyCLA, please submit a support request ticket.

Fix FileSystem closed after pr 23

e05073d
