When executing data-compaction after changing the size of a large object with the same name, the size of the object is incorrect #1191
Comments
Is there any update on this? We are considering LeoFS for our production system, but this problem could block that decision, since a maintenance task could corrupt large files on the system.
Sorry for the late reply. We'll investigate this problem in our environment.
I would like to share a simple report below. As you can see, I could not reproduce the same problem.
$ leofs-adm status
[System Confiuration]
-----------------------------------+----------
Item | Value
-----------------------------------+----------
Basic/Consistency level
-----------------------------------+----------
system version | 1.5.0
cluster Id | leofs_1
DC Id | dc_1
Total replicas | 1
number of successes of R | 1
number of successes of W | 1
number of successes of D | 1
number of rack-awareness replicas | 0
ring size | 2^128
-----------------------------------+----------
Multi DC replication settings
-----------------------------------+----------
[mdcr] max number of joinable DCs | 2
[mdcr] total replicas per a DC | 1
[mdcr] number of successes of R | 1
[mdcr] number of successes of W | 1
[mdcr] number of successes of D | 1
-----------------------------------+----------
Manager RING hash
-----------------------------------+----------
current ring-hash | 433fe365
previous ring-hash | 433fe365
-----------------------------------+----------
[State of Node(s)]
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------
type | node | state | rack id | current ring | prev ring | updated at
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------
S | [email protected] | running | | 433fe365 | 433fe365 | 2019-08-16 14:12:39 +0900
G | [email protected] | running | | 433fe365 | 433fe365 | 2019-08-16 14:12:40 +0900
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------
$ s3cmd mb s3://test
Bucket 's3://test/' created
## -----------------------------------------------------------------------------
## 1st:
## -----------------------------------------------------------------------------
$ ls -l
total 8402048
-rw-r--r-- 1 yosukehara staff 4294967295 Aug 16 13:40 test_large_obj
$ s3cmd put test_large_obj s3://test/
WARNING: Module python-magic is not available. Guessing MIME types based on file extensions.
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 1 of 274, 15MB] [1 of 1]
15728640 of 15728640 100% in 0s 111.22 MB/s done
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 2 of 274, 15MB] [1 of 1]
15728640 of 15728640 100% in 0s 114.77 MB/s done
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 3 of 274, 15MB] [1 of 1]
15728640 of 15728640 100% in 0s 117.42 MB/s done
...
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 273 of 274, 15MB] [1 of 1]
15728640 of 15728640 100% in 0s 109.85 MB/s done
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 274 of 274, 1023kB] [1 of 1]
1048575 of 1048575 100% in 0s 54.55 MB/s done
$ leofs-adm whereis test/test_large_obj
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
del? | node | ring address | size | checksum | has children | total chunks | clock | when
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
| [email protected] | 96c2599f40811964c2455a8acb9e13b4 | 4194304K | 1543662016 | true | 274 | 590350fc232bb | 2019-08-16 14:14:15 +0900
$ leofs-adm du [email protected]
active number of objects: 1094
total number of objects: 1095
active size of objects: 4295142443
total size of objects: 4295142631
ratio of active size: 100.0%
last compaction start: ____-__-__ __:__:__
last compaction end: ____-__-__ __:__:__
## -----------------------------------------------------------------------------
## 2nd:
## -----------------------------------------------------------------------------
$ s3cmd put test_large_obj s3://test/
WARNING: Module python-magic is not available. Guessing MIME types based on file extensions.
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 1 of 274, 15MB] [1 of 1]
15728640 of 15728640 100% in 0s 119.46 MB/s done
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 2 of 274, 15MB] [1 of 1]
15728640 of 15728640 100% in 0s 108.23 MB/s done
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 3 of 274, 15MB] [1 of 1]
15728640 of 15728640 100% in 0s 114.88 MB/s done
...
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 273 of 274, 15MB] [1 of 1]
15728640 of 15728640 100% in 0s 113.18 MB/s done
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 274 of 274, 1023kB] [1 of 1]
1048575 of 1048575 100% in 0s 67.50 MB/s done
$ leofs-adm whereis test/test_large_obj
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
del? | node | ring address | size | checksum | has children | total chunks | clock | when
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
| [email protected] | 96c2599f40811964c2455a8acb9e13b4 | 4194304K | 1543662016 | true | 274 | 590351a2c9a29 | 2019-08-16 14:17:10 +0900
$ leofs-adm du [email protected]
active number of objects: 1094
total number of objects: 2190
active size of objects: 4295142443
total size of objects: 8590285262
ratio of active size: 50.0%
last compaction start: ____-__-__ __:__:__
last compaction end: ____-__-__ __:__:__
## -----------------------------------------------------------------------------
## 3rd:
## -----------------------------------------------------------------------------
$ ls -l
total 10487808
-rw-r--r-- 1 yosukehara staff 5368709120 Aug 16 14:03 test_large_obj
$ s3cmd put test_large_obj s3://test/
WARNING: Module python-magic is not available. Guessing MIME types based on file extensions.
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 1 of 342, 15MB] [1 of 1]
15728640 of 15728640 100% in 0s 103.92 MB/s done
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 2 of 342, 15MB] [1 of 1]
15728640 of 15728640 100% in 0s 114.93 MB/s done
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 3 of 342, 15MB] [1 of 1]
15728640 of 15728640 100% in 0s 112.41 MB/s done
...
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 341 of 342, 15MB] [1 of 1]
15728640 of 15728640 100% in 0s 104.51 MB/s done
upload: 'test_large_obj' -> 's3://test/test_large_obj' [part 342 of 342, 5MB] [1 of 1]
5242880 of 5242880 100% in 0s 72.06 MB/s done
$ leofs-adm whereis test/test_large_obj
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
del? | node | ring address | size | checksum | has children | total chunks | clock | when
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
| [email protected] | 96c2599f40811964c2455a8acb9e13b4 | 5242880K | 1605e929ac | true | 342 | 59035244ab8ad | 2019-08-16 14:20:00 +0900
$ leofs-adm du [email protected]
active number of objects: 1366
total number of objects: 3557
active size of objects: 5368927924
total size of objects: 13959213374
ratio of active size: 38.46%
last compaction start: ____-__-__ __:__:__
last compaction end: ____-__-__ __:__:__
## -----------------------------------------------------------------------------
## data-compaction
## -----------------------------------------------------------------------------
$ leofs-adm compact-start [email protected] all
OK
$ leofs-adm compact-status [email protected]
current status: idling
last compaction start: 2019-08-16 14:22:12 +0900
total targets: 8
# of pending targets: 8
# of ongoing targets: 0
# of out of targets : 0
$ leofs-adm du [email protected]
active number of objects: 1366
total number of objects: 1366
active size of objects: 5368927924
total size of objects: 5368927924
ratio of active size: 100.0%
last compaction start: 2019-08-16 14:22:23 +0900
last compaction end: 2019-08-16 14:22:28 +0900
$ leofs-adm whereis test/test_large_obj
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
del? | node | ring address | size | checksum | has children | total chunks | clock | when
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
| [email protected] | 96c2599f40811964c2455a8acb9e13b4 | 1048576K | 1605e929ac | true | 342 | 59035244ab8ad | 2019-08-16 14:20:00 +0900
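As a sanity check on the uploads in this transcript, the part counts reported by s3cmd can be recomputed from the file sizes. This is only a sketch: the 15728640-byte part size is read off the s3cmd output above, and the function name `multipart_layout` is hypothetical, not part of s3cmd or LeoFS.

```python
# Part size as shown in the s3cmd output above (15 MiB = 15728640 bytes).
PART_SIZE = 15 * 1024 * 1024


def multipart_layout(size_bytes: int, part_size: int = PART_SIZE) -> tuple[int, int]:
    """Return (number_of_parts, size_of_last_part) for a multipart upload."""
    if size_bytes <= 0 or part_size <= 0:
        raise ValueError("size_bytes and part_size must be positive")
    full_parts, remainder = divmod(size_bytes, part_size)
    if remainder == 0:
        # The object divides evenly, so the last part is a full part.
        return full_parts, part_size
    return full_parts + 1, remainder


if __name__ == "__main__":
    # File sizes taken from the `ls -l` output in the 1st and 3rd runs.
    for size in (4294967295, 5368709120):
        parts, last = multipart_layout(size)
        print(f"{size} bytes -> {parts} parts, last part {last} bytes")
```

For the two file sizes in the report this gives 274 and 342 parts, which agrees with both the s3cmd progress lines and the "total chunks" column of `leofs-adm whereis`, so the uploads themselves appear complete.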
In your 3rd report, the size column changes from 5242880K to 1048576K after compaction, but the checksum does not change.
$ leofs-adm whereis test/test_large_obj
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
del? | node | ring address | size | checksum | has children | total chunks | clock | when
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
| [email protected] | 96c2599f40811964c2455a8acb9e13b4 | 5242880K | 1605e929ac | true | 342 | 59035244ab8ad | 2019-08-16 14:20:00 +0900
## -----------------------------------------------------------------------------
## data-compaction
## -----------------------------------------------------------------------------
$ leofs-adm compact-start [email protected] all
OK
$ leofs-adm whereis test/test_large_obj
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
del? | node | ring address | size | checksum | has children | total chunks | clock | when
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
| [email protected] | 96c2599f40811964c2455a8acb9e13b4 | 1048576K | 1605e929ac | true | 342 | 59035244ab8ad | 2019-08-16 14:20:00 +0900
The following is correct:
There is no issue in the "not change" case in our environment.
But the problem was reproduced in our environment too. In conclusion: When executing
Just in case, I would like to share the result of data-compaction after overwriting an object with the same name and size.
$ ./leofs-adm status
[System Confiuration]
-----------------------------------+----------
Item | Value
-----------------------------------+----------
Basic/Consistency level
-----------------------------------+----------
system version | 1.5.0
cluster Id | leofs_1
DC Id | dc_1
Total replicas | 1
number of successes of R | 1
number of successes of W | 1
number of successes of D | 1
number of rack-awareness replicas | 0
ring size | 2^128
-----------------------------------+----------
Multi DC replication settings
-----------------------------------+----------
[mdcr] max number of joinable DCs | 2
[mdcr] total replicas per a DC | 1
[mdcr] number of successes of R | 1
[mdcr] number of successes of W | 1
[mdcr] number of successes of D | 1
-----------------------------------+----------
Manager RING hash
-----------------------------------+----------
current ring-hash | 433fe365
previous ring-hash | 433fe365
-----------------------------------+----------
[State of Node(s)]
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------
type | node | state | rack id | current ring | prev ring | updated at
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------
S | [email protected] | running | | 433fe365 | 433fe365 | 2019-08-17 08:52:52 +0900
G | [email protected] | running | | 433fe365 | 433fe365 | 2019-08-17 08:52:54 +0900
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------
$ leofs-adm whereis test/test_large_obj
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
del? | node | ring address | size | checksum | has children | total chunks | clock | when
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
| [email protected] | 96c2599f40811964c2455a8acb9e13b4 | 4194304K | 1543662016 | true | 274 | 59044b487d36f | 2019-08-17 08:54:04 +0900
# after 2nd
$ leofs-adm du [email protected]
active number of objects: 1094
total number of objects: 2190
active size of objects: 4295142443
total size of objects: 8590285262
ratio of active size: 50.0%
last compaction start: ____-__-__ __:__:__
last compaction end: ____-__-__ __:__:__
$ leofs-adm whereis test/test_large_obj
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
del? | node | ring address | size | checksum | has children | total chunks | clock | when
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
| [email protected] | 96c2599f40811964c2455a8acb9e13b4 | 4194304K | 1543662016 | true | 274 | 59044c03cb6f5 | 2019-08-17 08:57:20 +0900
# After data-compaction
$ leofs-adm compact-start [email protected] all
OK
$ leofs-adm du [email protected]
active number of objects: 1094
total number of objects: 1094
active size of objects: 4295142443
total size of objects: 4295142443
ratio of active size: 100.0%
last compaction start: 2019-08-17 08:59:09 +0900
last compaction end: 2019-08-17 08:59:12 +0900
$ leofs-adm whereis test/test_large_obj
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
del? | node | ring address | size | checksum | has children | total chunks | clock | when
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
| [email protected] | 96c2599f40811964c2455a8acb9e13b4 | 4194304K | 1543662016 | true | 274 | 59044c03cb6f5 | 2019-08-17 08:57:20 +0900
Expected Behavior
After compaction, the object size must be the same as before compaction.
Actual Behavior
After compaction completes:
An object of 4294967295 bytes does not change
An object of 4294967296 bytes becomes 0 bytes
An object of 5368709120 bytes becomes 1073741824 bytes
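The three results listed above are consistent with the size being reduced modulo 2^32, as would happen if the value passed through an unsigned 32-bit field at some point during compaction. That is an assumption drawn only from the numbers in this report, not a confirmed root cause, and the helper name `wrap_to_32_bits` is hypothetical. A minimal sketch of the arithmetic:

```python
# Assumption: the size is truncated to an unsigned 32-bit value.
# This is inferred from the reported numbers only, not from the LeoFS source.
UINT32_MODULUS = 1 << 32  # 4294967296


def wrap_to_32_bits(size_bytes: int) -> int:
    """Return the size as it would read back from an unsigned 32-bit field."""
    return size_bytes % UINT32_MODULUS


if __name__ == "__main__":
    # Object sizes taken from the "Actual Behavior" list.
    for original in (4294967295, 4294967296, 5368709120):
        print(f"{original} -> {wrap_to_32_bits(original)}")
```

The same arithmetic matches the `leofs-adm whereis` output in the comments: 5242880K is 5368709120 bytes, and 1073741824 bytes is 1048576K.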
Steps to Reproduce the Problem
leofs-adm compact-start <node> all