forked from fdupoux/fsarchiver
-
Notifications
You must be signed in to change notification settings - Fork 0
/
FILEFORMAT
179 lines (167 loc) · 10 KB
/
FILEFORMAT
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
=====================================================================
fsarchiver: Filesystem Archiver for Linux [http://www.fsarchiver.org]
=====================================================================
About the file format
---------------------
The file format is made of two sort of structures: headers and
data-blocks. The headers are all a dictionnary where the key is an
integer. That way we can add new things in headers in next versions
without having to break the file format.
Global archive layout
---------------------
1) All fsarchiver archives are made of one or multiple volumes. Each
volume starts with an FSA_MAGIC_VOLH header and terminates with a
FSA_MAGIC_VOLF header. These headers are used to determine that the
archive is the fsarchiver format, and which volume it is.
2) The beginning of the archive (which is always at the beginning of
the first volume) contains an archive header (FSA_MAGIC_MAIN)
where se store information such as the archive creation date. Then
we write all the filesystem information header (FSA_MAGIC_FSIN).
It stores things such as the filesystem type (ext3, reiserfs, ...),
and all its attributes (label, uuid, space used, ...). Some of these
attributes are specific to a particular filesystem. For instance, it
saves the compatibility flags of ext2/ext3/ext4 filesystem so that
fsarchiver can recreates a filesystem with the same properties.
3) Then the archive contains all the filesystems contents:
there is an header that marks the beginning of a filesystem data
(FSA_MAGIC_FSYB) and another header is used to mark its end
(FSA_MAGIC_DATF). The filesystem contents is a list of object
headers (FSA_MAGIC_OBJT) and data-blocks.
4) The archive splitting mechanism will never split the archive in the
middle of an archive or datablock. An in-memory structure called
s_writebuf has been created to store all the data which are written
to the archive. All the bytes that are part of an header / block
are first accumulated in a s_writebuf. When it's ready, it's passed
to the write-to-archive function in charge of the file splitting.
That way an s_writebuf is like an atomic item that cannot be splitted.
The consequence it that it's not possible to respect exactly the
volume size specified by the user, it will always be a bit smaller.
About regular files management
------------------------------
Creating a normal tar.gz file is like compressing a tar file. It
means if there is a corruption in the tar.gz file, we can't ungzip
it and then we can untar it. Just a single corruption in one byte
at the beginning of the file make it unreadable.
For this reason fsarchiver is working the other way. It's like
a gz.tar: we archive files that have already been compressed before.
It provides a better safety for your data: if one byte if corrupt,
we just ignore the current file and we can extract the next one.
There is a problem with that: in case there are a lot of small files,
like in the linux-sources-2.6, we compress each small files in a
single data block of only few KB, which produces a bad compression
ratio compared to compression of a big tar archive of all the small
files.
For that reason, all the small files are processed in a different
way in fsarchiver-0.4.4 and later: it creates a set of small files
that can fit in a single data block (256KB by default) and then
it compresses a block which is quite big compared to the size of
a single file.
There is a threshold which is used to know whether a regular file
will be considered as a small file (sharing a data block with other
small files) or if it will be considered as a normal/big one having
its own data blocks. This threshold depends of the size of a data
block, which depends itself on the algirithm choosen to compress
the data. In general, a data block is 256KB (it can be bigger)
and the threshold will be 256KB/4 = 64KB. Empty files (size=0) are
saved in a single regular file (as large files) and they have no
footer (there is no need to have an md5 checksum for empty files)
How files are stored in the archive
-----------------------------------
There are many sort of objects in a filesystem:
1) special files (symbolic links, hard links, directories, ...)
all these files just have one header in the archive. It contains
all their attributes (name, permissions, xattr, time stats, ...)
2) normal/big regular files (bigger than the threshold) [REGFILE ]
these files have their header first (FSA_MAGIC_OBJT) with the
file attributes, and then one or several data blocks (depending
on how big the file is), and then a file footer (FSA_MAGIC_FILF)
with the md5sum of the contents for files which are not empty.
The md5sum can't be written in the object header (before the
data blocks) because it would require to read the file twice:
first pass to compute the checksum, and a second pass to copy
the contents the data blocks. (think about a very large file,
say 5GB, which is written to an fsa archive which is split into
small volumes, say 100MB)
3) small regular files (smaller than the threshold) [REGFILEM]
small files are written to the archive when we have a full set of
small files, or at the end of the savefs/savedir operation.
during the savefs, all the small files are stored in a structure
called s_regmulti. When this structure is full, it's copied to
the archive and a new one is created. So we end up with the
following data in the archive: we first write one object header
(FSA_MAGIC_OBJT) for each small file in the s_regmulti. This
headers contains extra keys (not found in object-header for normal
files): DISKITEMKEY_MD5SUM, DISKITEMKEY_MULTIFILESCOUNT,
DISKITEMKEY_MULTIFILESOFFSET. It's the md5sum for the file (we
can compute it before it's written to disk because the files are
quite small, and it saves an extra FSA_MAGIC_FILF footer), the
number of small-files to expect in the current set (used at the
extraction to know what read), and the offset of the data
for the current file in the shrared data block. After the individual
small-files headers, we write a single shared data-block (which is
compressed and may be encrypted as any other data block). There
is no header/footer after the shared data lock in the archive.
About datablocks
----------------
Each data block which is written to the archive has its own header
(FSA_MAGIC_BLKH). This header stores block attributes: original
block size, compressed size, compression algorithm used,
encryption algorithm used, offset of the first byte in the file, ...
When both compression and encryption are used, the compression is
done first. It's more efficient to process that way because the
encryption algorithm will have less things to do because the
compressed block is smaller. When the compression makes a block
bigger than the original one, fsarchiver automatically ignores
the compressed version and keeps the uncompressed block.
About endianess
---------------
fsarchiver should be endianess safe. All the integers are converted
to little-endian before they are written to the disk. So it's
neutral on intel platforms, and it requires a conversion on all the
big-endian platforms. The only things that have to be converted are
the integers which are part of the headers. These conversions don't
pollute the source code since they are all converted in the functions
that manage the headers: write-header-to-disk and read-header-from-disk
functions. All the higer level functions in the code just have to
add a number to the header, they don't have to worry about endianess.
About headers
-------------
All the information which is not the files contents itself (meta-data)
are stored in a generic management system called headers. It's not just
a normal C structure which is written to disk. A specific object
oriented mechanism has been written for that. The headers are quite
high-level data structures that are reused a lot in the sources. They
provide endianess management, checksumming, and extensibility. Since
the data are stored as a dictinnary, where the keys are 16bit integers,
it will be possible to extend the headers in the future by adding new
keys to add new information in the next versions. These headers are
just implemented as s_dico in the program, and the magic and checksum
are added when we write it to the archive file.
Here is how an header is stored in an archive:
- 32bit magic string used to identify the type of header it is
- 32bit archive id (random number generated suring the savefs / savedir)
- 16bit filesystem id (which filesystem of the archive it belongs to)
- dictionary with its own internal data (length, 32bit checksum, ...)
The random 32bit archive id is used to make sure the program won't
be disturbed by nested archive in case the archive is corrupt. When an
archive is corrupt, we can skip data and search for the next header
using the known magic strings. In case we have nested archives, we
could find the magic string that belongs to a nested archive. It won't
be used since we also compare the random 32bit archive id, and the id
of the nested archive header will be different so the program will know
that it has to ignore that header and continue to search.
About checksumming
------------------
Almost everything in the archive is checksummed to make sure the program
will never silently restore corrupt data. The headers that contains all
the meta-data (file names, permissions, attributes, ...) are checksummed
using a 32bit fletcher32 checksum. The files contents are written in
the archive as blocks of few hundreds kilo-bytes. Each block is compressed
(except if the compression makes the block bigger) and it may also be
encrypted (if the user provided a password). Each block also has its own
32bit fletcher32 checksum. All the files where size>0 also have an
individual md5 checksum that makes sure the whole file is exactly the
same as the original one (the blocks are checksummed but it allows to
make sure we did not drop one of the block of a file for instance).
Because of the md5 checksum, we can be sure that the program is aware
of the corruption if it happens.