Create file format for format version 1
All checks were successful
Run Unit Tests / build-and-run-unit-tests (push) Successful in 5s
This is in preparation for improving compression by concatenating files together before compressing them, which reduces the per-file overhead. Discussed in #18.
parent 70415c6caf
commit 9128fc9aa7
1 changed file with 102 additions and 0 deletions
file_format.md
@@ -76,3 +76,105 @@ Following the file-count bytes, the following bytes are added for each file:
1. 8 bytes 64-bit unsigned integer "size of filename in this archive file"
   in big-endian.
2. X bytes file data (length defined by previous value).

## Format Version 1

File extension is "*.simplearchive" but this isn't really checked.

The first 18 bytes of the file will be (in ASCII):

    SIMPLE_ARCHIVE_VER

The next 2 bytes are a 16-bit unsigned integer "version" in big-endian. It will
be:

    0x00 0x01

The next 4 bytes are bit-flags.

1. The first byte
    1. The first bit is set if de/compressor is set for this archive.

The remaining unused flags in the previous bit-flags bytes are reserved for
future revisions and are currently ignored.
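
The fixed-size header described above can be read with a few lines of Python. This is only a minimal sketch, not the project's implementation: the function name `parse_header` and the returned dictionary keys are hypothetical, and it assumes that "the first bit" means the least-significant bit of the byte, which the text does not state.

```python
import struct

MAGIC = b"SIMPLE_ARCHIVE_VER"  # the 18 ASCII bytes listed in the spec


def parse_header(buf: bytes) -> dict:
    """Parse the magic, the version, and the 4 bit-flag bytes (hypothetical helper)."""
    if len(buf) < 24:
        raise ValueError("header needs at least 18 + 2 + 4 = 24 bytes")
    if buf[:18] != MAGIC:
        raise ValueError("not a simplearchive file: bad magic")
    # 16-bit unsigned integer, big-endian
    (version,) = struct.unpack(">H", buf[18:20])
    flags = buf[20:24]
    # ASSUMPTION: "first bit" is the least-significant bit of the first byte.
    has_compressor = bool(flags[0] & 0x01)
    # Remaining flag bits are reserved and ignored, as the spec says.
    return {"version": version, "has_compressor": has_compressor, "offset": 24}


sample = MAGIC + bytes([0x00, 0x01]) + bytes([0x01, 0x00, 0x00, 0x00])
print(parse_header(sample))
```

The returned `offset` is where the next section (the optional de/compressor strings, or the link count) would begin.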

If the previous "de/compressor is set" flag is enabled, then the next section is
added:

1. 2 bytes: a 16-bit unsigned integer in big-endian giving the length of
   "compressor cmd+args". This does not include the NULL at the end of the
   string.
2. X bytes of "compressor cmd+args" (length defined by previous value). Is a
   NULL-terminated string.
3. 2 bytes: a 16-bit unsigned integer in big-endian giving the length of
   "decompressor cmd+args". This does not include the NULL at the end of the
   string.
4. X bytes of "decompressor cmd+args" (length defined by previous value). Is a
   NULL-terminated string.
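
A sketch of writing and reading these length-prefixed strings follows. It rests on two assumptions that the text does not confirm: that the NULL terminator is stored on disk immediately after the X counted bytes (so X + 1 bytes follow the length), and that the strings are UTF-8. The helper names `pack_cstr16` and `read_cstr16` and the example commands are hypothetical.

```python
import struct
from typing import Tuple


def pack_cstr16(text: str) -> bytes:
    """Encode a string as: 16-bit big-endian length, the bytes, then NULL."""
    data = text.encode("utf-8")  # ASSUMPTION: UTF-8 encoding
    if len(data) > 0xFFFF:
        raise ValueError("string too long for a 16-bit length")
    # ASSUMPTION: the NULL is written after the counted bytes.
    return struct.pack(">H", len(data)) + data + b"\x00"


def read_cstr16(buf: bytes, offset: int) -> Tuple[str, int]:
    """Return (string, offset just past the NULL terminator)."""
    (length,) = struct.unpack_from(">H", buf, offset)
    start = offset + 2
    end = start + length
    if end >= len(buf) or buf[end] != 0:
        raise ValueError("missing NULL terminator")
    return buf[start:end].decode("utf-8"), end + 1


# Example commands are placeholders, not values taken from the spec.
section = pack_cstr16("gzip") + pack_cstr16("gzip -d")
compressor, pos = read_cstr16(section, 0)
decompressor, pos = read_cstr16(section, pos)
print(compressor, "|", decompressor)
```

If the terminator turns out not to be stored, only the `+ b"\x00"` in the writer and the `end + 1` in the reader would need to change.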

The next 4 bytes are a 32-bit unsigned integer "link count" in big-endian, which
indicates the number of symbolic links in this archive.

Following the link-count bytes, the following bytes are added for each symlink:

1. 2 bytes bit-flags:
    1. The first byte.
        1. The first bit is UNSET if relative links are preferred, and is SET if
           absolute links are preferred.
    2. The second byte.
        1. Currently unused.
2. 2 bytes: a 16-bit unsigned integer in big-endian giving the length of
   "link target absolute path". This does not include the NULL at the end of
   the string.
3. X bytes of link-target-absolute-path (length defined by previous value).
   Is a NULL-terminated string. If the previous length value is 0, then
   this entry does not exist and should be skipped.
4. 2 bytes: a 16-bit unsigned integer in big-endian giving the length of
   "link target relative path". This does not include the NULL at the end of
   the string.
5. X bytes of link-target-relative-path (length defined by previous value).
   Is a NULL-terminated string. If the previous length value is 0, then
   this entry does not exist and should be skipped.
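
The per-symlink layout above could be parsed roughly as follows. This sketch makes the same unconfirmed assumptions as a reader would have to: the "first bit" is the least-significant bit, a non-empty path is followed on disk by its NULL terminator, a zero length means no path bytes at all follow, and paths are UTF-8. The names `read_optional_path` and `read_symlink` are hypothetical.

```python
import struct


def read_optional_path(buf: bytes, offset: int):
    """Read a 16-bit length and, if non-zero, the NULL-terminated path."""
    (length,) = struct.unpack_from(">H", buf, offset)
    offset += 2
    if length == 0:
        # Entry does not exist; ASSUMPTION: no bytes (not even NULL) follow.
        return None, offset
    end = offset + length
    if end >= len(buf) or buf[end] != 0:
        raise ValueError("missing NULL terminator")
    return buf[offset:end].decode("utf-8"), end + 1


def read_symlink(buf: bytes, offset: int):
    """Parse one symlink entry; returns (entry dict, next offset)."""
    flags = buf[offset:offset + 2]
    if len(flags) < 2:
        raise ValueError("truncated symlink flags")
    # ASSUMPTION: "first bit" is the least-significant bit of the first byte.
    prefer_absolute = bool(flags[0] & 0x01)
    offset += 2
    absolute, offset = read_optional_path(buf, offset)
    relative, offset = read_optional_path(buf, offset)
    entry = {
        "prefer_absolute": prefer_absolute,
        "absolute": absolute,
        "relative": relative,
    }
    return entry, offset


# Relative-only link: flags 0x0000, absolute length 0, relative "../target".
raw = b"\x00\x00" + b"\x00\x00" + b"\x00\x09" + b"../target" + b"\x00"
entry, next_offset = read_symlink(raw, 0)
print(entry, next_offset)
```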

After the symlink-related data, the next 4 bytes are a 32-bit unsigned integer
"chunk count" in big-endian, which indicates the number of chunks in this
archive.

Following the chunk-count bytes, the following bytes are added for each chunk:

1. 2 bytes that are a 16-bit unsigned integer "file count" in big-endian.

The following bytes are added for each file within each chunk:

1. 2 bytes that are a 16-bit unsigned integer "filename length" in big-endian.
   This does not include the NULL at the end of the string.
2. X bytes of filename (length defined by previous value). Is a NULL-terminated
   string.
3. 4 bytes bit-flags.
    1. The first byte.
        1. The first bit is "user read permission".
        2. The second bit is "user write permission".
        3. The third bit is "user execute permission".
        4. The fourth bit is "group read permission".
        5. The fifth bit is "group write permission".
        6. The sixth bit is "group execute permission".
        7. The seventh bit is "other read permission".
        8. The eighth bit is "other write permission".
    2. The second byte.
        1. The first bit is "other execute permission".
    3. The third byte.
        1. Currently unused.
    4. The fourth byte.
        1. Currently unused.
4. Two 4-byte unsigned integers in big-endian for UID and GID.
    1. A 32-bit unsigned integer in big-endian that specifies the UID of the
       file. Note that during extraction, if the user is not root, then this
       value will be ignored.
    2. A 32-bit unsigned integer in big-endian that specifies the GID of the
       file. Note that during extraction, if the user is not root, then this
       value will be ignored.
5. A 64-bit unsigned integer in big-endian for the "size of file".
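
The nine permission bits listed above map directly onto the usual Unix mode bits. The sketch below converts between the 4 flag bytes and a mode value using Python's `stat` constants. It assumes "first bit" means the least-significant bit of each byte (not stated in the text); `flags_to_mode` and `mode_to_flags` are hypothetical names.

```python
import stat

# Order follows the spec: bits 1-8 of the first byte, then bit 1 of the second.
_PERMISSION_ORDER = [
    stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR,
    stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP,
    stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH,
]


def flags_to_mode(flags: bytes) -> int:
    """Convert the 4 bit-flag bytes into a Unix permission mode."""
    if len(flags) != 4:
        raise ValueError("expected exactly 4 flag bytes")
    # ASSUMPTION: "first bit" is the least-significant bit of each byte.
    bits = flags[0] | (flags[1] << 8)
    mode = 0
    for index, permission in enumerate(_PERMISSION_ORDER):
        if bits & (1 << index):
            mode |= permission
    return mode


def mode_to_flags(mode: int) -> bytes:
    """Convert a Unix permission mode into the 4 bit-flag bytes."""
    bits = 0
    for index, permission in enumerate(_PERMISSION_ORDER):
        if mode & permission:
            bits |= 1 << index
    # Third and fourth bytes are currently unused, so they are written as 0.
    return bytes([bits & 0xFF, (bits >> 8) & 0xFF, 0, 0])


print(mode_to_flags(0o644).hex())
print(oct(flags_to_mode(mode_to_flags(0o755))))
```

Note that "other execute" is the only permission that spills into the second byte, so a reader that inspects only the first byte would silently drop it.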

After the files' metadata comes the current chunk's data:

1. A 64-bit unsigned integer in big-endian for the "size of chunk".
2. X bytes of data for the current chunk of the previously specified size. If
   not using de/compressor, this section is the previously mentioned files
   concatenated with each other. If using de/compressor, this section is the
   previously mentioned files concatenated and compressed into a single blob of
   data.
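
For the case without a de/compressor, recovering the individual files from a chunk amounts to slicing the chunk data by each file's "size of file" value, in metadata order. The sketch below illustrates that idea only; `read_chunk` and `split_chunk` are hypothetical names. For a compressed chunk, the data would first have to be passed through the archive's decompressor command, which is not shown here.

```python
import struct
from typing import List, Tuple


def read_chunk(buf: bytes, offset: int) -> Tuple[bytes, int]:
    """Read the 64-bit big-endian "size of chunk" and that many data bytes."""
    (size,) = struct.unpack_from(">Q", buf, offset)
    start = offset + 8
    end = start + size
    if end > len(buf):
        raise ValueError("truncated chunk data")
    return buf[start:end], end


def split_chunk(data: bytes, sizes: List[int]) -> List[bytes]:
    """Slice uncompressed chunk data into files using the per-file sizes."""
    if sum(sizes) != len(data):
        raise ValueError("file sizes do not add up to the chunk size")
    files = []
    position = 0
    for size in sizes:
        files.append(data[position:position + size])
        position += size
    return files


# Two files of 5 and 3 bytes concatenated into one 8-byte chunk.
raw = struct.pack(">Q", 8) + b"hello" + b"abc"
data, next_offset = read_chunk(raw, 0)
print(split_chunk(data, [5, 3]))
```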