As mentioned in the previous commit, "safe links" is on by default: any symlink that is invalid or that points outside of the archived files will not be stored. To store such symlinks, "--no-safe-links" must be specified. This commit implements "safe links" for v1 of the file format.
File Format
Note that any unused bytes/bits should be zeroed-out before being written.
Format Version 0
File extension is "*.simplearchive"
First 18 bytes of file will be:
SIMPLE_ARCHIVE_VER
The next 2 bytes are a 16-bit unsigned integer "version" in big-endian. In this case, it will be zero.
Next 4 bytes are bit-flags.
- The first byte
- The first bit is set if de/compressor is set for this archive.
The remaining unused flags are reserved for future revisions and are currently ignored.
If the previous "de/compressor is set" flag is enabled, then the next section is added:
- 2 bytes are a 16-bit unsigned integer in big-endian: the length of "compressor cmd+args". This does not include the NULL at the end of the string.
- X bytes of "compressor cmd+args" (length defined by previous value). It is a NULL-terminated string.
- 2 bytes are a 16-bit unsigned integer in big-endian: the length of "decompressor cmd+args". This does not include the NULL at the end of the string.
- X bytes of "decompressor cmd+args" (length defined by previous value). It is a NULL-terminated string.
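The header layout described so far can be sketched as a small reader. This is only an illustration, not the project's implementation: the function names are hypothetical, and it rests on two assumptions the text does not settle, namely that "the first bit" is the least-significant bit of the byte and that the NULL terminator is stored as one extra byte that the length field does not count.

```python
import io
import struct

MAGIC = b"SIMPLE_ARCHIVE_VER"  # the 18 ASCII bytes that open every archive


def read_exact(stream, n):
    """Read exactly n bytes or raise if the stream ends early."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of archive")
    return data


def read_string(stream):
    """Read a 16-bit big-endian length followed by the string bytes.

    ASSUMPTION: the NULL terminator is stored as one extra byte that the
    length field does not count.
    """
    (length,) = struct.unpack(">H", read_exact(stream, 2))
    raw = read_exact(stream, length + 1)
    if raw[-1] != 0:
        raise ValueError("string is not NULL-terminated")
    return raw[:-1].decode("utf-8")


def parse_header(stream):
    """Parse magic, version, archive flags and optional de/compressor."""
    if read_exact(stream, 18) != MAGIC:
        raise ValueError("not a simple archive")
    (version,) = struct.unpack(">H", read_exact(stream, 2))
    flags = read_exact(stream, 4)
    header = {"version": version, "compressor": None, "decompressor": None}
    # ASSUMPTION: "the first bit" is the least-significant bit (0x01).
    if flags[0] & 0x01:
        header["compressor"] = read_string(stream)
        header["decompressor"] = read_string(stream)
    return header
```

Calling `parse_header(io.BytesIO(data))` on the start of an archive returns the version and, when the flag is set, both command strings; the stream is then positioned at the "file count" field.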
The next 4 bytes are a 32-bit unsigned integer "file count" in big-endian, which indicates the number of files in this archive.
Following the file-count bytes, the following bytes are added for each file:
- 2 bytes are a 16-bit unsigned integer "filename length" in big-endian. This does not include the NULL at the end of the string.
- X bytes of filename (length defined by previous value). It is a NULL-terminated string.
- 4 bytes bit-flags
- The first byte
- The first bit is set if the file is a symbolic link.
- The second bit is "user read permission".
- The third bit is "user write permission".
- The fourth bit is "user execute permission".
- The fifth bit is "group read permission".
- The sixth bit is "group write permission".
- The seventh bit is "group execute permission".
- The eighth bit is "other read permission".
- The second byte.
- The first bit is "other write permission".
- The second bit is "other execute permission".
- The third bit is UNSET if relative links are preferred, and is SET if absolute links are preferred.
- The fourth bit is set if this file/symlink entry is invalid and must be skipped. When it is set, nothing else is stored for this entry after these 4 bytes of bit-flags: the fields described below must not exist, and a reader should continue with the next entry.
- The third byte.
- Currently unused.
- The fourth byte.
- Currently unused.
- If this file is a symbolic link:
- 2 bytes are a 16-bit unsigned integer in big-endian: the length of "link target absolute path". This does not include the NULL at the end of the string.
- X bytes of link-target-absolute-path (length defined by previous value). It is a NULL-terminated string. If the previous length value is 0, then this entry does not exist and should be skipped.
- 2 bytes are a 16-bit unsigned integer in big-endian: the length of "link target relative path". This does not include the NULL at the end of the string.
- X bytes of link-target-relative-path (length defined by previous value). It is a NULL-terminated string. If the previous length value is 0, then this entry does not exist and should be skipped.
- If this file is NOT a symbolic link:
- 8 bytes are a 64-bit unsigned integer "size of this file's data in this archive file" in big-endian.
- X bytes file data (length defined by previous value).
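As a rough sketch of how one version 0 entry could be read, the following decodes the 4 bytes of bit-flags into a POSIX-style mode and then follows either the symlink or the regular-file branch. The names are hypothetical. It assumes that "the first bit" is the least-significant bit of a byte and that each non-empty string is followed by one stored NULL byte that is not counted in its length.

```python
import io
import struct

# Permission bits of the v0 per-file flags as
# (byte index, bit mask, POSIX mode bit).
# ASSUMPTION: the "first bit" of a byte is its least-significant bit (0x01).
V0_PERMS = [
    (0, 0x02, 0o400), (0, 0x04, 0o200), (0, 0x08, 0o100),
    (0, 0x10, 0o040), (0, 0x20, 0o020), (0, 0x40, 0o010),
    (0, 0x80, 0o004), (1, 0x01, 0o002), (1, 0x02, 0o001),
]


def _read(stream, n):
    """Read exactly n bytes or raise if the stream ends early."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of archive")
    return data


def _read_string(stream, allow_missing=False):
    """Read a 16-bit big-endian length, then the string bytes."""
    (length,) = struct.unpack(">H", _read(stream, 2))
    if length == 0 and allow_missing:
        return None  # zero length: the string itself is not stored
    raw = _read(stream, length + 1)  # ASSUMPTION: trailing NULL is stored
    return raw[:-1].decode("utf-8")


def decode_v0_flags(flags):
    """Turn the 4 flag bytes into a small dictionary."""
    mode = 0
    for byte_index, mask, perm in V0_PERMS:
        if flags[byte_index] & mask:
            mode |= perm
    return {
        "is_symlink": bool(flags[0] & 0x01),
        "mode": mode,
        "prefer_absolute": bool(flags[1] & 0x04),
        "invalid": bool(flags[1] & 0x08),
    }


def parse_v0_entry(stream):
    """Parse one file or symlink entry of a version 0 archive."""
    entry = {"name": _read_string(stream)}
    entry.update(decode_v0_flags(_read(stream, 4)))
    if entry["invalid"]:
        return entry  # nothing else is stored for an invalid entry
    if entry["is_symlink"]:
        entry["target_abs"] = _read_string(stream, allow_missing=True)
        entry["target_rel"] = _read_string(stream, allow_missing=True)
    else:
        (size,) = struct.unpack(">Q", _read(stream, 8))
        entry["data"] = _read(stream, size)  # bytes exactly as stored
    return entry
```

The `data` field holds the bytes exactly as they appear in the archive; if a de/compressor is set, turning them back into the original file content is a separate step that this sketch does not cover.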
Format Version 1
File extension is "*.simplearchive" but this isn't really checked.
First 18 bytes of file will be (in ASCII):
SIMPLE_ARCHIVE_VER
The next 2 bytes are a 16-bit unsigned integer "version" in big-endian. It will be:
0x00 0x01
Next 4 bytes are bit-flags.
- The first byte
- The first bit is set if de/compressor is set for this archive.
The remaining unused flags in the previous bit-flags bytes are reserved for future revisions and are currently ignored.
If the previous "de/compressor is set" flag is enabled, then the next section is added:
- 2 bytes are a 16-bit unsigned integer in big-endian: the length of "compressor cmd+args". This does not include the NULL at the end of the string.
- X bytes of "compressor cmd+args" (length defined by previous value). It is a NULL-terminated string.
- 2 bytes are a 16-bit unsigned integer in big-endian: the length of "decompressor cmd+args". This does not include the NULL at the end of the string.
- X bytes of "decompressor cmd+args" (length defined by previous value). It is a NULL-terminated string.
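From the writing side, the version 1 header could be assembled roughly as follows. This is a sketch with hypothetical names, not the archiver's own code. It assumes that "the first bit" is the least-significant bit and that the NULL terminator is written as one extra byte after each string without being counted in the length.

```python
import struct

MAGIC = b"SIMPLE_ARCHIVE_VER"  # 18 ASCII bytes


def pack_string(text):
    """Encode a string as a 16-bit big-endian length plus its bytes.

    ASSUMPTION: the NULL terminator is written after the string and is
    not counted in the length field.
    """
    raw = text.encode("utf-8")
    if len(raw) > 0xFFFF:
        raise ValueError("string does not fit a 16-bit length")
    return struct.pack(">H", len(raw)) + raw + b"\x00"


def build_v1_header(compressor=None, decompressor=None):
    """Return the bytes of a version 1 header up to the link count."""
    if (compressor is None) != (decompressor is None):
        raise ValueError("compressor and decompressor go together")
    flags = bytearray(4)  # unused bits and bytes stay zeroed
    if compressor is not None:
        # ASSUMPTION: "the first bit" is the least-significant bit.
        flags[0] |= 0x01
    out = bytearray(MAGIC)
    out += struct.pack(">H", 1)  # version bytes 0x00 0x01
    out += flags
    if compressor is not None:
        out += pack_string(compressor)
        out += pack_string(decompressor)
    return bytes(out)
```

Without a de/compressor the result is always 24 bytes long: 18 for the identifier, 2 for the version and 4 for the flags.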
The next 4 bytes are a 32-bit unsigned integer "link count" in big-endian, which indicates the number of symbolic links in this archive.
Following the link-count bytes, the following bytes are added for each symlink:
- 2 bytes bit-flags:
- The first byte.
- The first bit is UNSET if relative links are preferred, and is SET if absolute links are preferred.
- The second bit is "user read permission".
- The third bit is "user write permission".
- The fourth bit is "user execute permission".
- The fifth bit is "group read permission".
- The sixth bit is "group write permission".
- The seventh bit is "group execute permission".
- The eighth bit is "other read permission".
- The second byte.
- The first bit is "other write permission".
- The second bit is "other execute permission".
- The third bit is set if this entry is marked invalid. The link name will be preserved in this entry, but the following link target paths will be set to zero-length and will not be stored.
- 2 bytes are a 16-bit unsigned integer in big-endian: the length of "link name". This does not include the NULL at the end of the string. Must not be zero.
- X bytes of link-name (length defined by previous value). It is a NULL-terminated string.
- 2 bytes are a 16-bit unsigned integer in big-endian: the length of "link target absolute path". This does not include the NULL at the end of the string.
- X bytes of link-target-absolute-path (length defined by previous value). It is a NULL-terminated string. If the previous length value is 0, then this entry does not exist and should be skipped.
- 2 bytes are a 16-bit unsigned integer in big-endian: the length of "link target relative path". This does not include the NULL at the end of the string.
- X bytes of link-target-relative-path (length defined by previous value). It is a NULL-terminated string. If the previous length value is 0, then this entry does not exist and should be skipped.
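A version 1 symlink entry could be read along these lines. This is a sketch with hypothetical names, under several assumptions: "the first bit" is the least-significant bit, the invalid marker is the third bit of the second flag byte, and every non-empty path is followed by one stored NULL byte that its length does not count.

```python
import io
import struct

# Permission bits of the 2-byte v1 symlink flags as
# (byte index, bit mask, POSIX mode bit).
# ASSUMPTION: the "first bit" of a byte is its least-significant bit (0x01).
SYMLINK_PERMS = [
    (0, 0x02, 0o400), (0, 0x04, 0o200), (0, 0x08, 0o100),
    (0, 0x10, 0o040), (0, 0x20, 0o020), (0, 0x40, 0o010),
    (0, 0x80, 0o004), (1, 0x01, 0o002), (1, 0x02, 0o001),
]


def _read(stream, n):
    """Read exactly n bytes or raise if the stream ends early."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of archive")
    return data


def _read_path(stream):
    """Read a length-prefixed path; a zero length means "not stored"."""
    (length,) = struct.unpack(">H", _read(stream, 2))
    if length == 0:
        return None
    raw = _read(stream, length + 1)  # ASSUMPTION: trailing NULL is stored
    return raw[:-1].decode("utf-8")


def parse_v1_symlink(stream):
    """Parse one symlink entry of a version 1 archive."""
    flags = _read(stream, 2)
    mode = 0
    for byte_index, mask, perm in SYMLINK_PERMS:
        if flags[byte_index] & mask:
            mode |= perm
    name = _read_path(stream)
    if name is None:
        raise ValueError("link name must not be zero-length")
    target_abs = _read_path(stream)
    target_rel = _read_path(stream)
    return {
        "name": name,
        "mode": mode,
        "prefer_absolute": bool(flags[0] & 0x01),
        # ASSUMPTION: the invalid marker is the third bit of byte two.
        "invalid": bool(flags[1] & 0x04),
        "target_abs": target_abs,
        "target_rel": target_rel,
    }
```

An entry marked invalid still carries its link name, so it parses the same way; both targets simply come back as `None` because their lengths are zero.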
After the symlink-related data, the next 4 bytes are a 32-bit unsigned integer "chunk count" in big-endian, which indicates the number of chunks in this archive.
Following the chunk-count bytes, the following bytes are added for each chunk:
- 4 bytes that are a 32-bit unsigned integer "file count" in big-endian.
The following bytes are added for each file within the current chunk:
- 2 bytes that are a 16-bit unsigned integer "filename length" in big-endian. This does not include the NULL at the end of the string.
- X bytes of filename (length defined by previous value). Is a NULL-terminated string.
- 4 bytes bit-flags.
- The first byte.
- The first bit is "user read permission".
- The second bit is "user write permission".
- The third bit is "user execute permission".
- The fourth bit is "group read permission".
- The fifth bit is "group write permission".
- The sixth bit is "group execute permission".
- The seventh bit is "other read permission".
- The eighth bit is "other write permission".
- The second byte.
- The first bit is "other execute permission".
- The third byte.
- Currently unused.
- The fourth byte.
- Currently unused.
- Two 4-byte unsigned integers in big-endian for UID and GID.
- A 32-bit unsigned integer in big-endian that specifies the UID of the file. During extraction, this value is ignored if the user is not root.
- A 32-bit unsigned integer in big-endian that specifies the GID of the file. During extraction, this value is ignored if the user is not root.
- A 64-bit unsigned integer in big-endian for the "size of file".
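The per-file metadata inside a chunk has a fixed-size tail (flags, UID, GID, size) after the filename, so it can be unpacked in one step. The sketch below uses hypothetical names and assumes that "the first bit" is the least-significant bit and that the filename is followed by one stored NULL byte not counted in its length.

```python
import io
import struct

# Permission bits of the v1 per-file flags as
# (byte index, bit mask, POSIX mode bit).
# ASSUMPTION: the "first bit" of a byte is its least-significant bit (0x01).
V1_FILE_PERMS = [
    (0, 0x01, 0o400), (0, 0x02, 0o200), (0, 0x04, 0o100),
    (0, 0x08, 0o040), (0, 0x10, 0o020), (0, 0x20, 0o010),
    (0, 0x40, 0o004), (0, 0x80, 0o002), (1, 0x01, 0o001),
]

# 4 flag bytes, 32-bit UID, 32-bit GID, 64-bit size, all big-endian.
FIXED_PART = struct.Struct(">4sIIQ")


def _read(stream, n):
    """Read exactly n bytes or raise if the stream ends early."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of archive")
    return data


def parse_v1_file_meta(stream):
    """Parse the metadata of one file listed in a version 1 chunk."""
    (name_len,) = struct.unpack(">H", _read(stream, 2))
    raw_name = _read(stream, name_len + 1)  # ASSUMPTION: NULL is stored
    flags, uid, gid, size = FIXED_PART.unpack(_read(stream, FIXED_PART.size))
    mode = 0
    for byte_index, mask, perm in V1_FILE_PERMS:
        if flags[byte_index] & mask:
            mode |= perm
    return {
        "name": raw_name[:-1].decode("utf-8"),
        "mode": mode,
        "uid": uid,
        "gid": gid,
        "size": size,
    }
```

Note that the permission bits start at the first bit here, whereas in the symlink flags the first bit is the relative/absolute preference and the permissions start at the second bit.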
After the files' metadata are the current chunk's data:
- A 64-bit unsigned integer in big endian for the "size of chunk".
- X bytes of data for the current chunk of the previously specified size. If not using de/compressor, this section is the previously mentioned files concatenated with each other. If using de/compressor, this section is the previously mentioned files concatenated and compressed into a single blob of data.
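Once the file sizes of a chunk are known, the chunk data can be cut back into individual files. The following is a minimal sketch with hypothetical names. It assumes that each "size of file" is the uncompressed size and that the files appear in the chunk in the same order as their metadata. The `decompress` callable is a stand-in for piping the blob through the archive's decompressor command.

```python
import io
import struct


def _read(stream, n):
    """Read exactly n bytes or raise if the stream ends early."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of archive")
    return data


def split_chunk(stream, file_sizes, decompress=None):
    """Read one chunk and return the data of each file it contains.

    file_sizes: the "size of file" values from the chunk's metadata,
    ASSUMED to be uncompressed sizes in storage order.
    decompress: optional callable (bytes -> bytes), a stand-in for the
    external decompressor command.
    """
    (chunk_size,) = struct.unpack(">Q", _read(stream, 8))
    blob = _read(stream, chunk_size)
    if decompress is not None:
        blob = decompress(blob)
    if len(blob) != sum(file_sizes):
        raise ValueError("chunk data does not match the listed file sizes")
    files = []
    offset = 0
    for size in file_sizes:
        files.append(blob[offset:offset + size])
        offset += size
    return files
```

For example, a chunk whose size field is 5 and whose data is `abcde`, with listed file sizes 2 and 3, splits into `ab` and `cde`. Reading the whole chunk into memory keeps the sketch short; a real extractor would more likely stream the data.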