Message-ID: <87ik9wzumr.fsf@gmail.com>
Date: Sat, 11 Apr 2026 21:10:20 -0700
From: Collin Funk <collin.funk1@...il.com>
To: oss-security@...ts.openwall.com
Cc: Vahagn Vardanian <vahagn@...rays.io>, Paul Eggert <eggert@...ucla.edu>
Subject: Re: GNU tar: listing/extraction desynchronization allows hidden file injection
Solar Designer <solar@...nwall.com> writes:
>> === BUG #03 (HIGH) ===
>> Signed integer overflow in PAX v0.0/v0.1 sparse offset+numbytes
>>
>> Files: src/xheader.c:1426 (sparse_numbytes_decoder)
>> src/xheader.c:1480 (sparse_map_decoder)
>> Impact: Signed 64-bit integer overflow in offset+numbytes arithmetic.
>> Undefined behavior that can corrupt internal state.
>>
>> Description:
>>
>> The PAX v1.0 decoder in sparse.c already checks:
>>
>> if (INT_ADD_OVERFLOW (sp.offset, u))
>> ...
>>
>> But the PAX v0.0 and v0.1 decoders in xheader.c do not. A crafted
>> extended header with offset=9223372036854775800 and numbytes=100
>> passes individual range checks but overflows signed off_t when summed.
>>
>> UBSan output:
>>
>> xheader.c:1473: runtime error: signed integer overflow:
>> 9223372036854775800 + 100 cannot be represented in type 'long int'
>>
>> Proposed fix:
>>
>> Add the same INT_ADD_OVERFLOW check after assigning numbytes in both
>> sparse_numbytes_decoder() and sparse_map_decoder(), and return early
>> with an error if the check fires.
Both of the functions they reference check that the value fits in off_t.
A commit from 16 years ago shows the relevant code [1], but even before
that, overflows were checked.
>> === BUG #04 (LOW) ===
>> FLEXNSIZEOF overflow in create_placeholder_file
>>
>> File: src/extract.c:1444
>> Impact: Theoretical heap buffer overflow if link_name is near SIZE_MAX.
>>
>> Description:
>>
>> create_placeholder_file() computes:
>>
>> xmalloc (FLEXNSIZEOF (struct delayed_link, target,
>> strlen (current_stat_info.link_name) + 1));
>>
>> If strlen(link_name) is near SIZE_MAX, adding 1 overflows size_t, and
>> FLEXNSIZEOF computes a small allocation size. The subsequent strcpy
>> writes past the buffer. PAX extended headers allow arbitrarily long
>> linkpath values, so the attacker controls the length. In practice
>> this requires the system to have nearly SIZE_MAX bytes of memory
>> available, so it is mostly theoretical.
>>
>> Proposed fix:
>>
>> size_t link_len = strlen (current_stat_info.link_name);
>> if (SIZE_MAX - sizeof (struct delayed_link) <= link_len)
>> xalloc_die ();
I think it is safe to assume that no one has SIZE_MAX bytes of memory.
>> === BUG #09 (MEDIUM) ===
>> strcmp overread past the non-NUL-terminated magic field
>>
>> File: src/list.c:565, 632 (read_header, decode_header)
>> Impact: Buffer overread past the 6-byte magic field boundary.
>> Format misdetection with crafted headers.
>>
>> Description:
>>
>> strcmp (h->magic, TMAGIC) // line 565
>> strcmp (header->header.magic, TMAGIC) // line 632
>>
>> TMAGIC is "ustar\0" (6 bytes including NUL). The magic field in the
>> header struct is exactly 6 bytes (char magic[6]). If the field
>> contains "ustar\x01" (no NUL terminator), strcmp reads past byte 5
>> into the adjacent version[2] and uname[32] fields until it finds a NUL.
>>
>> ASan (with tight heap allocator) reports:
>>
>> ERROR: AddressSanitizer: heap-buffer-overflow
>> READ of size 1 ... in __strcmp_sse42
>>
>> Proposed fix:
>>
>> memcmp (h->magic, TMAGIC, sizeof TMAGIC)
This is just copied from Paul's commit in 2025 [2].
>> === BUG #10 (MEDIUM) ===
>> Incomplete destructor for delayed_link_table hash entries
>>
>> File: src/extract.c:1478 (create_placeholder_file)
>> Impact: Memory leak on hash collision/replacement.
>>
>> Description:
>>
>> hash_initialize (0, 0, dl_hash, dl_compare, free)
>>
>> The destructor callback is just free(), which frees the top-level
>> struct delayed_link but not its sub-allocations: the sources linked
>> list, cntx_name, acls_a_ptr, acls_d_ptr, and xattr_map. When a hash
>> collision causes an old entry to be evicted, these are leaked.
>>
>> Proposed fix:
>>
>> Write a proper destructor:
>>
>> static void
>> free_delayed_link (void *entry)
>> {
>> struct delayed_link *p = entry;
>> struct string_list *s, *next;
>> for (s = p->sources; s; s = next)
>> {
>> next = s->next;
>> free (s);
>> }
>> free (p->cntx_name);
>> free (p->acls_a_ptr);
>> free (p->acls_d_ptr);
>> xattr_map_free (&p->xattr_map);
>> free (p);
>> }
>>
>> And pass free_delayed_link instead of free to hash_initialize.
GNU tar never calls hash_free, and therefore will never call the free
function given to hash_initialize. It is explained in a comment and the
commit message [3]:
if (false)
{
/* There is little point to freeing, as we are about to exit,
and freeing is more likely to cause than cure trouble. */
hash_free (delayed_link_table);
delayed_link_table = NULL;
}
>> === BUG #13 (LOW) ===
>> Integer truncation in decode_record
>>
>> File: src/xheader.c:630 (decode_record)
>> Impact: Implicit narrowing from ptrdiff_t to int; UBSan
>> -fsanitize=implicit-conversion fires.
>>
>> Description:
>>
>> int len_len = len_lim - p; // line 630
>>
>> len_lim - p is ptrdiff_t. If the difference exceeds INT_MAX, the
>> implicit conversion to int wraps to a negative value. The value is
>> only used to format an error message, so the practical impact is
>> limited to a garbled diagnostic, but UBSan rightfully flags it.
>>
>> Proposed fix:
>>
>> int len_len = (int) (len_lim - p < 1000 ? len_lim - p : 1000);
>>
>> This clamps the value to a reasonable range for the error message.
This proposal was just copied from Paul's commit in 2024 [4].
>> === SUMMARY TABLE ===
>>
>> # Severity File Function / Line Short description
>> -- -------- ----------- --------------------------- --------------------------
>> 01 CRITICAL sparse.c:593 sparse_extract_file Archive stream desync -> file injection
>> 14 CRITICAL (same root cause as #01) Full RCE chain demo
>> 02 HIGH sparse.c:803 oldgnu_add_sparse Overlapping sparse regions -> corruption
>> 03 HIGH xheader.c sparse_numbytes/map_decoder Missing INT_ADD_OVERFLOW
>> 04 LOW extract.c:1444 create_placeholder_file FLEXNSIZEOF size_t overflow
>> 05 MEDIUM extract.c:566 delay_set_stat Memory leak on reuse path
>> 06 MEDIUM extract.c:996 apply_nonancestor_delayed.. Shallow xattr_map copy (UAF)
>> 07 MEDIUM extract.c:985 apply_nonancestor_delayed.. Uninitialized tar_stat_info
>> 08 MEDIUM extract.c:1898 apply_delayed_link Uninitialized tar_stat_info
>> 09 MEDIUM list.c:565 read_header / decode_header strcmp past magic field
>> 10 MEDIUM extract.c:1478 create_placeholder_file Incomplete hash destructor
>> 11 LOW extract.c:364 check_time Signed overflow in time diff
>> 12 LOW extract.c:953 apply_nonancestor_delayed.. file_name_len=0 underflow
>> 13 LOW xheader.c:630 decode_record ptrdiff_t -> int truncation
I didn't look much at the others since I am not very familiar with tar.
Hopefully Paul can quickly tell if they are bogus or not.
>> === REPRODUCTION ===
>>
>> All bugs were confirmed on tar 1.35 (commit 8e3c8fa17 from
>> https://git.savannah.gnu.org/cgit/tar.git). Bugs #01, #02, and #14
>> are reproducible without sanitizers on any 64-bit Linux system. The
>> remaining bugs require ASan, UBSan, or MSan to observe.
This commit hash does not exist, which does not inspire much confidence:
$ git log 8e3c8fa17
fatal: ambiguous argument '8e3c8fa17': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
Collin
[1] https://git.savannah.gnu.org/cgit/tar.git/commit/?id=a59c819beb4886ee43f16dfd80ec1151fda1abe6
[2] https://git.savannah.gnu.org/cgit/tar.git/commit/?id=c11084bcc2d7d9976570a12263b81d2488066115
[3] https://git.savannah.gnu.org/cgit/tar.git/commit/?id=258d1c44e5ee7c58b28bf0000e9d737df6081885
[4] https://git.savannah.gnu.org/cgit/tar.git/commit/?id=d1e72a536f26188230a147d948b9057714fd0b6b