Date: Tue, 11 Jun 2024 16:09:22 +0200
From: Szabolcs Nagy <>
To: Stefan Jumarea <>
Subject: Re: [PATCH] mallocng: Add MTE support for Aarch64

* Stefan Jumarea <> [2024-06-10 15:36:25 +0300]:
> @@ -102,17 +107,30 @@ void free(void *p)
>  {
>  	if (!p) return;
> +#ifdef MEMTAG
> +	void *untagged = (void *)((uint64_t)p & ~MTE_TAG_MASK);
> +#else
> +	void *untagged = p;
> +#endif
> +
>  	struct meta *g = get_meta(p);
>  static inline struct meta *get_meta(const unsigned char *p)
>  {
>  	assert(!((uintptr_t)p & 15));
> -	int offset = *(const uint16_t *)(p - 2);
> -	int index = get_slot_index(p);
> -	if (p[-4]) {
> +#ifdef MEMTAG
> +	const unsigned char *untagged = (const unsigned char *)((uint64_t)p & ~MTE_TAG_MASK);
> +#else
> +	const unsigned char *untagged = p;
> +#endif

if free just drops the tag, then an incorrectly tagged pointer
will not be detected. musl does some use-after-free checks,
so i don't know how important this is, but i'd check that the
passed pointer matches the memory tag (unless 0 sized), and
that the tag is non-0, otherwise a forged pointer may cause
memory corruption.
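
a sketch of such a check (mte_load_tag / mte_check_tag are my
names, not from the patch): load the allocation tag with ldg and
compare it against the tag carried in the pointer, before the
tag is stripped. on non-mte builds it degrades to a no-op:

```c
#include <stdint.h>

#define MTE_TAG_MASK (0xfull << 56)

/* Sketch only: with MTE enabled, ldg reads the allocation tag
 * stored for the 16-byte granule p points into, inserting it
 * into the destination register's tag bits. Without MTE we
 * echo the pointer's own tag so the check passes vacuously. */
static inline void *mte_load_tag(void *p)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_MEMORY_TAGGING)
	void *t = p;
	__asm__ volatile("ldg %0, [%1]" : "+r"(t) : "r"(p));
	return t;
#else
	return p;
#endif
}

/* Return 1 if the tag in the pointer matches the tag in memory. */
static inline int mte_check_tag(void *p)
{
	uint64_t ptag = (uint64_t)p & MTE_TAG_MASK;
	uint64_t mtag = (uint64_t)mte_load_tag(p) & MTE_TAG_MASK;
	return ptag == mtag;
}
```

free() would then do something like `if (!mte_check_tag(p)) a_crash();`
before untagging; 0 sized allocations may need special casing since
they can legitimately carry the default 0 tag.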

i don't see where you enable the tagged pointer abi and tag
checks (prctl), or where you add PROT_MTE to the mmapped memory.
(i think we want the synchronous tag check option.)
note there is software that assumes page granularity for
memory protection, e.g. does a read access at p&-4096, and
there may be software that assumes the top bits of a valid
pointer are 0, so unconditionally enabling tagging and tag
checks can be an issue.
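
for reference, enabling the tagged address abi with synchronous
tag checks looks roughly like this (the constants are the linux
uapi ones, provided inline in case the headers are old; this is
a sketch, not the patch's code):

```c
#include <sys/prctl.h>

/* Linux uapi constants from <linux/prctl.h>, for old headers. */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE (1ul << 0)
#endif
#ifndef PR_MTE_TCF_SYNC
#define PR_MTE_TCF_SYNC (1ul << 1)
#endif
#ifndef PR_MTE_TAG_SHIFT
#define PR_MTE_TAG_SHIFT 3
#endif

/* Returns 0 on success, -1 if the kernel/cpu lacks MTE support. */
static int enable_mte_sync(void)
{
	/* allow all 16 tags to be generated by irg */
	unsigned long tags = 0xfffful << PR_MTE_TAG_SHIFT;
	return prctl(PR_SET_TAGGED_ADDR_CTRL,
	             PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC | tags,
	             0, 0, 0);
}
```

the allocator would then mmap its heap memory with
PROT_READ|PROT_WRITE|PROT_MTE (PROT_MTE is 0x20 on arm64) so the
pages actually carry allocation tags.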

another potential issue is the early ALIGN_UP of the malloc
size: this overflows for malloc(-1) and i think changes the
computed size (it wraps to 0 instead of failing).
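
to illustrate: with the usual ALIGN_UP(n, a) = (n + a - 1) & ~(a - 1),
a request of (size_t)-1 wraps to 0, so the limit check has to
happen before the alignment, not after (checked_align16 is a
hypothetical helper, not mallocng code):

```c
#include <stdint.h>
#include <stddef.h>

/* Rounds n up to a multiple of a. For n = (size_t)-1 the
 * addition wraps, so the "aligned" size becomes 0 and a huge
 * request looks like a tiny one. */
#define ALIGN_UP(n, a) (((n) + (a) - 1) & ~(size_t)((a) - 1))

/* Check against the limit first; returns 0 for an oversized
 * request so the caller can report ENOMEM. */
static size_t checked_align16(size_t n)
{
	if (n > SIZE_MAX - 16) return 0;
	return ALIGN_UP(n, 16);
}
```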

iirc i changed IB when i tried out mte with mallocng.

i would avoid excessive ifdefs in the code, e.g. by using
'p = untag(p);' and defining untag appropriately in a header.
(this might as well do the tag checks when mte is enabled,
but might need special-casing for 0 sized allocations.)
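
something along these lines (untag is my name for it; the patch
open-codes the cast at each call site):

```c
#include <stdint.h>

/* Header-level abstraction: untag() compiles away entirely when
 * MEMTAG is not defined, so call sites stay free of ifdefs. */
#ifdef MEMTAG
#define MTE_TAG_MASK (0xfull << 56)
static inline void *untag(void *p)
{
	return (void *)((uintptr_t)p & ~(uintptr_t)MTE_TAG_MASK);
}
#else
static inline void *untag(void *p) { return p; }
#endif
```

free() then starts with `p = untag(p);` and get_meta() can take
the already-untagged pointer, instead of each function repeating
the #ifdef block.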

tagging large areas can be slow. just like 'dc zva, x0' in the
aarch64 memset, there is 'dc gva, x0' that tags an entire
cacheline; alternatively it may be better to rely on page
protection alone for large allocations to avoid the tagging
overhead (don't even mmap them as PROT_MTE). i've never
actually benchmarked what makes sense in practice; i'd at
least put the hand-written loop behind an abstraction.
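
a sketch of such an abstraction (mte_tag_region is my name):
on mte-capable builds it tags with st2g/stg, elsewhere it is a
no-op; a tuned version could switch to 'dc gva' per cacheline
for large regions, like 'dc zva' in memset:

```c
#include <stddef.h>

/* Sketch: tag len bytes at p with the tag already carried in
 * p's top bits. p and len must be 16-byte aligned. */
static void *mte_tag_region(void *p, size_t len)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_MEMORY_TAGGING)
	unsigned char *q = p, *end = q + len;
	/* st2g tags two 16-byte granules per iteration */
	for (; q + 32 <= end; q += 32)
		__asm__ volatile("st2g %0, [%0]" : : "r"(q) : "memory");
	/* stg handles a trailing single granule */
	for (; q < end; q += 16)
		__asm__ volatile("stg %0, [%0]" : : "r"(q) : "memory");
#else
	(void)len; /* no MTE: nothing to tag */
#endif
	return p;
}
```

keeping the loop behind this one function means the dc-gva
optimization, or skipping PROT_MTE for large allocations
entirely, can be changed later without touching the allocator
logic.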
