Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date: Mon, 1 Nov 2021 17:27:53 +0000
From: Nicholas Boucher <>
Subject: Trojan Source Attacks

OSS Security teams,

We have identified an issue affecting all compilers and interpreters 
that support Unicode. We believe that the techniques described hereafter 
can be used to generate adversarial encodings of source code files that 
can be used to craft targeted attacks against source code that cannot be 
seen by human reviewers in rendered text. This is of concern to the open 
source community because, absent defenses, supply chain attacks can be 
imperceptibly mounted against the ecosystem.

This vulnerability has undergone a coordinated disclosure process that 
has concluded today. The security advisory can be found at

Multiple organizations will be releasing parallel security advisories, 
such as Rust's advisory at, Red Hat's 
advisory at 
<>, and 
GitHub's advisory at 

The attached paper describes an attack paradigm -- which we believe to 
be novel -- discovered by security researchers at the University of 
Cambridge. There are two techniques for attack, both of which exploit 
Unicode's high expressiveness to craft source code files for which 
rendered text displays divergent logic from the underlying encoded bytes 
seen by compilers.

The first and primary technique, which we dub the Trojan Source attack, 
uses Unicode Bidirectional (Bidi) control characters embedded in 
comments and string literals to produce visually deceptive source code 
files. This technique enables an adversary to encode constructs that 
visually appear to be comments or string literals but execute as code, 
or vice versa. Complete details, as well as recommended mitigations, can 
be found in the attachment 001 Trojan Source.pdf. This vulnerability is 
tracked under CVE-2021-42574.

The second technique, to which we refer as the homoglyph variant, uses 
homoglyphs (characters that render to the same glyph but are represented 
by different Unicode values) to define adversarial identifiers. In this 
technique, an adversary defines an identifier such as a function name 
that appears visually identical to a target function, but is defined 
using Unicode homoglyphs. This adversarial function then performs some 
malicious action, then optionally calls the original function it is 
impersonating. When defined in upstream dependencies such as open source 
software, these adversarial functions can be imported into downstream 
software and invoked without visual indication of malicious code. 
Complete details, as well as recommended mitigations, can also be found 
in the attachment 001 Trojan Source.pdf. This vulnerability is tracked 
under CVE-2021-42694.

Proofs-of-concept can be found at

We hope that this information proves useful in building and applying 
defenses where applicable.

Nicholas Boucher
University of Cambridge

Content of type "text/html" skipped

Download attachment "001 Trojan Source.pdf" of type "application/pdf" (737637 bytes)

Download attachment "OpenPGP_0x5662BCEC5F1D2BEA.asc" of type "application/pgp-keys" (3160 bytes)

Download attachment "OpenPGP_signature" of type "application/pgp-signature" (841 bytes)

Powered by blists - more mailing lists

Please check out the Open Source Software Security Wiki, which is counterpart to this mailing list.

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.