Date: Wed, 10 Apr 2024 20:19:46 +0200
From: Alejandro Colomar <alx@...nel.org>
To: Joey Hess <id@...yh.name>, Solar Designer <solar@...nwall.com>
Cc: oss-security@...ts.openwall.com, Sam James <sam@...too.org>,
	Jonathan Nieder <jrnieder@...il.com>,
	Andres Freund <andres@...razel.de>,
	Lasse Collin <lasse.collin@...aani.org>, xz@...aani.org
Subject: Re: Analysis on who is Jia Tan, and who he could work for, reading
 xz.git

Hi Joey, Alexander,

On Wed, Apr 10, 2024 at 12:10:51PM -0400, Joey Hess wrote:
> Alejandro Colomar wrote:
> > I suspect those +0200 and +0300 correspond to a few times that this guy
> > would have traveled to his intelligence agency for some special work
> 
> That's a theory. But many of the commits with author Jia Tan in those
> time zones have committer Lasse Collin, and show signs of being eg,
> git-amed patch sets which may have also been rebased. In which case
> it would make sense that these have Lasse Collin's usual timezone.

Yep, I also had the feeling that some of those might be the result of
git-am(1) (TBH, I had those feelings today, after the email had been
sent).  In principle, git-am(1) respects the author date, but if some
mails (assuming patches taken via mail) were somehow malformed, or Lasse
had something misconfigured, it might have overwritten the author date.
Maybe this helps Lasse investigate his emails, and see if this makes any
sense for him.
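
To narrow down which commits are candidates for that git-am(1)
explanation, one could list the commits whose author and committer
differ, together with both UTC offsets.  A minimal sketch, assuming POSIX
awk and names that contain no '|'; foreign_committed is a made-up helper
name (not part of git), and the '|'-separated input format is an
arbitrary choice:

```shell
# Hypothetical helper, not a git command.
# stdin: "hash|author name|author date|committer name|committer date",
#        with dates in git's ISO-like format (offset is the last word).
# Prints the commits whose author and committer names differ, with the
# UTC offset of the author date and of the committer date.
foreign_committed() {
	awk -F'|' '$2 != $4 {
		na = split($3, a, " ")   # a[na] is the author date offset
		nc = split($5, c, " ")   # c[nc] is the committer date offset
		print $1, $2, a[na], "committed-by", $4, c[nc]
	}'
}
```

It could be fed with something like:

	$ git log --all --format='%h|%an|%ai|%cn|%ci' | foreign_committed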

The CommitDate, however, is certain to be from Jia, and there are
exactly 4 commits from him with a suspicious CommitDate (+0200), in a
short and recent period of time: 2024-02-29 - 2024-03-05, part of the
most "fun" period of this thing.  It would be interesting to investigate
the mails that led to those patches being pushed, if they were discussed
publicly, since they may contain IPs and other traces.
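
Those commits can be picked out by committer and committer-date offset.
A minimal sketch, assuming POSIX awk; filter_committer_tz is a
hypothetical helper name, and the input format is the same arbitrary
'|'-separated one produced by the git-log(1) format shown after it:

```shell
# Hypothetical helper, not a git command.
# stdin: "hash|committer name|committer date" (ISO-like date).
# $1: case-insensitive substring of the committer name.
# $2: UTC offset to look for, e.g. +0200.
filter_committer_tz() {
	awk -F'|' -v who="$1" -v tz="$2" '
		# The offset is the last 5 characters of the date field.
		index(tolower($2), tolower(who)) &&
		substr($3, length($3) - 4) == tz { print }
	'
}
```

For example:

	$ git log --all --format='%h|%cn|%ci' | filter_committer_tz jia +0200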

BTW, there are more indications that several people were involved:
there's a mix of timezones in this period:

	$ git log --all --since=2024-02-28 --until=2024-03-06 \
		--pretty=fuller --date=iso \
	| grep -B1 Date: \
	| grep -A1 jia \
	| grep -v -- -- \
	| grep -v jia \
	| sed 's/......Date: //' \
	| while read d; do
		date -u --iso-8601=seconds --date="$d";
		echo "  %%  $d";
	done \
	| sed 'N;s/\n//' \
	| sort;
	2024-02-29T14:35:52+00:00  %%  2024-02-29 16:35:52 +0200
	2024-02-29T14:35:52+00:00  %%  2024-02-29 16:35:52 +0200
	2024-03-04T16:27:31+00:00  %%  2024-03-05 00:27:31 +0800
	2024-03-04T16:27:31+00:00  %%  2024-03-05 00:27:31 +0800
	2024-03-04T16:27:31+00:00  %%  2024-03-05 00:27:31 +0800
	2024-03-04T16:34:46+00:00  %%  2024-03-05 00:34:46 +0800
	2024-03-04T16:34:46+00:00  %%  2024-03-05 00:34:46 +0800
	2024-03-04T16:34:46+00:00  %%  2024-03-05 00:34:46 +0800
	2024-03-04T17:23:18+00:00  %%  2024-03-04 19:23:18 +0200
	2024-03-04T17:54:30+00:00  %%  2024-03-05 01:54:30 +0800
	2024-03-04T17:54:30+00:00  %%  2024-03-05 01:54:30 +0800
	2024-03-05T21:21:26+00:00  %%  2024-03-05 23:21:26 +0200

Of course, all of that can be faked, but it's a starting point.  And
even for state actors, it's hard to not make mistakes, so they likely
leaked something at some point.
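
A rougher but shorter way to get an overview of such a mix is to count
commits per committer and UTC offset.  This is only a sketch, assuming
POSIX awk and sort; tally_tz is a hypothetical helper name, and it looks
only at committer dates, unlike the pipeline above, which prints both
author and committer dates:

```shell
# Hypothetical helper, not a git command.
# stdin: "committer name|committer date" (ISO-like date).
# Prints "count name offset", one line per (name, offset) pair.
tally_tz() {
	awk -F'|' '
		{ n = split($2, f, " "); count[$1 " " f[n]]++ }
		END { for (k in count) print count[k], k }
	' | LC_ALL=C sort -k2
}
```

For the same period as above, it could be used as:

	$ git log --all --since=2024-02-28 --until=2024-03-06 \
		--format='%cn|%ci' | tally_tz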

> 
> I analyzed that here: https://hachyderm.io/@joeyh/112193146103113070

Regarding your question in that post:

< anyone know of a common #git workflow that would result in 4 commits
< with 2 separate authors all having one timestamp as a common commit
< timestamp and a second timestamp as a common author timestamp?

You get the common author dates with `git commit --reuse-message`, which
reuses the authorship information (including the timestamp) along with
the message.  I seldom do that, but I do it.  It's useful when I want to
reuse a commit message (for a patch set which has repetitive stuff in
the commit messages, for example, where you can --reuse-message and then
adjust).  You can find a few examples in the Linux man-pages repo.
The common committer dates you can get with a rebase of the patch set.

So he reused a commit message, amended it, and then rebased at some
point.  It's not inconceivable.  Here's how to reproduce it:

	alx@...ian:~/tmp$ mkdir foo
	alx@...ian:~/tmp$ cd foo/
	alx@...ian:~/tmp/foo$ git init
	Initialized empty Git repository in /home/alx/tmp/foo/.git/
	alx@...ian:~/tmp/foo$ git commit --allow-empty -m init
	[main (root-commit) dd8f3ea] init
	alx@...ian:~/tmp/foo$ touch a
	alx@...ian:~/tmp/foo$ git add .
	alx@...ian:~/tmp/foo$ git commit -m a
	[main c92b1c1] a
	 1 file changed, 0 insertions(+), 0 deletions(-)
	 create mode 100644 a
	alx@...ian:~/tmp/foo$ touch b
	alx@...ian:~/tmp/foo$ git add .
	alx@...ian:~/tmp/foo$ git commit -m b
	[main 511aa3c] b
	 1 file changed, 0 insertions(+), 0 deletions(-)
	 create mode 100644 b
	alx@...ian:~/tmp/foo$ touch c
	alx@...ian:~/tmp/foo$ git add .
	alx@...ian:~/tmp/foo$ git commit --reuse-message=HEAD
	[main 28cd344] b
	 Date: Wed Apr 10 19:31:33 2024 +0200
	 1 file changed, 0 insertions(+), 0 deletions(-)
	 create mode 100644 c
	alx@...ian:~/tmp/foo$ git commit --amend -m c
	[main 94aa6a9] c
	 Date: Wed Apr 10 19:31:33 2024 +0200
	 1 file changed, 0 insertions(+), 0 deletions(-)
	 create mode 100644 c
	alx@...ian:~/tmp/foo$ git rebase -i HEAD^^^
	[detached HEAD 460b428] a
	 Date: Wed Apr 10 19:31:20 2024 +0200
	 1 file changed, 0 insertions(+), 0 deletions(-)
	 create mode 100644 a
	Successfully rebased and updated refs/heads/main.
	alx@...ian:~/tmp/foo$ git log --pretty=fuller
	commit 7b102f35902a5212114cd1ceb5ecf4e648c83abb (HEAD -> main)
	Author:     Alejandro Colomar <alx@...nel.org>
	AuthorDate: Wed Apr 10 19:31:33 2024 +0200
	Commit:     Alejandro Colomar <alx@...nel.org>
	CommitDate: Wed Apr 10 19:36:41 2024 +0200

	    c

	commit b28ec7f6a33eacd9dd27f6493493bc399ecff66e
	Author:     Alejandro Colomar <alx@...nel.org>
	AuthorDate: Wed Apr 10 19:31:33 2024 +0200
	Commit:     Alejandro Colomar <alx@...nel.org>
	CommitDate: Wed Apr 10 19:36:41 2024 +0200

	    b

	commit 460b42821313d48207760e79583d8fbd3f6fe3ec
	Author:     Alejandro Colomar <alx@...nel.org>
	AuthorDate: Wed Apr 10 19:31:20 2024 +0200
	Commit:     Alejandro Colomar <alx@...nel.org>
	CommitDate: Wed Apr 10 19:36:38 2024 +0200

	    a

	commit dd8f3ea6a3dc1272389e7ad5afd950b3194bdea8
	Author:     Alejandro Colomar <alx@...nel.org>
	AuthorDate: Wed Apr 10 19:30:59 2024 +0200
	Commit:     Alejandro Colomar <alx@...nel.org>
	CommitDate: Wed Apr 10 19:30:59 2024 +0200

	    init
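
The same pattern (several commits sharing one author timestamp) can be
searched for in a real repo.  A minimal sketch, assuming POSIX awk and
sort; shared_author_dates is a hypothetical helper name, and the
"hash|author date" input format is an arbitrary choice:

```shell
# Hypothetical helper, not a git command.
# stdin: "hash|author date" lines.
# Prints each author date used by more than one commit, followed by the
# hashes of those commits in input order.
shared_author_dates() {
	awk -F'|' '
		{ n[$2]++; h[$2] = h[$2] " " $1 }
		END { for (d in n) if (n[d] > 1) print d ":" h[d] }
	' | LC_ALL=C sort
}
```

In the toy repo above, this should report the commits "c" and "b":

	$ git log --format='%h|%ai' | shared_author_dates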


Have a lovely day!
Alex


On Wed, Apr 10, 2024 at 06:28:13PM +0200, Solar Designer wrote:
> On Wed, Apr 10, 2024 at 05:16:52AM +0200, Alejandro Colomar wrote:
> > I've been researching xz.git to learn about this malicious actor, and
> > who he might have worked for.
>
> As a moderator, I reluctantly let this through out of respect for
> Alejandro's time and knowing that many readers will find it interesting.

Thank you.

> However:
>
> This is almost off-topic for oss-security and it risks provoking further
> speculation and potentially hatred in follow-ups.  Related analyses,
> including not only of timezones but also of commit times, were already
> posted elsewhere (e.g., a Wired story).  So let's please limit the
> follow-ups to (1) corrections of any factual errors or major omissions
> (to the extent of being misleading) there might be in Alejandro's
> postings and (2) observations that more directly help us identify or
> prevent more compromises like this (if any can be made based on this
> analysis, which I doubt).  One major omission I'd like to point out is
> that timezones can be faked - we have no reliable way to know which of
> these, if any, actually correspond to where Jia Tan was.
>
> Note that other recent threads in here about search for code patterns
> similar to Jia Tan's and even for PGP keys similar to Jia Tan's are more
> relevant to oss-security, because they're aimed to uncover potential
> related backdoor code in other projects.  In contrast, identifying who
> Jia Tan is or what country/ies they're from doesn't obviously help.  At
> best, it may give us guesses on where the presumed targets are, but then
> what?  We need to protect the whole ecosystem regardless of who/where
> the current attackers are, and we need to develop means to detect such
> attacks everywhere, not only at currently likely targets.


P.S.:  While the first part of this email is within "corrections of any
factual errors or major omissions", I acknowledge that this last part
might be getting even more off-topic.  Since I guess it's short and will
have no replies, I included it.  Sorry.

P.S.2:  I didn't find the other similar investigations in other sites
until today.  There's so much stuff about this that it's hard to find it
all.  Sorry for duplication.  Hopefully, this might contain some new
idea that might help someone.  Sorry again.  :)

P.S.3:  I hope nobody takes this incident as an excuse to hate a group
of people.  This is a thing about states being evil, and there are
powerful states of all inclinations that do evil stuff.  And even if it
were just an individual, the same can be said of individuals.  I don't
intend this thread to be used for increasing hatred; instead I did it
to learn how this happened, and what kinds of mistakes (and patterns of
mistakes) the authors of this and similar attacks may have forgotten to
check for, which could be useful for detecting similar attacks in other
projects, if similar git history checks are done in other repos.

> P.S. Let's also not spam distro security teams with this (CC's dropped).
> I'm sure they don't want tickets auto-created for such analyses, like
> they would for vulnerability reports.  And I certainly don't want to
> spend time removing more ticket auto-replies from our moderation queue.

Ok.


-- 
<https://www.alejandro-colomar.es/>
