

Polluting Kazaa and WinMX

Last December, after "Elements part 1" was leaked to the net, I did some poking around in the innards of the two "worst" p2p clients.

I managed to find a consistent method to spoof the WinMX and Kazaa networks and to inject "fake" files into them that are indistinguishable from the real files already out there. This is a much more efficient way of polluting the file pool than the "naive" version that the major labels are already using, which relies on fake files that have been renamed so they get found and downloaded in vain. The naive method requires a lot of copies, bandwidth and other resources, which the labels have provided by setting up "farms" of computers. Lists of these "fake farm" netblocks are even circulated, and savvy downloaders block them.

If you want to skip the boring explanation and get the tool, download it from here.

Anyway, as the "real" files multiply and get shared, they eventually "drown out" any attempts by a label to pollute the file pool based on the file names alone.

My method is much more insidious and (dare I say) clever. It is not based on mere file names; it is based on the unique hash values that the clients use to identify every file on the network. It utilizes files that are already out there, and because the most popular files are the easiest to find, it takes care of the most "dangerous" ones first. It does require a little bit of manual intervention in the initial stages: you have to download the files and patch them. But the payoff is quite great.

Here is the background. I will use WinMX as the example because that's the first network that I attacked. The Kazaa details are very similar.

Each unique file on the WinMX network is identified by its hash value (128 bits; it is usually expressed as a 32-character hex string). The way WinMX calculates this hash has a serious flaw: only part of the file is hashed, making it possible to "swap" the sections that are not hashed with whatever you want.

Example:

Let's say M.A.Numminen-Kiusankappaleita ALBW.mp3 is 85443223 bytes long. Its hash is calculated like this:

hash calculated using byte ranges (first to last byte, inclusive):
0 to 131071
8519680 to 8650751
17039360 to 17170431
25559040 to 25690111
34078720 to 34209791
42598400 to 42729471
51118080 to 51249151
59637760 to 59768831
68157440 to 68288511
76677120 to 76808191
85196800 to 85327871

byte ranges which do not affect hash value:
131072 to 8519679
8650752 to 17039359
17170432 to 25559039
25690112 to 34078719
34209792 to 42598399
42729472 to 51118079
51249152 to 59637759
59768832 to 68157439
68288512 to 76677119
76808192 to 85196799
85327872 to 85443222

bytes hashed: 1441792 (1.69% of total length)
bytes not hashed: 84001431

Thus in this case, you can replace 98.31% of the file with crap, without WinMX noticing.
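To see why a partial hash cannot notice such a swap, here is a minimal sketch with everything scaled down. The function name sampled_digest, the 4-byte block size and the three offsets are made up for illustration, and MD5 (via the core Digest::MD5 module) is only a stand-in, since the actual hash function WinMX uses is not described here.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Digest only $block bytes at each offset in @$starts.
# MD5 is a stand-in; the real WinMX hash function is an unknown here.
sub sampled_digest {
    my ($data, $block, $starts) = @_;
    my $sampled = join '', map { substr($data, $_, $block) } @$starts;
    return md5_hex($sampled);
}

my $block  = 4;              # scaled-down block size (131072 in the text)
my @starts = (0, 16, 32);    # scaled-down block start offsets
my $real   = join '', map { chr(65 + $_ % 26) } 0 .. 39;   # 40-byte "file"

# Overwrite bytes 4..15, which lie between the first two sampled blocks.
my $fake = $real;
substr($fake, 4, 12, 'x' x 12);

my $same_digest = sampled_digest($real, $block, \@starts)
               eq sampled_digest($fake, $block, \@starts);

print "files differ:  ", ($real ne $fake ? "yes" : "no"), "\n";
print "digests match: ", ($same_digest   ? "yes" : "no"), "\n";
```

Both lines print "yes": the two buffers differ, but only in bytes that the digest never reads.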

Here is a Perl function which will return the start positions of the 131072-byte blocks that are used to compute the hash:

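sub mxmap {
    my ($length) = @_;
    my $N = int($length / 655360);
    my @hashmap;    # start empty; the loops below add offset 0 themselves
    if ($N < 12) {
        # small files: one block every 655360 bytes
        foreach my $i (0 .. $N) { push(@hashmap, $i * 655360); }
    } else {
        # large files: 11 evenly spaced blocks (0 .. 10), as in the example above
        my $M = int($length / 1310720);
        foreach my $i (0 .. 10) { push(@hashmap, $M * $i * 131072); }
    }
    return @hashmap;
}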

That is, &mxmap(85443223) will return (0, 8519680, 17039360 ... 85196800)
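The start offsets can be turned back into the byte-range listing from the example. This is a rough sketch: the helper name coverage is hypothetical, ranges are inclusive (so a 131072-byte block starting at 0 ends at byte 131071), and the offsets are assumed to be sorted and non-overlapping.

```perl
use strict;
use warnings;

# Given a file length, a block size and the block start offsets, return the
# hashed ranges, the unhashed gaps (both as inclusive [first, last] pairs)
# and the number of hashed bytes.
sub coverage {
    my ($length, $block, @starts) = @_;
    my (@hashed, @gaps);
    my $pos          = 0;    # first byte not yet assigned to any range
    my $hashed_bytes = 0;
    for my $start (@starts) {
        last if $start >= $length;
        my $end = $start + $block - 1;
        $end = $length - 1 if $end > $length - 1;    # clip a partial last block
        push @gaps, [$pos, $start - 1] if $start > $pos;
        push @hashed, [$start, $end];
        $hashed_bytes += $end - $start + 1;
        $pos = $end + 1;
    }
    push @gaps, [$pos, $length - 1] if $pos < $length;
    return (\@hashed, \@gaps, $hashed_bytes);
}

my $length = 85443223;
my @starts = map { $_ * 8519680 } 0 .. 10;    # the 11 offsets from the example
my ($hashed, $gaps, $n) = coverage($length, 131072, @starts);

printf("%d to %d hashed\n", @$_)     for @$hashed;
printf("%d to %d not hashed\n", @$_) for @$gaps;
printf("hashed %d of %d bytes (%.2f%%)\n", $n, $length, 100 * $n / $length);
```

For the example file the last line reports 1441792 hashed bytes, or 1.69% of the total.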

All you have to do to insert a "fake" file into the WinMX network is to take a popular "real" one and modify it in the positions from which the hash is not calculated. You then share that file and turn on WinMX, exposing it to the outside network. When someone searches for the "real" file, they will download your "fake" file to their machine. To the network, it will be completely indistinguishable from the "real" one, so someone else can in turn download it from that machine, believing it to be the "real" one. I have some evidence that this is happening a lot, as many people simply download (and subsequently share) stuff just in order to collect it; that is, they don't listen to the file and don't notice that it is broken. Thus it can spread very quickly, because you enlist unwitting collectors and use their bandwidth to help you spread your fakes.

This method becomes even more effective because these networks use something called "multi source download". This is partly to get around the problem that most subscriber internet connections, high-speed and low-speed alike, have asymmetric bandwidth. For instance, an ADSL line could very well have a 512 kbps download rate but only 128 kbps available for uploading, and a 56 kbps modem only has an upload data rate of 28 kbps. Multi source download fetches a desired file from many sources at once, in an effort to alleviate this.
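A back-of-the-envelope sketch of why multi source download matters, using the ADSL figures above and the 85443223-byte example file. It assumes ideal conditions (no protocol overhead, every source uploading at its full 128 kbps, 1 kbit = 1000 bits); the function name transfer_seconds is made up for illustration.

```perl
use strict;
use warnings;

# Seconds needed to move $bytes at $kbps (kilobits per second).
sub transfer_seconds {
    my ($bytes, $kbps) = @_;
    return $bytes * 8 / ($kbps * 1000);
}

my $bytes = 85443223;    # example file from above
my $down  = 512;         # downloader's ADSL downstream, kbps
my $up    = 128;         # each source's ADSL upstream, kbps

for my $sources (1, 2, 4, 8) {
    # Aggregate rate: sum of the source uplinks, capped by the downlink.
    my $rate = $sources * $up;
    $rate = $down if $rate > $down;
    printf("%d source(s): %d kbps, %.1f minutes\n",
        $sources, $rate, transfer_seconds($bytes, $rate) / 60);
}
```

Under these assumptions a single source needs roughly 89 minutes, while four sources saturate the downlink and bring it to roughly 22 minutes; more than four makes no further difference.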

Of course, the only way for the program to identify a file on a remote computer is by its "unique" hash value. This means that even if someone has a file which is 90% complete and uncorrupted, he/she has to start all over from the beginning if he/she tries to acquire the missing 10% and happens to stumble on one of the computers that host one of the patched files...

We went on war footing around December 2003, using something like 7 friends' computers (behind cable modems or DSL connections) around the world. Even though we were a little late, I would say the results were quite spectacular considering the cost to mount the attack. :)

I developed a suite of Perl scripts to handle automated monitoring of the downloads directory, automatic "patching" with garbage mp3 files of my choice, duplicate lookup and housekeeping, making highly compressed distribution packages to send to my fellow network warriors' computers, etc. Contact me if you are interested in seeing them (it's pretty rough code but it works great).

There is another WinMX vulnerability (Kazaa doesn't work this way) in that "incompletely downloaded" files are not hashed at all, until they are completed. I haven't exploited this yet, but it is trivial to do. "Stealing candy from a child" is the phrase that comes to mind.......

You can place whatever you want in these files, and expose them to the WinMX network. WinMX trusts the filenames of the incomplete file cache blindly, including the hash value!!!! The names take the following form (from these parts, concatenated):

The string "__INCOMPLETE___"
The original filename
32 ascii-hex bytes WinMX hash
8 ascii-hex length
4 ascii-hex bitrate
4 ? ('0000')
4 ascii-hex sample rate
4 ? ('0000')
4 ascii-hex length in seconds
The string "."
The original extension

Example (NB, I have broken the lines for readability):
__INCOMPLETE___M.A.Numminen—Perkele
0123456789abcdeffedcba9876543210
0517c297
00c00000
ac440000
01ff
.mp3
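The layout above can be decoded mechanically. The following is a minimal sketch, not code from my script suite: the function name parse_incomplete and the field names are made up, the two '0000' fields are kept as opaque strings since their meaning is unknown, and the example name is written with a plain ASCII hyphen.

```perl
use strict;
use warnings;

# Split an "incomplete" cache filename into the fields listed above.
# Returns a hash reference, or nothing if the name does not fit the layout.
sub parse_incomplete {
    my ($name) = @_;
    my $h4 = qr/[0-9a-fA-F]{4}/;
    return unless $name =~ m{
        \A __INCOMPLETE___
        (.*)                     # original filename, extension stripped
        ([0-9a-fA-F]{32})        # hash
        ([0-9a-fA-F]{8})         # file length in bytes
        ($h4) ($h4)              # bitrate, unknown field
        ($h4) ($h4)              # sample rate, unknown field
        ($h4)                    # length in seconds
        \. ([^.]+) \z            # extension
    }xs;
    return {
        name        => $1,
        hash        => lc($2),
        size        => hex($3),
        bitrate     => hex($4),
        unknown1    => $5,
        sample_rate => hex($6),
        unknown2    => $7,
        seconds     => hex($8),
        extension   => $9,
    };
}

# The example from above, joined back into a single name.
my $example = join '', '__INCOMPLETE___', 'M.A.Numminen-Perkele',
    '0123456789abcdeffedcba9876543210', '0517c297',
    '00c00000', 'ac440000', '01ff', '.mp3';

my $f = parse_incomplete($example);
printf("%s.%s: %d bytes, %d kbps, %d Hz, %d s\n",
    @{$f}{qw(name extension size bitrate sample_rate seconds)});
```

For the example name this decodes a length of 85443223 bytes, a bitrate of 192, a sample rate of 44100 and 511 seconds.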


Page updated Apr 10, 2004 at 03:52 • Email: jens@panix.com

All content copyright © Jens Johansson 2024. No unauthorized duplication, copying, mirroring, pilfering, pillaging, archival, or redistribution/retransmission allowed! Any offensively categorical statements passed off as facts herein should only be construed as my very opinionated opinions.