<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title><![CDATA[0xAA - Random notes on security]]></title>
<link href="http://antukh.com/atom.xml" rel="self"/>
<link href="http://antukh.com/"/>
<updated>2017-10-01T11:25:56+02:00</updated>
<id>http://antukh.com/</id>
<author>
<name><![CDATA[Authorized Attacker]]></name>
</author>
<generator uri="http://octopress.org/">Octopress</generator>
<entry>
<title type="html"><![CDATA[Lifting the Veil, or Dark Does NOT Always Mean Secure]]></title>
<link href="http://antukh.com/blog/2015/10/09/lifting-the-veil/"/>
<updated>2015-10-09T17:02:10+02:00</updated>
<id>http://antukh.com/blog/2015/10/09/lifting-the-veil</id>
<content type="html"><![CDATA[<p>This post can be treated as a continuation of previously published article of “Deanonymization made simple”. As mentioned, more than five hundred
of publicly gathered hidden services were misconfigured to disclose <em>/server-status</em> page. I’ve analyzed all of them, and the results looked quite
interesting to me to publish those.</p>
<p><em>I would like to thank my friends <a href="https://twitter.com/josephfcox">@josephfcox</a> and <a href="https://twitter.com/flexlibris">@flexlibris</a> for providing me with invites
to Riseup and making this article possible.</em></p>
<!--more-->
<p>In my previous post the main targets of such attacks were the hidden services themselves, i.e. the servers. It might seem obvious that clients are exposed and
can potentially be disclosed, too. However, this is not exactly the case. Whereas all GET requests are stored and available to any unauthenticated
attacker, it is often hard to correlate a user with his actions. Accessing a service over Tor logs a record with 127.0.0.1 as the
visitor’s IP address - and even if there were external IP addresses, they would just be exit nodes.
For example, the following screenshot shows current requests to a variant of “Dark Google”:</p>
<p><img class="center" src="http://antukh.com/images/6_dark_google.png" width="900" title="image" alt="images"></p>
<p>This information is more useful for statistics than for actual de-anonymization. I don’t feel like
publishing it here, but my observations suggest there is a reason why such Tor services are called “dark”.</p>
<p>The following scenario is more helpful, though: the “key” parameter value serves as the authentication method, so
anybody who knows it can see the order confirmation - along with the cost, the BTC addresses of both parties,
the purchased product and the customer details. Data from the log:</p>
<p><img class="center" src="http://antukh.com/images/6_log_client.png" width="900" title="image" alt="images"></p>
<p>Actual order along with client data:</p>
<p><img class="center" src="http://antukh.com/images/6_client_info.png" width="900" title="image" alt="images"></p>
<p>Although a “pure” hidden service can rarely be de-anonymized successfully with this method (unless we count careless authorization mistakes and
poor authentication schemes), it becomes trivial when ordinary non-Tor services run on the same host. Moreover, in the next examples the presence of a hidden
service actually decreases overall security!</p>
<h3>Riseup</h3>
<p><em>Your riseup.net email account is a wonderful thing. Although we don’t provide as much storage quota as surveillance-funded corporate email providers, riseup.net
email has many unusual features: <…> we do not log internet addresses of anyone using riseup.net services, including email.</em> *</p>
<p><i><p align="right"><em>* welcome email for newcomers</em></p></i></p>
<p>Riseup has three types of accounts sorted by security level: green (lists, wiki), red (email, shell, OpenVPN) and black (Bitmask enhanced security).
In this section I will concentrate on red and black accounts, since green ones do not seem to have that much importance in terms of privacy.</p>
<p>Brief information:
the incorrect configuration of nearly all HTTP(S) services (listed
here: <a href="https://help.riseup.net/security/network-security/tor/hs-addresses-signed.txt">https://help.riseup.net/security/network-security/tor/hs-addresses-signed.txt</a>)
allows an attacker to disclose the IP addresses of currently active users,
their address-book contacts and the currently logged-in user name, and in the case of the black.* service - to
correlate a user’s unique hashed ID with his real
login and further spy on his activities. According to the earliest
references I was able to find, this vulnerability has been in place at
least since 2012, which presents a critical privacy risk to all Riseup
users. Since it is quite trivial to set up monitoring of all the
requests made to these systems, many users have probably already been
successfully de-anonymized during those 3+ years.</p>
<h4>Vulnerable services in “darknet”:</h4>
<ul>
<li><a href="http://nzh3fv6jc6jskki3.onion/server-status">http://nzh3fv6jc6jskki3.onion/server-status</a> - help.*, lyre.*, riseup.net</li>
<li><a href="http://cwoiopiifrlzcuos.onion/server-status">http://cwoiopiifrlzcuos.onion/server-status</a> - black.*, api.black.*</li>
<li><a href="http://zsolxunfmbfuq7wf.onion/server-status">http://zsolxunfmbfuq7wf.onion/server-status</a> - cotinga.*, mail.*</li>
<li><a href="http://yfm6sdhnfbulplsw.onion/server-status">http://yfm6sdhnfbulplsw.onion/server-status</a> - labs.*, bugs.otr.im</li>
<li><a href="http://xpgylzydxykgdqyg.onion/server-status">http://xpgylzydxykgdqyg.onion/server-status</a> - lists.*, whimbrel.*</li>
<li><a href="http://j6uhdvbhz74oefxf.onion/server-status">http://j6uhdvbhz74oefxf.onion/server-status</a> - user.*</li>
<li><a href="http://7lvd7fa5yfbdqaii.onion/server-status">http://7lvd7fa5yfbdqaii.onion/server-status</a> - we.*</li>
</ul>
<h4>Sample data, which can be disclosed:</h4>
<p><em>RED</em> : remote IP address of the current user, his actions and address book contacts</p>
<p><img class="center" src="http://antukh.com/images/6_real_ip_address_book.png" width="900" title="image" alt="images"></p>
<p><em>RED</em> : currently logged in user, and his actions</p>
<p><img class="center" src="http://antukh.com/images/6_current_user_login.png" width="900" title="image" alt="images"></p>
<p><em>BLACK</em> : correlation between the real user login and his unique hash ID, which
is later used to anonymize all the activities he performs</p>
<p><img class="center" src="http://antukh.com/images/6_login_hash_id.png" width="900" title="image" alt="images"></p>
<h3>Megafon</h3>
<p>As you might deduce, Megafon, one of the largest Russian mobile operators, is affected too. In this case,
it was a set of old subscription services along with WAP.</p>
<p><img class="center" src="http://antukh.com/images/6_megafon_general_info.png" width="900" title="image" alt="images"></p>
<p>Brief information:
it is possible for an unauthenticated user to disclose the list of all current connections (active users with mobile phone numbers),
internal pages of the vulnerable services, information about current transactions and admin credentials for the services.</p>
<h4>Vulnerable hosts on the same server:</h4>
<ul>
<li>wap.megafonpro.ru</li>
<li>podpiskipro.ru</li>
<li>iclickpro.ru</li>
</ul>
<h4>Sample data, which can be disclosed:</h4>
<p>General user activity with phone numbers:</p>
<p><img class="center" src="http://antukh.com/images/6_megafon_user_activity.png" width="900" title="image" alt="images"></p>
<p>Admin credentials to vulnerable services:</p>
<p><img class="center" src="http://antukh.com/images/6_megafon_admin_passwords.png" width="900" title="image" alt="images"></p>
<p><em>Disclaimer: I did not use the admin credentials to break into the system; however, log analysis has shown
that a further attack on other Megafon systems would be quite feasible from there.</em></p>
<h3>Why is it happening?</h3>
<p>The answer looks simple - the default Apache configuration. For the sake of an experiment I’ve set up a freshly installed
Apache server to host both a normal and a hidden service. Here are the results for the normal service:</p>
<p><img class="center" src="http://antukh.com/images/6_normal_service.png" width="900" title="image" alt="images"></p>
<p>And here are those for the hidden one:</p>
<p><img class="center" src="http://antukh.com/images/6_hidden_service.png" width="900" title="image" alt="images"></p>
<p>The reason is that, to Apache, any connection from the Tor Browser appears to be initiated from localhost (we’ve seen that in all the
previous faulty screenshots, remember?). The default configuration of <em>status.conf</em> is as follows:</p>
<figure class='code'><pre><code><Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1 ::1
#    Allow from 192.0.2.0/24
</Location></code></pre></figure>
<p>In other words, since the connection appears to come from a trustworthy source, Apache simply allows anyone to access the page!
Moreover, a user arriving over Tor ends up more trusted than one in non-Tor conditions, because both the server and many applications treat 127.0.0.1 as local and therefore trusted.
<a href="http://blog.ircmaxell.com/2012/11/anatomy-of-attack-how-i-hacked.html">How I hacked StackOverflow</a>, indeed.</p>
<p>So, for this specific case it might be enough to disable <em>/server-status</em> or comment out the line “Allow from 127.0.0.1 ::1”. However, the problem is deeper - due
to the architecture of Tor, <strong>all applications in the darknet</strong> have to be reviewed to make sure there is no excessive trust in the “local attacker”, and
it sounds like the fight of “Security vs. Privacy” continues.</p>
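<p>Purely as an illustration, here is a minimal hardening sketch for a modern Apache 2.4 setup (my own suggestion, not the configuration any of the affected services actually use): either disable the status module entirely, or replace the localhost allowance with an explicit administrative source, since Tor connections always arrive as 127.0.0.1.</p>
<figure class='code'><pre><code># /etc/apache2/mods-enabled/status.conf - sketch, Apache 2.4 syntax
<Location /server-status>
    SetHandler server-status
    # Do not treat localhost as trusted: Tor connections arrive as 127.0.0.1.
    # Require ip 192.0.2.10       # uncomment only for a dedicated admin host
    Require all denied
</Location>
# Or simply disable the module: a2dismod status && service apache2 reload</code></pre></figure>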
<p><img class="center" src="http://antukh.com/images/6_tor_project.png" width="900" title="image" alt="images"></p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Deanonymization Made Simple]]></title>
<link href="http://antukh.com/blog/2015/08/22/dark-appsec/"/>
<updated>2015-08-22T21:44:25+02:00</updated>
<id>http://antukh.com/blog/2015/08/22/dark-appsec</id>
<content type="html"><![CDATA[<p><a href="https://twitter.com/c0rdis/status/630705659848302592">cbcf9dde327c475d99627c87f58cab7ac6689164bf2fe7734c10c78005ed118e</a> == sha256(“[10.08.2015] I’ve discovered that about 2% of the known darkweb is controlled by one organization.”)</p>
<p><img class="center" src="http://antukh.com/images/5_dark_web.jpg" width="600" title="image" alt="images"></p>
<!--more-->
<p>Reading articles on deanonymization of hidden services by <a href="http://40.media.tumblr.com/cd025e4b9b6db50cf53d21a7af5e5568/tumblr_nq7y4s0aMQ1uygu1vo1_1280.png">controlling certain nodes</a>
or conducting <a href="http://conference.hitb.org/hitbsecconf2015ams/wp-content/uploads/2015/02/D2T2-Filippo-Valsorda-and-George-Tankersly-Non-Hidden-Hidden-Services-Considered-Harmful.pdf">correlation attacks</a>,
I came to the idea that in certain cases it might be much easier to break anonymity. Simply by having the same vulnerabilities as in the “clearnet”, applications can expose sensitive information
and let an attacker gather data from the system and deanonymize the target, with certain “darknet” specifics in the approach.</p>
<p>According to the results of the recent HyperionGray research on <a href="http://alex.hyperiongray.com/posts/289994-scanning-the-dark-web">scanning the darkweb with PunkSPIDER</a>,
the approximate number of live dark services is about 7000. The guys took live and not-so-hidden services and started scanning them for serious vulnerabilities.
I started my own research with a slightly different approach - instead of searching for critical vulnerabilities like OSCI/SQLi,
I took a closer look at conventionally low-risk information disclosure.</p>
<p>For that I wrote a simple Python script which, when provided with the server/framework, would enumerate accessible files and folders and possibly discover certain leaks of server information.
To my surprise, a fair amount of them actually had quite lame generic server authorization/configuration issues, up to a world-readable <em>/phpinfo.php</em>.</p>
<p><img class="center" src="http://antukh.com/images/5_phpinfo_hacking.png" width="900" title="image" alt="images"></p>
<p>The most helpful and common fail pattern was, however, the default Apache pages such as <em>/server-info</em> and <em>/server-status</em>. Whereas the first
one gives you a nice picture of the server with its current settings, modules and configuration (and IP address, of course),
the second is more valuable in terms of current connections. <strong>In a given set of 7k+ live services, almost 500 of them (about 7%) appeared to be
vulnerable.</strong> Further analysis showed that high-traffic applications are affected, too.</p>
<table>
<tr>
<td><img class="left" src="http://antukh.com/images/5_status_traffic.jpg" width="350" title="image" alt="images"></td>
<td><img class="right" src="http://antukh.com/images/5_status_traffic_2.png" width="420" title="image" alt="images"></td>
</tr>
</table>
<p></p>
<p>For one of the websites I noticed that it had several other hosts covering completely different kinds
of subjects. The only thing they all had in common was the same <em>/server-status</em> page. A quick gathering of references to those revealed more than 300
unique services with as much as 50+ GB of traffic per day. Interestingly enough, most of them were referenced from a HiddenWiki page,
which also resided on the same server. A weaver! As it appeared later, it was a hidden hosting service, where anybody could pay a certain amount of BTC and rent
it for his own dark intentions. Obviously, such a disclosure makes it possible for a deanonymizer to list all the queries to a particular domain on the hosting
server and view the parameters and corresponding values of GET requests, with full paths to closed parts of the application.</p>
<p>I was lucky again when my script warned me of an external IP address which accessed <em>“vps.server.com”</em>.
If you’ve ever had a look at the access.log of your web server, you’ve surely noticed a lot of connections from all kinds of bots which scan the Internet for
vulnerabilities. That was probably the first time in my life when I was really thankful for them. It meant the following:</p>
<ul>
<li>clearnet service is also available on port 80</li>
<li>if I manage to access it, my watcher script can isolate it</li>
</ul>
<p>One of the options is basically to try to scan the whole Internet on port 80. Sounds crazy? Hold on, check out these projects first:
<a href="https://zmap.io/">Zmap</a> and <a href="https://github.com/robertdavidgraham/masscan">Masscan</a>!</p>
<p>What’s basically needed is to access each specific IP address with a certain marker which identifies that IP address uniquely, and to monitor
such accesses on the <em>/server-status</em> of the target server. I assumed that probably the easiest way to do it is to use the following vector: <a href="http://xx.xx.xx.xx/xx.xx.xx.xx.">http://xx.xx.xx.xx/xx.xx.xx.xx.</a>
The results didn’t keep me waiting long:</p>
<p><img class="center" src="http://antukh.com/images/5_ip_access.png" width="800" title="image" alt="images"></p>
<p>Of course, this is not the only way to achieve that. The following scenario is even simpler: many clearnet hosts on the same server are used to redirect traffic to the darknet, and
this also helps a lot to deanonymize the target. The approach is quite similar to the previous one but more universal, in the sense that you don’t really need to have control over the status page.
It is enough to parse the responses which return a <em>30x</em> code and check for the presence of the “.onion” string in the “Location:” header:</p>
<p><img class="center" src="http://antukh.com/images/5_location.png" width="800" title="image" alt="images"></p>
<p>For the laziest of researchers, Shodan might help, too:</p>
<p><img class="center" src="http://antukh.com/images/5_shodan.png" width="800" title="image" alt="images"></p>
<p>Finally, a researcher can always find a vulnerability in one weak service and get access to the whole hosting server. Let’s say I believe it’s possible ;)</p>
<p><img class="center" src="http://antukh.com/images/5_shell.jpg" width="800" title="image" alt="images"></p>
<h2>Conclusion</h2>
<p>The goal of my research was to show that deanonymization of a hidden service (or even a network) can often be done trivially by applying the same pentest approach as in the clearnet.
The main difference is that usually non-critical information disclosure plays a much more significant role than it does for “normal” web applications.
To summarize, at least the following easy ways may let a researcher deanonymize a darknet service:</p>
<ul>
<li>instant win (server-info, phpinfo, …)</li>
<li>status page access (x.x.x.x/x.x.x.x)</li>
<li>(un)expected redirect (30x clearnet to darknet)</li>
<li>app-level pwnage (missing patches, vulnerabilities in the code, default framework pages…)</li>
</ul>
<p>P.S. If you’re interested in the topic, you may also want to check <a href="https://www.thecthulhu.com/setting-up-a-hidden-service-with-nginx/">TheCtulhu’s blog</a> for decent instructions on configuring an nginx server to host a hidden service in a more secure way.</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Personal CyberAngel]]></title>
<link href="http://antukh.com/blog/2015/02/23/angel/"/>
<updated>2015-02-23T20:19:26+01:00</updated>
<id>http://antukh.com/blog/2015/02/23/angel</id>
<content type="html"><![CDATA[<p>We all know how frustrating account theft could be. Just imagine - you read the news about <a href="http://gadgets.ndtv.com/internet/news/nearly-7-million-dropbox-account-passwords-reportedly-leaked-606494">yesterday’s successful attack</a>
on some service with full database dump published on Pastebin, and you suddenly notice
that your e-mail is listed there too…
What if the news are one week/month/year old, and you didn’t change your password since registration?</p>
<p><img class="center" src="http://antukh.com/images/4_angel.png" width="700" title="image" alt="images"></p>
<!--more-->
<p>With the free <a href="https://aan.sh/angel">Personal CyberAngel</a> service you can minimize the risks - notifications about any mention of your e-mail/Twitter account on Pastebin-like
websites and hacker forums will be sent to you immediately, so you can take prompt action and save a lot of the time you would spend recovering access.
Like a good angel, this one is constantly working and self-improving - the list of leak sources is updated regularly, with short breaks for coffee and blessings.</p>
<p>All my <a href="https://twitter.com/c0rdis">Twitter followers</a> get an account subscription automatically - a private direct message will be
sent should any reference to your account be published. If you would like to monitor your e-mail
address too, or make any suggestions, just let me know privately. Stay safe!</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[One-time Notes]]></title>
<link href="http://antukh.com/blog/2015/02/05/otnotes/"/>
<updated>2015-02-05T08:30:07+01:00</updated>
<id>http://antukh.com/blog/2015/02/05/otnotes</id>
<content type="html"><![CDATA[<p>Always wanted to have my own version of Privnote to be sure of how the data is handled on the server… Finally, <a href="https://aan.sh/otnote">here it is</a>.</p>
<h3>Description:</h3>
<ul>
<li>the connection is secured by HTTPS</li>
<li>the note is encrypted on the client side with <a href="https://keybase.io/triplesec/">Triplesec</a> (Salsa20 + AES + Twofish) using a randomly generated key - the server doesn’t know what’s inside</li>
<li>due to the heavy crypto, it would take >$100k to break a single note</li>
<li>upon successful submission, a URL of the form <strong>{token}#{key}</strong> is generated (see the sketch after this list)</li>
<li>directly accessing the generated URL will show the decrypted note (using the <strong>{key}</strong>)</li>
<li>for security purposes, you may want to send the link without the key, so the receiver has to enter the key manually to decrypt your message</li>
<li>there is only one shot - once the URL is accessed, the note is permanently deleted from the server. Additionally, notes auto-expire 72 hours after creation.</li>
</ul>
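<p>For illustration only, a minimal client-side sketch of the scheme in Python, with Fernet standing in for Triplesec, an in-memory dict standing in for the server-side storage, and an illustrative URL path - none of which is the actual implementation:</p>
<figure class='code'><pre><code class='python'>import secrets
from cryptography.fernet import Fernet

SERVER_STORE = {}   # stand-in for the server: it only ever sees ciphertext

def create_note(plaintext: str) -> str:
    key = Fernet.generate_key()                  # random key, generated client-side
    token = secrets.token_urlsafe(16)
    SERVER_STORE[token] = Fernet(key).encrypt(plaintext.encode())
    # The fragment (#...) is never sent in HTTP requests, so the key stays with the user
    return f"https://aan.sh/otnote/{token}#{key.decode()}"

def read_note(token: str, key: str) -> str:
    ciphertext = SERVER_STORE.pop(token)         # one shot: deleted on first read
    return Fernet(key.encode()).decrypt(ciphertext).decode()</code></pre></figure>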
<p>Hope you’ll find it useful.</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Easy Way to Get KDF (Krypto-Dog Food)]]></title>
<link href="http://antukh.com/blog/2015/01/26/krypto-dog-food/"/>
<updated>2015-01-26T20:57:42+01:00</updated>
<id>http://antukh.com/blog/2015/01/26/krypto-dog-food</id>
<content type="html"><![CDATA[<p>My recent <a href="http://antukh.com/blog/2015/01/17/cryptosocial-network-from-the-inside/">Keybase overview</a> gave me an impulse to read more about KDFs, their implementations and modern applications, which I’m going to present in the following post.</p>
<p><img class="center" src="http://antukh.com/images/2_krypto_dog.jpg" width="333" title="image" alt="images"></p>
<p>A KDF is a Key Derivation Function. As follows from the definition, such a function is used to derive one or more keys from some secret value - the <em>source of initial keying material</em>.
The derived keys can then be used in different ways, such as to encrypt other important data, to build a MAC, or even as-is.
One example of using a KDF is generating session keys during the TLS handshake.</p>
<!--more-->
<p>The main functionality built into KDFs can be described as X-X: the <em>extract-and-expand</em> paradigm.
The <em>extraction</em> step takes non-uniformly random or pseudorandom source keying material as input and, by applying some function, “extracts” a uniform key to operate with as the primary input.
This step can be omitted if the initial keying material is already uniform, but that is often not the case.
The <em>expansion</em> step, in turn, takes the previously generated (pseudo)random key and uses it to seed some function (not just any - see <a href="https://crypto.stanford.edu/~dabo/cs255/lectures/PRP-PRF.pdf">PRPs and PRFs</a>) to produce additional keys - those we expect to be derived.</p>
<p>Based on the provided initial keying material, KDFs are divided into two large groups:</p>
<h3>KDFs based on some source key</h3>
<p>These KDFs take a source key as input, which is assumed to have enough entropy.</p>
<p>The “traditional” KDF scheme operates with perfect source keys (those which do not need the extraction step).
Its additional input parameters are CTX (a context string that depends on the current application) and CTR (a counter), and the scheme is based on simple concatenation of pseudorandom function (secure PRF) outputs.
With such a function one can generate as many bits/keys as needed and just cut off the rest once the goal is achieved.
It can generally be described with the following pseudocode:</p>
<figure class='code'><pre><code class='python'>KDF(SK,CTX,CTR) = K(0) || K(1) || ... || K(CTR)
  where
  K(0) = PRF(SK,(CTX||0))
  K(1) = PRF(SK,(CTX||1))
  ...
  K(CTR) = PRF(SK,(CTX||CTR))</code></pre></figure>
<p>The best known example of “non-perfect-key-input” KDFs is HKDF, or the <a href="https://eprint.iacr.org/2010/264.pdf">HMAC-based Key Derivation Function.</a>
The steps to derive the keys include extraction (the full X-X scheme) and the use of HMAC as the secure PRF in the traditional KDF scheme.
Additionally, each previously derived key is used as input to generate the succeeding one:</p>
<figure class='code'><pre><code class='python'>Extraction : k = HMAC(salt, SK)
Expansion  : HKDF(k,CTX,CTR) = K(0) || K(1) || ... || K(CTR)
  where
  K(0) = HMAC(k, (CTX||0))
  K(1) = HMAC(k, (K(0) || CTX || 1))
  ...
  K(CTR) = HMAC(k, (K(CTR-1) || CTX || CTR))</code></pre></figure>
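<p>A minimal, runnable version of the same extract-and-expand construction (in the spirit of RFC 5869), using only Python’s standard hmac/hashlib modules - a sketch for illustration, not a vetted library:</p>
<figure class='code'><pre><code class='python'>import hashlib
import hmac

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # Extraction: condense possibly non-uniform keying material into a uniform PRK
    return hmac.new(salt, ikm, hashlib.sha256).digest()

def hkdf_expand(prk: bytes, ctx: bytes, length: int) -> bytes:
    # Expansion: chain HMAC blocks, feeding each previous block back in
    okm, block = b"", b""
    for counter in range(1, (length + 31) // 32 + 1):   # SHA-256 yields 32 bytes per block
        block = hmac.new(prk, block + ctx + bytes([counter]), hashlib.sha256).digest()
        okm += block
    return okm[:length]

prk = hkdf_extract(b"some salt", b"source keying material")
okm = hkdf_expand(prk, b"application context", 64)
enc_key, mac_key = okm[:32], okm[32:]                    # key diversification in action</code></pre></figure>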
<p>KDFs of this type are commonly used for key diversification and are not acceptable for password storage.</p>
<h3>Password-Based KDF</h3>
<p>Another group of key derivation functions is based on a user-supplied password as input.
Since passwords do not have sufficient entropy, traditional KDFs as well as HKDF should not be used in this case, as the derived keys will be vulnerable to dictionary attacks.
To deal with the problem and compensate for the weak input, two main PBKDF defenses were developed: the use of a salt and <em>slow hash functions</em>.</p>
<p>The traditional approach to slowing down the calculations is based on an increased number of iterations - a fast hash function runs over and over again until an acceptable latency is reached.
PKCS#5 describes PBKDF as follows:</p>
<figure class='code'><pre><code class='python'>PBKDF1 = H{c}(password||salt) = H(H(H(H(H...(H(password || salt))...))))


PBKDF2 = T(0) || T(1) || ... || T(L)
  where
  L = len(desired_key) / len(PRF_output)
  T(i) = U(0) ^ U(1) ^ ... ^ U(c)
  U(0) = PRF( password, salt || INT_32_BE(i) )
  U(1) = PRF( password, U(0) )
  ...
  U(c) = PRF( password, U(c-1) )</code></pre></figure>
<p>PBKDF1 is considered obsolete and has been replaced by its successor PBKDF2, as it could only produce derived keys of fixed, limited length.
PBKDF2 in turn is used in many popular encryption implementations, including WPA/WPA2 for securing WiFi networks, Mac OS X for user passwords, Android filesystem encryption and many more.
A typical WPA2 setup is based on the HMAC-SHA1 PRF with the network SSID as the salt and an iteration count of c = 4096, and produces a 256-bit key.</p>
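<p>Python’s standard library already ships this one, so the WPA2 case above can be reproduced in a couple of lines (the passphrase and SSID below are made up, of course):</p>
<figure class='code'><pre><code class='python'>import hashlib

passphrase = b"correct horse battery staple"   # made-up WiFi passphrase
ssid = b"HomeNetwork"                          # made-up SSID, used as the salt

# WPA2 PMK: PBKDF2-HMAC-SHA1, 4096 iterations, 256-bit (32-byte) output
pmk = hashlib.pbkdf2_hmac("sha1", passphrase, ssid, 4096, dklen=32)
print(pmk.hex())</code></pre></figure>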
<p>Nevertheless, since these functions require very little RAM, ASIC/GPU brute-force attacks against them are relatively cheap and effective, so more powerful KDFs were needed.
The two most popular implementations of enhanced PBKDFs are bcrypt and scrypt, which are described below.</p>
<h3>bcrypt</h3>
<p>First presented in 1999, this KDF is the default password hashing algorithm in BSD and many other systems.
It is based on Blowfish - a fast block cipher with one notable caveat: key changes are very slow.
Each new key requires pre-processing equivalent to encrypting about 4 kilobytes of text, which is very slow compared to other block ciphers.
Although that might be a problem for small embedded systems (e.g. some smartcards), this property turned into a benefit for PBKDFs: conducting a successful dictionary attack takes much more time.</p>
<p>Simplified algorithm of bcrypt is presented below.</p>
<figure class='code'><pre><code class='c'>EksBlowfishSetup(cost, salt, key)
{
    state = InitState()
    state = ExpandKey(state,salt,key)
    repeat 2^cost:
        state = ExpandKey(state,0,salt)
        state = ExpandKey(state,0,key)
    return state
}


bcrypt(cost, salt, pwd)
{
    state = EksBlowfishSetup(cost,salt,pwd)
    ctext = "OrpheanBeholderScryDoubt"
    repeat 64:
        ctext = EncryptECB(state,ctext)
    return Concatenate(cost,salt,ctext)
}</code></pre></figure>
<p>Compared to the “classic” PBKDF, bcrypt requires a larger (but still fixed, 4KB) amount of RAM and is somewhat stronger against brute-force attacks.
However, while bcrypt does a decent job of making life difficult for a GPU-enhanced attacker, it does little against an <a href="http://www.openwall.com/presentations/Passwords14-Energy-Efficient-Cracking/">FPGA-wielding attacker.</a></p>
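<p>In practice you wouldn’t implement the scheme above yourself; a typical usage sketch with the Python bcrypt package (my choice of binding, not something from the original paper) looks like this:</p>
<figure class='code'><pre><code class='python'>import bcrypt

password = b"hunter2"                              # made-up password

# The rounds parameter is the 2^cost work factor from the pseudocode above
hashed = bcrypt.hashpw(password, bcrypt.gensalt(rounds=12))

# Verification re-runs the same expensive setup and compares the results
assert bcrypt.checkpw(password, hashed)</code></pre></figure>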
<h3>scrypt</h3>
<p>The scrypt key derivation function was originally developed for use in the Tarsnap online backup system.
It can use arbitrarily large amounts of memory and is therefore much more resistant to hardware brute-force attacks than alternative functions such as PBKDF2 or bcrypt.
An amazing concept of <em>sequential memory-hard functions</em> is applied there - functions computed by algorithms that use the largest possible amount of storage and cannot be parallelized effectively.</p>
<p>This can really be considered a breakthrough - by estimation, cracking an scrypt-protected 8-character password within a year is <a href="http://www.tarsnap.com/scrypt/scrypt.pdf">approximately 4400 times more expensive</a>
than cracking one hashed with bcrypt.</p>
<p>Its simplified structure can be described with the following pseudocode:</p>
<figure class='code'><pre><code class='c'>scrypt(password, salt, N, p, r, keyLen)
// N - number of iterations for the slow hash function (CPU cost)
// p - how many blocks are used (parallelization cost)
// r - size of each block (memory cost)
{
    blockLen = 1024*r
    iterCount = 1
    B = PBKDF2(HMAC_SHA256, password, salt, iterCount, p*blockLen)
    for i in range(p):
        B[i] = ROMix(Salsa20(B[i]), N)
    return PBKDF2(HMAC_SHA256, password, B, iterCount, keyLen)
}</code></pre></figure>
<p>Let’s look at the internals.
The large memory requirement of scrypt comes from a large vector of pseudorandom bit strings that is generated as part of the algorithm (the ROMix routine iteratively calls the <em>BlockMix()</em> function, which is described below).
Once the vector is generated, its elements are accessed in a pseudo-random order and combined to produce the derived key.</p>
<figure class='code'><pre><code class='python'>BlockMix( Salsa20, B )
{
    X = inverse(B)
    Y = []
    for bi in B:
        X = Salsa20(X ^ bi)
        Y += X
    return Y[0::2] + Y[1::2]
}</code></pre></figure>
<p>Getting rid of the large memory requirement comes with a significant trade-off in speed.
This sort of time–memory trade-off is common in computer algorithms: you can increase speed at the cost of using more memory, or decrease memory requirements at the cost of performing more operations and taking longer.
The idea behind scrypt is to deliberately make this trade-off costly in either direction.
Thus an attacker could use an implementation that doesn’t require many resources (and can therefore be massively parallelized with limited expense) but runs very slowly, or use an implementation that runs more quickly but
has very large memory requirements and is therefore more expensive to parallelize. The main risk here (as always) is a wrong implementation / poorly chosen parameters, which could reduce its <a href="http://blog.ircmaxell.com/2014/03/why-i-dont-recommend-scrypt.html">comparative benefits to zero.</a></p>
<p>Although scrypt is a somewhat new KDF, it’s already rather common in real-world applications.
The first application was of course Tarsnap (N=2<sup>14</sup>, p=1, r=8) - the secure online backup service of scrypt’s creators.
Probably the most well-known uses are cryptocurrencies - Litecoin (N=2<sup>10</sup>, p=1, r=1), YACoin (N=2<sup>15</sup>, p=1, r=1) and many other altcoins.
I feel I should mention the Keybase application (N=2<sup>15</sup>, p=1, r=8) too, as it was the reason I actually decided to write this article :)</p>
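<p>Since Python 3.6, hashlib exposes scrypt directly (backed by OpenSSL), so deriving a key with, say, the Keybase-style parameters above is a one-liner - again just a sketch with made-up inputs:</p>
<figure class='code'><pre><code class='python'>import hashlib

key = hashlib.scrypt(
    b"user passphrase",        # made-up password
    salt=b"per-user salt",     # made-up salt; use a random per-user value in practice
    n=2**15, r=8, p=1,         # CPU/memory cost, block size, parallelization
    maxmem=64 * 1024 * 1024,   # these parameters need roughly 32 MB of memory
    dklen=32,
)
print(key.hex())</code></pre></figure>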
<h2>Resume</h2>
<p>KDFs are a nice cryptographic concept which, properly implemented, can significantly improve the overall level of security.
It is important to understand when it is better to use KDFs in place of other crypto primitives.
Notable applications of key derivation functions are:</p>
<ol>
<li>Key diversification - obtaining additional keys from a source key</li>
<li>Key stretching / key strengthening - a password hashing approach replacing the common hash functions used for verification</li>
<li>Basis for a system RNG to seed a pseudorandom generator (PRG).</li>
<li>Components of multiparty key-agreement protocols</li>
</ol>
<h2>Q&A section</h2>
<h4>What is the actual difference between hash and KDF?</h4>
<p>First of all, KDFs are more general in their applications, and not all KDFs are meant to replace hashes (for example, HKDF is primarily used for deriving secondary keys).
Compared to password-based KDFs, plain hashes are usually weaker (they do not satisfy the randomness requirements) and, most importantly, are easier to brute-force.
Modern PBKDFs are much slower by design and should therefore be preferred over simple hashing.</p>
<h4>Why do we need additional keys if we already have strong source key?</h4>
<p>Needing additional keys is quite a common scenario - e.g. in TLS (unidirectional keys: MAC key, encryption key, IV…) or in CBC mode (nonces).
If an attacker obtains a derived key, he or she is not able to deduce either the input secret value or any of the other derived keys.</p>
<h4>Why in the world do we need our input to be uniformly random in order to derive additional keys?</h4>
<p>Well, it turns out that if the input-randomness condition is not met, the output might not look random either.
In the context of keys used to secure sessions, an attacker might then be able to anticipate some of the session keys and thereby break the session.</p>
<p>P.S. If you still have any questions, feel free to ask, and I’ll publish answers here.</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Anti-debugging Techniques Cheat Sheet]]></title>
<link href="http://antukh.com/blog/2015/01/19/malware-techniques-cheat-sheet/"/>
<updated>2015-01-19T18:23:45+01:00</updated>
<id>http://antukh.com/blog/2015/01/19/malware-techniques-cheat-sheet</id>
<content type="html"><![CDATA[<p>It’s been quite a while I analyzed malware last time, so I decided to refresh my knowledge and write a short post on common x86 malware anti-debugging techniques.
Techniques here do not include obfuscation like false branches, spaghetti code etc., and present an extract of popular ways to kick debugger’s ass.
Please note: this is not a complete set of techniques and rather “shortcuts” than a guide.
If you’d like to read more in details, I’ve provided links to some great antidbg materials in the end of the post.
Feel free to contact me to complete the list with undescribed technique and/or correct already described ones!</p>
<!--more-->
<h3>Before we start, a little refresher on breakpoints (OllyDbg has been taken as an example, although this holds for most debuggers):</h3>
<ul>
<li>software breakpoints - the original instruction is replaced with 0xCC, raising an interrupt for the debugger to handle</li>
<li>hardware breakpoints - the DR0…DR3 debug registers provided by the processor; when one of them is hit, an INT 1 interrupt is raised by the OS</li>
<li>memory breakpoints - guard pages; an exception handler is called when the specified page is accessed</li>
</ul>
<h2>Common anti-debugging techniques:</h2>
<h3>0) Straight checks for breakpoints</h3>
<ul>
<li><em>Detection of 0xCC bytes.</em> Checks may include comparison to a xor’ed value too, e.g. to 0x99 (0xCC ^ 0x55)</li>
<li><em>Detection of hardware breakpoints.</em> Basically two methods are most common here. The first uses the GetThreadContext/SetThreadContext API and a direct check of the DRs. The second method sets up its own SEH, causes an exception (e.g. xor eax,eax / div eax) and accesses the debug registers directly in the handler, as an offset into the context structure.</li>
<li><em>Detection of guard pages</em> is somewhat rare and based on imitating debugger behavior - i.e. creating a PAGE_GUARD memory page and accessing it, having previously put a return address onto the stack. If STATUS_GUARD_PAGE_VIOLATION occurs, it is assumed that no debugger is attached.</li>
</ul>
<h3>1) API calls</h3>
<ul>
<li><em>IsDebuggerPresent</em> - probably the most well-known technique and one of the easiest to bypass. This API checks a specific flag in the PEB and returns TRUE/FALSE based on the result (a small usage sketch follows this list).</li>
<li><em>CheckRemoteDebuggerPresent</em> - same functionality as the previous one - a simple boolean check, used directly</li>
<li><em>FindWindow</em> - used to detect specific debuggers - for instance, OllyDbg’s window class is named “OLLYDBG” :) Checks for other popular debugger classes include “WinDbgFrameClass”, “ID”, “Zeta Debugger”, “Rock Debugger” and “ObsidianGUI”</li>
<li><em>NtQueryObject</em> - detection is based on “debug objects”. The API queries the list of existing objects and checks the number of handles associated with any existing debug object</li>
<li><em>NtQuerySystemInformation (ZwQuerySystemInformation)</em> - similar to the previous point - checks whether a debug object handle exists and returns true if that is the case</li>
<li><em>NtSetInformationThread (ZwSetInformationThread)</em> - the first anti-debugging API implemented by Windows. The HideThreadFromDebugger class, when passed as an argument, prevents debuggers from receiving events (including breakpoints and program exit) from any thread that has this API called on it.</li>
<li><em>NtContinue</em> and similar functions are used to modify the current context or load a new one in the current thread, which can confuse the debugger.</li>
<li><em>CloseHandle and NtClose</em> - a very cool technique based on the fact that calling ZwClose with an invalid handle generates a STATUS_INVALID_HANDLE exception when the process is being debugged.</li>
<li><em>GenerateConsoleCtrlEvent</em> - event-based detection. One vector is to raise a Ctrl-C signal and check for the EXCEPTION_CTL_C exception (which only occurs if the process is being debugged)</li>
<li><em>OutputDebugString</em> with a valid ASCII string - causes an error when no debugger is present, otherwise passes normally. It can also be used to exploit known weaknesses - for example, OllyDbg had a known format-string handling bug and crashed on input with multiple “%s” specifiers.</li>
<li>it seems this list can be extended ad infinitum…</li>
</ul>
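<p>A minimal sketch of the two simplest API checks from the list above (plain C on Windows); in real protectors these calls are usually resolved dynamically or obfuscated rather than imported directly:</p>
<pre><code>#include &lt;windows.h&gt;
#include &lt;stdio.h&gt;

int main(void)
{
    BOOL remote = FALSE;

    /* PEB-based check for the current process */
    if (IsDebuggerPresent())
        printf("IsDebuggerPresent: debugger detected\n");

    /* same idea, but can also be pointed at another process */
    if (CheckRemoteDebuggerPresent(GetCurrentProcess(), &amp;remote) &amp;&amp; remote)
        printf("CheckRemoteDebuggerPresent: debugger detected\n");

    return 0;
}
</code></pre>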
<h3>2) Flags</h3>
<ul>
<li><em>Trap flag</em> - controls single-step tracing of a program. If it is set, executing an instruction will raise a SINGLE_STEP exception. Example of usage: pushf / mov dword [esp], 0x100 / popf. Another possible scenario is tracing over SS (the stack segment register) - debuggers will not break on the instruction that follows a load of SS (e.g. push ss / pop ss), because the CPU defers the trap by one instruction. The trap flag set for single-stepping therefore cannot be cleared in time, and if it is checked right there (e.g. with a pushf), the debugger is detected.</li>
<li><em>IsDebugged</em> - the BeingDebugged byte in the PEB - this is what IsDebuggerPresent() checks, but it can also be read directly (a sketch of reading these PEB fields follows this list).</li>
<li><em>NtGlobalFlag</em> - another PEB field, at offset 0x68 (x86) / 0xBC (x64). A process created by a debugger has the value 0x70 (FLG_HEAP_ENABLE_TAIL_CHECK | FLG_HEAP_ENABLE_FREE_CHECK | FLG_HEAP_VALIDATE_PARAMETERS) set by default.</li>
<li><em>Heap flags</em> - a check of two fields of the heap structure: “Flags” and “ForceFlags”. The heap location can normally be retrieved via GetProcessHeap() and/or from the PEB. The exact combination of flags depends on the OS version (see the links at the bottom for details).</li>
</ul>
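<p>A minimal sketch of reading these PEB fields directly (32-bit Windows with MSVC intrinsics; the offsets are the classic x86 ones and differ on x64):</p>
<pre><code>#include &lt;windows.h&gt;
#include &lt;intrin.h&gt;

/* On 32-bit Windows fs:[0x30] points to the PEB. */
static unsigned char *get_peb(void)
{
    return (unsigned char *)__readfsdword(0x30);
}

int peb_being_debugged(void)
{
    return get_peb()[0x02];                     /* BeingDebugged byte */
}

int peb_nt_global_flag(void)
{
    DWORD ntGlobalFlag = *(DWORD *)(get_peb() + 0x68);
    /* FLG_HEAP_ENABLE_TAIL_CHECK | FLG_HEAP_ENABLE_FREE_CHECK |
       FLG_HEAP_VALIDATE_PARAMETERS == 0x70 */
    return (ntGlobalFlag &amp; 0x70) == 0x70;
}
</code></pre>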
<h3>3) Timing</h3>
<ul>
<li><em>GetTickCount, GetLocalTime, GetSystemTime, timeGetTime, NtQueryPerformanceCounter</em> - typical timing functions used to measure the time needed to execute some function or instruction sequence. If the difference exceeds a fixed threshold, the process exits.</li>
<li><em>rdtsc</em> - the “Read Time Stamp Counter” asm instruction; the technique is the same as described above (see the sketch below).</li>
</ul>
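<p>A minimal rdtsc sketch (MSVC intrinsic; the threshold below is an arbitrary example and would have to be tuned for the code actually being timed):</p>
<pre><code>#include &lt;intrin.h&gt;
#include &lt;stdlib.h&gt;

void timing_check(void)
{
    unsigned __int64 t1 = __rdtsc();

    /* ... the instructions the protector actually wants to run ... */

    unsigned __int64 t2 = __rdtsc();
    if (t2 - t1 &gt; 0x100000)   /* far too many cycles: likely single-stepped */
        exit(1);
}
</code></pre>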
<h3>4) Checksums</h3>
<p>This method is based on calculating a CRC32 (or similar checksum) over certain blocks or over the whole binary and comparing it to a hardcoded value. If the values differ, dynamic code changes (breakpoints/patches) have been made, and the process usually exits.</p>
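<p>A toy sketch of the idea in plain C: sum the bytes of a protected routine and compare the result to a value computed offline. The routine name, its length and the expected value are hypothetical placeholders; a real implementation would use CRC32 over entire sections.</p>
<pre><code>#include &lt;stdlib.h&gt;

extern void protected_routine(void);       /* hypothetical protected code  */
#define PROTECTED_LEN  0x200               /* hypothetical length in bytes */
#define EXPECTED_SUM   0xDEADBEEFu         /* value precomputed offline    */

void integrity_check(void)
{
    const unsigned char *p = (const unsigned char *)protected_routine;
    unsigned int sum = 0, i;
    for (i = 0; i &lt; PROTECTED_LEN; i++)
        sum = (sum &lt;&lt; 1) + p[i];       /* toy checksum, not a real CRC32 */
    if (sum != EXPECTED_SUM)
        exit(1);                           /* code was patched or 0xCC inserted */
}
</code></pre>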
<h3>5) Self-debug</h3>
<p>There are different approaches to this; probably the most recognized one is to create a new process and call DebugActiveProcess(pid) on the parent process. If the parent is already being debugged, the associated syscall ZwDebugActiveProcess() will fail, making it clear something is wrong :)</p>
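<p>A minimal sketch of that check as seen from the child process (Windows C; how the parent PID is passed to the child is left out):</p>
<pre><code>#include &lt;windows.h&gt;

/* Child side: try to attach to the parent. If the attach fails,
   another debugger is most likely already attached to it. */
int parent_is_debugged(DWORD parentPid)
{
    if (!DebugActiveProcess(parentPid))
        return 1;                          /* attach failed: debugger present */
    DebugActiveProcessStop(parentPid);     /* detach, we only wanted the check */
    return 0;
}
</code></pre>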
<h3>6) Rogue instructions</h3>
<ul>
<li><em>INT3</em> - the classic example (0xCC single-byte form, or the long form 0xCD 0x03), executed under the program’s own exception handler: if a debugger swallows the breakpoint exception, the handler is never reached (see the sketch after this list). As with plain 0xCC scanning, checks may compare against an xor’ed value, e.g. 0x99 (0xCC ^ 0x55).</li>
<li><em>Single-step</em> - an old trick of inserting the 0xF1 opcode to exploit SoftICE’s debugging process by generating a SINGLE_STEP exception.</li>
<li><em>INT 2Dh</em> - a powerful interrupt technique: it raises a breakpoint exception if the process is not being debugged, and execution continues normally if a debugger is present.</li>
<li><em>Stack Segment register</em> - already described in the “Trap flag” section - because debuggers do not single-step through instructions that load SS, it is possible to set the trap flag there and check its value immediately afterwards.</li>
</ul>
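<p>A minimal INT3 sketch (MSVC-specific __try/__except and the __debugbreak() intrinsic): without a debugger the breakpoint exception reaches our own handler; with a debugger attached it is typically consumed before we ever see it.</p>
<pre><code>#include &lt;windows.h&gt;
#include &lt;intrin.h&gt;

int int3_detect(void)
{
    int debugged = 1;                 /* assume debugged until proven otherwise */
    __try {
        __debugbreak();               /* emits an int3 (0xCC)                   */
    }
    __except (EXCEPTION_EXECUTE_HANDLER) {
        debugged = 0;                 /* we received the exception ourselves    */
    }
    return debugged;
}
</code></pre>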
<h3>7) Bonus:</h3>
<p> The best protection against debugging so far seems to be a custom virtual machine.
Effectively, part of the object code is translated into a private bytecode format, which is then executed by a self-written VM.
The only practical way to debug such code is to write an emulator/disassembler for the custom VM instruction format.</p>
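<p>To give a feeling for the approach, here is a toy dispatch loop in C executing an invented bytecode; real protector VMs are of course vastly larger, with encrypted handlers and per-build randomized encodings.</p>
<pre><code>#include &lt;stdio.h&gt;
#include &lt;stdint.h&gt;

/* Invented opcodes - each protected build could use a different encoding. */
enum { OP_PUSH = 0x01, OP_ADD = 0x02, OP_PRINT = 0x03, OP_HALT = 0xFF };

static void run_vm(const uint8_t *code)
{
    int32_t stack[64], *sp = stack;
    size_t pc = 0;
    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH:  *sp++ = (int8_t)code[pc++];   break;
        case OP_ADD:   sp--;  sp[-1] += sp[0];       break;
        case OP_PRINT: printf("%d\n", sp[-1]);       break;
        case OP_HALT:  return;
        }
    }
}

int main(void)
{
    /* "2 + 3" expressed in the private encoding */
    const uint8_t program[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT, OP_HALT };
    run_vm(program);
    return 0;
}
</code></pre>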
<p><br></p>
<hr />
<p>More to read:<br>
<a href="http://pferrie.host22.com/papers/antidebug.pdf">http://pferrie.host22.com/papers/antidebug.pdf</a> <br>
<a href="http://thelegendofrandom.com/blog/archives/2100">http://thelegendofrandom.com/blog/archives/2100</a> <br>
<a href="http://www.symantec.com/connect/articles/windows-anti-debug-reference">http://www.symantec.com/connect/articles/windows-anti-debug-reference</a> <br>
<a href="http://spareclockcycles.org/2012/02/14/stack-necromancy-defeating-debuggers-by-raising-the-dead/">http://spareclockcycles.org/2012/02/14/stack-necromancy-defeating-debuggers-by-raising-the-dead/</a></p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA["Cryptosocial Network" From the Inside]]></title>
<link href="http://antukh.com/blog/2015/01/17/cryptosocial-network-from-the-inside/"/>
<updated>2015-01-17T21:45:57+01:00</updated>
<id>http://antukh.com/blog/2015/01/17/cryptosocial-network-from-the-inside</id>
<content type="html"><![CDATA[<p><em>Disclaimer: all vulnerabilities described here were reported to developers and published with their consent</em></p>
<p>“Get a public key, safely, starting just with someone’s social media username(s).” - this is what you are likely to see if you visit the main page of an ambitious project named <a href="https://keybase.io">Keybase</a>.
A great idea to (finally) bring public-key cryptography to the masses and make its use easy and fun.
The project is in fact a public key directory wrapped in a well-designed social-networking model and tightly bound to the existing social networks themselves.</p>
<p><img class="center" src="http://antukh.com/images/0_header_maria.jpg" width="600" title="image" alt="images"></p>
<!--more-->
<p>How does it work? At a high level, a user registers, uploads a key to the server (or creates a key pair right on the website - which will likely be the common scenario) and then verifies his or her identity via popular social networks and personal websites by placing signed proofs there.
Then, when for some reason the key is no longer valid, the user has to upload a new one and get verified again. To make this mechanism more efficient, the creators implemented “tracking” - a kind of following of a user in order to receive all updates regarding his activity.</p>
<h3>From a cryptographic point of view, the authors implemented an interesting concept called “TripleSec” - three sound crypto algorithms combined into one to make the protected data storage even more secure.</h3>
<p>“TripleSec is a simple, triple-paranoid, symmetric encryption library for Python, Node.js, Go, C#, and the browser. It encrypts data with Salsa 20, AES, and Twofish, so that a compromise of one or two of the ciphers will not expose the secret. Of course, encryption is only part of the story. TripleSec also: derives keys with scrypt to defend against password-cracking and rainbow tables; authenticates with HMAC to protect against adaptive chosen-ciphertext attacks; and in the JavaScript version supplements the native entropy sources for fear they are weak.” (<a href="https://keybase.io/triplesec/">link</a>)</p>
<h3>Currently the project is in beta - only invited people are allowed to join and test it.</h3>
<p>So it was doubly interesting for me to become one of them… Well, and to grab a nice nickname ;)
Surprisingly, a check of tweets mentioning “keybase” showed quite a number of people ready to share an invite - in less than 10 minutes I got one and started crawling around.</p>
<p>I’m not going to publish all the issues which currently exist there - some of them (seem) non-critical for now, some XSS (seem) non-exploitable, and let’s be honest with ourselves,</p>
<p> “<em>Lookup failed in query SELECT hash,val,ctime,type FROM merkle_blocks WHERE hash LIKE ? w/ ["bad%\”]</em>“ is hardly a reason for anyone to write a blog post about it :)</p>
<p>After a short overview, I spotted two major design problems which looked quite serious to me.</p>
<h3>1) User password as a single point of failure</h3>
<p>There have been enough articles on the Internet criticizing the upload of private keys to the server - I’m not going to repeat them here. What is really worrying is how this is presented to end users:</p>
<ul>
<li>Most likely, “normal” people will not be using the terminal and will work in the browser - that’s pretty much fine and that’s what the creators were counting on.</li>
<li>Most likely, “normal” people will generate the key in their browser - to do so, they have to confirm their “passphrase”, which in this case is simply their account password. Sure, no problem.</li>
<li>Most likely, “normal” people will follow the instructions and save the encrypted private key on the server. (Presumably) still ok.</li>
</ul>
<p><img class="center" src="http://antukh.com/images/0_keybase_private.png" width="500" height="350" title="image" alt="images"></p>
<p>The user clicks “Done”, and he’s finished and ready to tweet everybody how cool it is to dive into the world of cryptography.
Did you spot the weak point? <em>The user was never asked to set a passphrase, which by default is the same as his password!</em>
I mean, yes, having several layers of properly implemented encryption might help withstand attacks on one of the algorithms for quite some time.
But in fact, <em>all</em> of those assumptions rest on the user password never being leaked. If that password is obtained by an attacker by any means, the game is pretty much over.
The attacker will not only have access to the social network account (which is bad, but not that critical) but also to the stored secret key, which may have been used in quite a lot of other applications.
No “AnyNumSec” algorithm will keep the user safe from a single stolen account password (DNS? Phishing? Social engineering?) when the passphrase is the same string, which is the case here. My suggestion is at least to separate the account password from the private-key passphrase and/or put additional controls in place.</p>
<h3>2) Golden “backdoorish” session</h3>
<p>Another major concern is a kind of “golden key”. Almost all e-mails sent from the server - reminders to verify a social network account, e-mail change notifications, password reminders, post-registration messages and other service mail - contain a small footer with two links: “Change Mail Settings” and “1-Click Unsubscribe”.</p>
<p><img class="center" src="http://antukh.com/images/0_keybase_change_settings.png" width="600" title="image" alt="images"></p>
<p>Both of them look similar and have parameter “a” included:</p>
<p><em><a href="https://keybase.io/_/user/account?a=lgGS">https://keybase.io/_/user/account?a=lgGS</a>[base64-encoded id, email and several hashes]</em>.</p>
<p>This parameter holds a permanent user session - it is a replacement for the session token itself, and it does not expire unless the e-mail address is changed.
Just to clarify: a persistent session (= persistent access to the user’s account) is transferred in plaintext to the provided e-mail address and stored in plaintext on the server.
Consider an unaware user forwarding one of those service e-mails, unauthorized access to the mailbox, or a simple typo…
Ok, what could one do with it?
Well, it is possible to log in under another account, but the victim still controls it… Or not?</p>
<h3>It turned out that, due to insufficient authorization (client-side checks only), it is possible to take full control of the victim’s account with the following steps:</h3>
<p>0) Copy the victim’s encrypted private key. This step is not mandatory to achieve the goal, but it is still a good one to have ;)</p>
<p>1) Once logged in, an attacker is able to remove the private/public keys without any confirmation. At this moment all the public proofs are gone.</p>
<p>2) It is also possible to generate a new key pair with a chosen passphrase - this is where the auth bypass matters.
Normally, to be able to generate a new pair, one has to know the current password.
An attacker can simply spoof the server’s response and make it look valid - since the check is performed only on the client side, that is enough to proceed.
The two responses (valid/invalid) for the same password-check functionality are shown in the screenshot below.</p>
<p><img class="center" src="http://antukh.com/images/0_keybase_possible_responses.png" width="800" height="350" title="image" alt="images"></p>
<p>3) From this point, the attacker can exploit the fact that passphrase and password are used interchangeably in the application - knowing the correct passphrase (which he has just set - see step 2), he is able to change <em>both</em> the password and the passphrase to any string.</p>
<p>4) Finally, he can also change the e-mail address, rendering the “golden session” useless - and with it the good guy’s chances of taking his account back.</p>
<p>In this scenario the attacker doesn’t break confidentiality and doesn’t reveal the secret key (at most he has saved an encrypted copy for later).
However, from the point of view of the “social” half of “cryptosocial”, losing an account in a social network is still not the most pleasant thing, especially if you already have it set up and working for some time.</p>
<h3>As a conclusion:</h3>
<p>there are certain design defects, however, it doesn’t change the fact of how brilliant the idea itself is.
Besides, it is beta now, so this is exactly the time when good guys could make possibly-next-big-thing safer.
If you want to get an invite, I still have a couple, so drop me a line… And if you’re already registered, feel free to <a href="https://keybase.io/my">track me</a> ;)</p>
]]></content>
</entry>
</feed>