This repository has been archived by the owner on Jun 24, 2024. It is now read-only.
--=--
p0f 2
--=--
"Dr. Jekyll had something to Hyde"
passive OS fingerprinting tool
version 2.0.8
(C) Copyright 2000 - 2006 by Michal Zalewski <lcamtuf@coredump.cx>
Various ports (C) Copyright 2003 - 2006 by:
Michael A. Davis <mike@datanerds.net>
Kirby Kuehl <kkuehl@cisco.com>
Kevin Currie <kcurrie@cisco.com>
Portions contributed by numerous good people - see CREDITS file.
http://lcamtuf.coredump.cx/p0f.shtml
For a book on some interesting passive fingerprinting tips, see:
http://lcamtuf.coredump.cx/silence
*********************************************************************
**** HELP WITH P0F DATABASE: http://lcamtuf.coredump.cx/p0f-help ****
*********************************************************************
-----------
0. Contents
-----------
This document describes the concept and history of p0f, its
command-line options and extensions, and goes into some detail about
its operation, integration with existing solutions, and so on.
Table of contents:
1) What's this, anyway?
2) Why would I want to use it?
3) What's new then?
4) Command-line
5) Active service integration
6) SQL database integration
7) Masquerade detection
8) Fingerprinting accuracy and precision
9) Adding signatures
10) Security
11) Limitations
12) Is it better than other software?
13) Program no work!
14) Appendix A: Links to OS fingerprinting resources
-----------------------
1. What's this, anyway?
-----------------------
The passive OS fingerprinting technique is based on analyzing the
information sent by a remote host while performing usual communication
tasks - such as whenever a remote party visits your webpage, connects to
your MTA - or whenever you connect to a remote system while browsing the
web or performing other routine tasks. In contrast to active fingerprinting
(with tools such as NMAP or Queso), the process of passive fingerprinting
does not generate any additional or unusual traffic, and thus cannot be
detected.
Captured packets contain enough information to identify the remote OS,
thanks to subtle differences between TCP/IP stacks, and sometimes certain
implementation flaws that, although harmless, make certain systems quite
unique. Some additional metrics can be used to gather information about
the configuration of a remote system or even its ISP and network setup.
The name of the fingerprinting technique might be somewhat misleading -
although the act of discovery is indeed passive, p0f can be used for
active testing. It is just that you are not required to send any unusual
or undesirable traffic, and can rely on what you would be getting from
the remote party anyway, in the course of everyday, seemingly innocuous
chatter.
To accomplish the job, p0f equips you with four different detection modes:
- Incoming connection fingerprinting (SYN mode, default) - whenever
you want to know what the guy or gal who connects to you runs,
- Outgoing connection (remote party) fingerprinting (SYN+ACK mode) -
to fingerprint systems you or your users connect to,
- Outgoing connection refused (remote party) fingerprinting (RST+ mode)
- to fingerprint systems that reject your traffic,
- Established connection fingerprinting (stray ACK mode) - to examine
existing sessions without any needless interference.
It is quite difficult to pinpoint who came up with this idea of passive
SYN-based OS fingerprinting, though due credit must be given to Craig Smith,
Peter Grundl, Lance Spitzner, Shok, Johan, Su1d, Savage, Fyodor and other
brave hackers who explored this and related topics in the years 1999 and
2000.
P0f was the first (and I believe remains the best) fully-fledged
implementation of a set of TCP-related passive fingerprinting techniques.
The current version uses a number of detailed metrics, often invented
specifically for p0f, and achieves a very high level of accuracy and detail;
it is designed for hands-free operation over an extended period of time,
and has a number of features to make it easy to integrate it with other
solutions.
Portions of this code are used in several IDS systems, some sniffer
software; p0f is also shipped with several operating systems and
incorporated into an interesting OpenBSD pf hack by Mike Frantzen, that
allows you to filter out or redirect traffic based on the source OS.
There is also a beta patch for Linux netfilter, courtesy of Evgeniy
Polyakov. In short, p0f is a rather well-established piece of software at
this point.
------------------------------
2. Why would I want to use it?
------------------------------
Oh, a number of uses come to mind:
- Profiling / espionage - run on a server, firewall, proxy or router,
p0f can be used to silently gather statistical and profiling information
about your visitors, users, or competitors. P0f also gathers netlink
and distance information suitable for determining remote network
topology, which may serve as a great piece of pre-attack intelligence.
- Active response / policy enforcement - integrated with your server
or firewall, p0f can be used to handle specific OSes in the most
suitable manner and serve most appropriate content; you may also enforce
a specific corporate OS policy, restrict SMTP connections to a set of
systems, etc; with masquerade detection capabilities, p0f can be used
to detect illegal network hook-ups and TOS violations.
- PEN-TEST - in the SYN+ACK, RST+, or stray ACK mode, or when a returning
connection can be triggered on a remote system (HTML-enabled mail with
images, ftp data connection, mail bounce, identd connection, IRC DCC
connection, etc), p0f is an invaluable tool for silent probing of a
subject of such a test.
Masquerade detection in SYN+ACK or RST+ modes can also be used to
test for load balancers and so forth.
- Network troubleshooting - RST+ mode can be used to debug network
connectivity problems you or your visitors encounter.
- Bypassing a firewall - p0f can "see thru" most NAT devices, packet
firewalls, etc. In SYN+ACK mode, it can be used for fingerprinting
over a connection allowed by the firewall, even if other types of
packets are dropped; as such, p0f is the solution when NMAP and
other active tools fail.
- Amusement value is also pretty important. Want to know what this
guy runs? Does he have a DSL, X.25 WAN hookup, or a shoddy SLIP
connection? What's Google crawlbot's uptime?
Of course, "a successful [software] tool is one that was used to do
something undreamed of by its author" ;-)
-------------------
3. What's new then?
-------------------
The original version of p0f was written somewhere in 2000 by Michal
Zalewski (that be me), and later taken over by William Stearns (circa 2001).
The original author still contributed to the code from time to time, and
the version you're holding right now is his sole fault - although I'd like
William to take over further maintenance, if he's interested.
Version 2 is a complete rewrite of the original v1 code. The main reason
for this is to make signatures more flexible, and to implement certain
additional checks for very subtle packet characteristics to improve
fingerprint accuracy. Changes include:
NEW CORE CHECKS:
- Option layout and count check,
- EOL presence and trailing option data [*],
- Unrecognized option handling (TTCP, etc),
- WSS to MSS/MTU correlation checks [*],
- Zero timestamp check,
- Non-zero ACK in initial SYN [*],
- Non-zero "unused" TCP fields [*],
- Non-zero urgent pointer in SYN [*],
- Non-zero second timestamp [*],
- Zero IP ID in initial packet,
- Unusual auxiliary flags,
- Data payload in control packets [*],
- SEQ number equal to ACK number [*],
- Zero SEQ number [*],
- Non-empty IP options.
[*] denotes metrics "invented" for p0f, as far as I am aware. Other
metrics were discussed by certain researchers before, although usually
not implemented anywhere. A detailed discussion of all checks performed
by p0f can be found in the introductory comments in p0f.fp, p0fa.fp
and p0fr.fp.
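To make a few of the checks listed above more concrete, here is a minimal Python sketch. It is not p0f's code: the SynPacket class, its field names and the exact conditions are assumptions made for illustration, and the authoritative description of each check remains the comments in the .fp files. It only shows that most of these metrics boil down to simple predicates over fields of an already parsed initial SYN.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SynPacket:
    """Hypothetical container for fields of a parsed initial SYN packet."""
    seq: int
    ack: int
    urg_ptr: int
    ip_id: int
    payload_len: int
    ip_options_len: int
    ts_val: Optional[int] = None   # first timestamp, None if option absent
    ts_ecr: Optional[int] = None   # second timestamp, None if option absent


def unusual_traits(p: SynPacket) -> List[str]:
    """Return the names of the listed checks that fire for this packet."""
    checks = [
        ("zero timestamp", p.ts_val == 0),
        ("non-zero ACK in initial SYN", p.ack != 0),
        ("non-zero urgent pointer in SYN", p.urg_ptr != 0),
        ("non-zero second timestamp", bool(p.ts_ecr)),
        ("zero IP ID", p.ip_id == 0),
        ("data payload in control packet", p.payload_len > 0),
        ("SEQ equal to ACK", p.seq == p.ack),
        ("zero SEQ", p.seq == 0),
        ("non-empty IP options", p.ip_options_len > 0),
    ]
    return [name for name, fired in checks if fired]
```

For example, a SYN with a zero timestamp and a zero IP ID, but otherwise ordinary fields, yields just those two traits.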
As a matter of fact, some of the metrics were so precise I managed
to find several previously unknown TCP/IP stack bugs :-) See
doc/win-memleak.txt and p0fr.fp for more information.
ENGINE IMPROVEMENTS:
- Major performance boost - no more runtime signature parsing, added
BPF pre-filtering, signature hash lookups. All this to make p0f
suitable for being run on high-throughput devices,
- Advanced masquerade detection for policy enforcement (ISPs,
corporate networks),
- Modulo and wildcard operators for certain TCP/IP parameters to make
it easier to come up with generic last chance signatures for
systems that tweak settings notoriously (think Windows),
- Auto-detection of DF-zeroing firewalls,
- Auto-detection of MSS-tweaking NAT and router devices,
- Media type detection based on MSS, with a database of common
link types,
- Origin network detection based on unusual ToS / precedence bits,
- Ability to detect and skip ECN option when examining flags,
- Better fingerprint file structure and contents - all fingerprints
are rigorously reviewed before being added.
- Generic last-chance signatures to cover general OS characteristics,
- Query mode to enable easy integration with third party software -
p0f caches recent fingerprints and answers queries for src-dst
combinations on a local stream socket in an easy to parse
form,
- Usability features: greppable output option, daemon mode, host
name resolution option, promiscuous mode switch, built-in signature
collision detector, ToS reporting, full packet dumps, pcap dump
output, etc,
- Brand new SYN+ACK, RST+ and stray ACK fingerprinting modes for silent
identifications of systems you connect to the usual way (web
browser, MTA), or even systems you cannot connect to at all;
now also with RST+ACK flag and value validator.
- Fixed WSCALE handling in general, and WSS passing on little-endian,
many other bug-fixes and improvements of the packet parser
(including some sanity checks).
- Fuzzy checks option when no precise matches are found (limited).
- VLAN support.
Sadly, this will break all compatibility with v1 signatures, but it's
well worth it.
---------------
4. Command-line
---------------
P0f is rather easy to use. There's a number of options, but you don't
need to know most of them for normal operation:
p0f [ -f file ] [ -i device ] [ -s file ] [ -o file ] [ -Q socket [ -0 ] ]
[ -w file ] [ -u user ] [ -c size ] [ -T nn ] [ -e nn ]
[ -FNODVUKAXMqxtpdlRL ] [ 'filter rule' ]
-f file - read fingerprints from file; by default, p0f reads signatures
from ./p0f.fp or /etc/p0f/p0f.fp (the latter on Unix systems
only). You can use this to load custom fingerprint data.
Specifying multiple -f values will NOT combine several signature
files together.
-i device - listen on this device; p0f defaults to whatever device libpcap
considers to be the best (and which often isn't). On some newer
systems you might be able to specify 'any' to listen on all
devices, but don't rely on this. Specifying multiple -i values
will NOT cause p0f to listen on several interfaces at once.
-s file - read packets from tcpdump snapshot; this is an alternate
mode of operation, in which p0f reads packets from a pcap
data capture file, instead of a live network. Useful for
forensics (this will parse tcpdump -w output, for example).
You can use Ethereal's text2pcap to convert human-readable
packet traces to pcap files, if needed.
-w file - writes matching packets to a tcpdump snapshot, in addition to
fingerprinting; useful when it is advisable to save copies of
the actual traffic for review.
-o file - write to this logfile. This option is required for -d and
implies -t.
-Q socket - listen on a specified local stream socket (a filesystem object,
for example /var/run/p0f-sock) for queries. One can later send a
packet to this socket with p0f_query structure from p0f-query.h,
and wait for p0f_response. This is a method of integrating p0f
with active services (web server or web scripts, etc). P0f will
still continue to report signatures the usual way - but you can
use -qKU combination to suppress this. Also see -c notes.
A sample query tool (p0fq) is provided in the test/
subdirectory. There is also a trivial perl implementation of
a client available; finally, test/p0fping.c can be used to
check the status of the socket prior to queries.
NOTE: The socket will be created with permissions corresponding
to your current umask. If you want to restrict access to this
interface, use caution.
This option is currently Unix-only.
-0 (In conjunction with -Q) Treat source port 0 in queries as
a wildcard. This is useful when p0f query is constructed
from within a plugin to a program that does not provide
source port information (this holds true for some mail
filters, etc).
Note that some ambiguity is introduced: the response might
not refer to the exact connection the plugin is handling,
which may (seldom) cause misidentification of NATed hosts.
-e ms - packet capture window. On some systems (particularly on older Suns),
the default pcap capture window of 1 ms is insufficient, and p0f
may get no packets. In such a case, adjust this parameter to the
smallest value that results in reliable operation (note that this
might introduce some latency to p0f).
-c size - cache size for -Q and -M options. The default is 128, which
is sane for a system under a moderate network load. Setting it
too high will slow down p0f and may result in some -M false
positives for dial-up nodes, dual-boot systems, etc. Setting it
too low will result in cache misses for -Q option. To choose the
right value, use the number of connections on average per the
interval of time you want to cache, then pass it to p0f with -c.
P0f, when run without -q, also reports average packet ratio
on exit. You can use this to determine the optimal -c setting.
This option has no effect if you use neither -Q nor -M.
-u user - this option forces p0f to chroot to this user's home directory
after reading configuration data and binding to sockets, then to
switch to his UID, GID and supplementary groups.
This is a security feature for the paranoid - when running
p0f in daemon mode, you might want to create a new
unprivileged user with an empty home directory, and limit the
exposure when p0f is compromised. That said, should such a
compromise occur, the attacker will still have a socket he can
use for sniffing some network traffic (better than rm -rf /).
This option is Unix-only.
-N - inhibit guesswork; do not report distances and link media. With
this option, p0f logs only source IP and OS data.
-F - deploy fuzzy matching algorithm if no precise matches are
found (currently applies to TTL only). This option is not
recommended for RST+ mode.
-D - do not report OS details (just genre). This option is useful
if you don't want p0f to elaborate on OS versions and such
(combine with -N).
-U - do not display unknown signatures. Use this option if you want
to keep your log file clean and are not interested in hosts that
are not recognized.
-K - do not display known signatures. This option is useful when you
run p0f recreationally and want to spot UFOs, or in -Q or -M
modes when combined with -U to inhibit all output.
-q - be quiet - do not display banners and keep low profile.
-p - switch card to promiscuous mode; by default, p0f listens
only to packets addressed or routed thru the machine it
runs on. This setting might decrease performance, depending
on your network design and load. On switched networks,
this usually has little or no effect.
Note that promiscuous mode on IP-enabled interfaces can be
detected remotely, and is sometimes not welcome by network
administrators.
-t - add human-readable timestamps to every entry (use multiple
times to change date format, a la tcpdump).
-d - go into daemon mode (detach from current terminal and fork into
background). Requires -o.
-l - outputs data in line-per-record style (easier to grep).
-A - a semi-supported option for SYN+ACK mode. This option will cause
p0f to fingerprint systems you connect to, as opposed to systems
that connect to you (default). With this option, p0f will look
for p0fa.fp file instead of the usual p0f.fp. The usual config
is NOT SUITABLE for this mode.
The SYN+ACK signature database is sort of small at the moment,
but suitable for many uses. Feel free to contribute.
-R - a barely-supported option for RST+ mode. This option will
prompt p0f to fingerprint several different types of traffic,
most importantly "connection refused" and "timeout" messages.
This mode is similar to SYN+ACK (-A), except that the program
will now look for p0fr.fp. The usual config is NOT SUITABLE for
this mode. You may have to familiarize yourself with p0fr.fp
before using it.
-O - absolutely experimental open connection (stray ACK)
fingerprinting mode. In this mode, p0f will attempt to
indiscriminately identify OS on all packets within an already
established connection.
The only use of this mode is to perform an immediate
fingerprinting of an existing session. Because of the sheer
amount of output, you are advised against running p0f in this
mode for extended periods of time.
The program will use p0fo.fp file to read fingerprints. The
usual config is NOT SUITABLE for this mode. Do not use unless
you know what you are doing. NOTE: The p0fo.fp database is very
sparsely populated at the moment.
-r - resolve host names; this mode is MUCH slower and poses some
security risk. Do not use except for interactive runs or
low traffic situations. NOTE: the option ONLY resolves
IP address into a name, and does not perform any checks for
matching reverse DNS. Hence, the name may be spoofed - do not
rely on it without checking twice.
-C - perform collision check on signatures prior to running. This
is an essential option whenever you add new signatures to
.fp files, but is not necessary otherwise.
-L - list all network interfaces. This option is Windows-only.
-x - dump full packet contents; this option is not compatible with
-l and is intended for debugging and packet comparison only.
-X - display packet payload; rarely, control packets we examine
may carry a payload. This is a bug for the default (SYN)
and -A (SYN+ACK) modes, but is (sometimes) acceptable in
-R (RST+) mode.
-M - deploy masquerade detection algorithm. The algorithm looks over
recent (cached) hits and looks for indications of multiple
systems being behind a single gateway. This is useful on routers
and such to detect policy violations. Note that this mode is
somewhat slower due to caching and lookups. Use with caution
(or do not use at all) in modes other than default (SYN).
-T nn - masquerade detection threshold; only meaningful with -M,
sets the threshold for masquerade reporting.
-V - use verbose masquerade detection reporting. This option
describes the status of all indicators, not only an overall
value.
-v - enable support for 802.1Q VLAN tagged frames. Available on
some interfaces; on others, it will result in a BPF error.
The last part, 'filter rule', is a bpf-style filter expression for
incoming packets. It is very useful for excluding or including certain
networks, hosts, or specific packets, in the logfile. See man tcpdump for
more information. A few examples:
'src port ftp-data'
'not dst net 10.0.0.0 mask 255.0.0.0'
'dst port 80 and ( src host 195.117.3.59 or src host 217.8.32.51 )'
The baseline rule is to select only TCP packets with SYN set, no RST, no
ACK, no FIN (SYN, ACK, no RST, no FIN for -A mode; RST, no FIN, no SYN
for -R mode; ACK, no SYN, no RST, no FIN for stray ACK mode). You cannot
make the rule any broader (without cheating ;), the optional filter
expression can only narrow it down.
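The baseline rules above can be restated as a small Python sketch. This is only an illustration of the flag combinations described in the text, not the BPF expression p0f actually installs; the function name and the returned mode labels are made up here. The flag bit values are the standard TCP header bits.

```python
from typing import Optional

# Standard TCP header flag bits.
FIN, SYN, RST, ACK = 0x01, 0x02, 0x04, 0x10


def baseline_mode(flags: int) -> Optional[str]:
    """Name the mode whose baseline rule accepts a packet with these flags.

    Returns None if no mode described above would look at the packet.
    """
    fin, syn, rst, ack = (bool(flags & bit) for bit in (FIN, SYN, RST, ACK))
    if fin:
        return None                    # every rule requires "no FIN"
    if rst:
        return None if syn else "RST+"             # -R: RST, no FIN, no SYN
    if syn:
        return "SYN+ACK" if ack else "SYN"         # -A, or the default mode
    return "stray ACK" if ack else None            # -O: ACK only
```

Note that the four rules are mutually exclusive, which is why a single run of p0f looks at only one class of packets.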
You can also use a companion log report utility for p0f. Simply run
'p0frep' for help.
-----------------------------
5. Active service integration
-----------------------------
In some cases, you want to feed the p0f output to a specific application to
take certain active measures based on the operating system (handle specific
visitors differently, block some unwanted OSes, optimize the content served).
As mentioned earlier, OpenBSD users can simply use the pf OS fingerprinting
implementation, a cool functionality coded by Mike Frantzen and based on
p0f methodology and signature database. This software allows them to
redirect or block OSes any way they want. Linux netfilter users can also
check out patches by Evgeniy Polyakov to get roughly the same stuff.
In other setups, or if you do not feel like fiddling with the kernel,
you want to use the -Q option, and then query p0f by connecting to a
specific local stream socket and sending a single packet with p0f_query
struct (p0f-query.h), and receiving p0f_response. P0f, when running in -Q
mode, will cache a number of last OS matches, and when queried for a
specified host and port combination, will return what it detected.
Check test/p0fq.c for a clean example.
The query structure (p0f_query) has the following fields (all
values, addresses and port numbers are in the machine's native byte order):
magic - must be set to QUERY_MAGIC,
id - query ID, copied literally to the response,
type - query type (must be QTYPE_FINGERPRINT)
src_ad - source address,
dst_ad - destination address,
src_port - source port,
dst_port - destination port.
The response (p0f_response) is as follows:
magic - must be set to QUERY_MAGIC,
id - copied from the query,
type - RESP_OK, RESP_BADQUERY (error), RESP_NOMATCH (cache miss),
genre[20] - OS genre, zero length if no match,
detail[40] - OS version, zero length if no match,
dist - distance, -1 if unknown,
link[30] - link type description, zero length if unknown,
tos[30] - ToS information, zero length if unknown,
fw,nat - firewall and NAT flags, if spotted,
real - "real" OS versus userland stack,
score - masquerade score (or NO_SCORE), see next section,
mflags - exact masquerade flags (D_*), see next section.
There's also a special type of queries, where type = QTYPE_STATUS,
and subsequent fields are irrelevant (should be zero); this returns
a different structure:
magic - must be set to QUERY_MAGIC,
id - copied from the query
type - must be set to RESP_STATUS (or RESP_BADQUERY on error)
version[16] - p0f version
mode - p0f mode (ASCII character, same as in command-line options)
fp_cksum - checksum of the fingerprint file for versioning purposes
cache - cache size
packets - total number of packets analyzed
matched - total number of OSes recognized
queries - total number of queries handled
cmisses - cache misses (for cache size debugging)
uptime - process uptime in seconds
The connection is one-shot. Always send the query and recv the
response immediately after connect - p0f handles the connection in
a single thread, and you are blocking other applications until you are
done (or until the timeout, defined as two seconds in config.h, expires).
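The one-shot exchange described above might be sketched in Python as follows. Treat this strictly as an illustration: the struct layout simply follows the field order listed earlier with guessed field widths, and QUERY_MAGIC and QTYPE_FINGERPRINT are placeholder values. The real layout and constants must be taken from p0f-query.h, and test/p0fq.c remains the reference client. The helper names are hypothetical.

```python
import socket
import struct

# ASSUMPTIONS, for illustration only: placeholder constants and a guessed
# layout (native byte order and alignment). Check p0f-query.h before use.
QUERY_MAGIC = 0x12345678        # placeholder, NOT the real magic value
QTYPE_FINGERPRINT = 1           # placeholder
QUERY_FMT = "@IIBIIHH"          # magic, id, type, src_ad, dst_ad, ports


def build_query(query_id, src_ad, dst_ad, src_port, dst_port):
    """Pack one fingerprint query; addresses are passed as 32-bit integers
    already in whatever representation the header file expects."""
    return struct.pack(QUERY_FMT, QUERY_MAGIC, query_id, QTYPE_FINGERPRINT,
                       src_ad, dst_ad, src_port, dst_port)


def query_once(sock_path, payload, timeout=2.0, max_reply=4096):
    """Connect, send a single query, read a single reply, disconnect."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)       # do not hang if p0f is busy or gone
        s.connect(sock_path)
        s.sendall(payload)          # send right after connect
        return s.recv(max_reply)    # raw p0f_response bytes, not decoded
```

A caller would do something like query_once("/var/run/p0f-sock", build_query(1, src, dst, sport, dport)) and then decode the reply according to p0f_response. The client-side timeout of two seconds is a choice made here to mirror the server-side value mentioned above.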
As of today, there is no way to integrate p0f with other programs
as a packet-parsing library. It would be trivial to implement this,
but there are no volunteers at the moment :-)
---------------------------
6. SQL database integration
---------------------------
At the moment, p0f does not feature built-in database connectivity,
although I am looking for a willing contributor to take care of it.
In the meantime, however, you may use p0f_db utility authored by
Nerijus Krukauskas:
http://nk.puslapiai.lt/projects/p0f_db/
Jonas Eckerman has some tools to make it easier to move p0f output
from one system to another, and then to run basic visualization:
http://whatever.frukt.org/p0f-stats.shtml
-----------------------
7. Masquerade detection
-----------------------
Masquerade detection (-M) works by looking at the following factors for
all known signatures that belong to real operating systems (and not
userland tools such as scanners):
- Differences in OS fingerprints for the same IP:
-3 if the same OS
+4 if different signature for the same OS genre
+6 if different OS genres
- NAT and firewall flags set:
+4 if NAT flags differ for the same signature
+4 if fw flags differ for the same signature
+1 per each NAT and fw flag if signatures differ (max. 4)
- Link type differences:
+4 if media type differs
- Distance differences:
+1 if host distance differs
- Timestamp scoring, if timestamps available:
-1 if timestamp delta within MAX_TIMEDIF (config.h)
+1 if timestamp delta past MAX_TIMEDIF
+2 if timestamp delta negative (!)
- Time from the previous occurrence:
/2 if more than half the cache size to the previous occurrence
The final score is computed as score * 200 / 25 (25 being the highest
score possible) and reported as a percentage.
The higher the value, the more likely the result is accurate. Since
the situation when all indicators are up is rather unrealistic, the
multiplier is 200, not 100, and you can get over 100% match ;-)
Everything above 0% should be looked at, over 20% is usually a sure
bet.
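As a worked example of the arithmetic above, here is a simplified Python sketch that scores one pair of cached hits for the same IP and converts the result to a percentage. It is one reading of the list of factors, not p0f's implementation: the Hit class, its field names and the way the timestamp delta is supplied are assumptions, and MAX_TIMEDIF has to come from config.h, so it is a parameter here.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical summary of one cached fingerprint match."""
    sig: str      # identifier of the matched signature
    genre: str    # OS genre
    nat: bool     # NAT flag spotted
    fw: bool      # firewall flag spotted
    link: str     # link (media) type
    dist: int     # host distance


def pair_score(prev, cur, ts_delta=None, max_timedif=None, far=False):
    """Score two hits from the same IP using the factors listed above.

    ts_delta is None when timestamps are unavailable; max_timedif must be
    given whenever ts_delta is. far means the previous occurrence is more
    than half the cache size away.
    """
    score = 0
    if prev.sig == cur.sig:
        score -= 3                                   # same OS
        score += 4 if prev.nat != cur.nat else 0     # NAT flags differ
        score += 4 if prev.fw != cur.fw else 0       # fw flags differ
    else:
        score += 4 if prev.genre == cur.genre else 6
        # +1 per NAT / fw flag set on either hit (at most 4)
        score += sum([prev.nat, cur.nat, prev.fw, cur.fw])
    if prev.link != cur.link:
        score += 4
    if prev.dist != cur.dist:
        score += 1
    if ts_delta is not None:
        if ts_delta < 0:
            score += 2
        elif ts_delta > max_timedif:
            score += 1
        else:
            score -= 1
    return score / 2 if far else score


def as_percent(score):
    """Scale a raw score the way the text describes: score * 200 / 25."""
    return score * 200 / 25
```

With two hits of different genres, one of them carrying a NAT flag and everything else equal, the raw score is 6 + 1 = 7, which is reported as 7 * 200 / 25 = 56%.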
You can configure the reporting of matches by setting the threshold
to a value different than zero with -T switch. -T 10 might be a good
idea. If you're looking at a local network, you can define
DIST_EXTRASCORE to score distance differences much higher - it is
unlikely for a local LAN to shrink or grow, but it's not uncommon for
routing over the Internet to change. If you are unhappy with the
scoring algorithm and do not want to modify the sources, you can use
-V option to report the status of every masquerade indicator. In
conjunction with -l, -V can be used to grep for the precise set
of signatures you're interested in.
Every hit is prefixed with ">> ". Combine -M, -K and -U to report
masquerade hits only (but it is recommended to still dump packets
with -w to be able to examine the evidence later on). A good
example:
p0f -M -K -U -w evidence.bin -c 500 -l -V 'not src host my_ip'
A quick demo:
192.165.38.73:20908 - OpenBSD 3.0-3.4 (up: 836 hrs)
-> 217.8.32.51:80 (distance 6, link: GPRS or FreeS/WAN)
192.165.38.73:21154 - Linux 2.4/2.6 (NAT!) (up: 173 hrs)
-> 217.8.32.51:80 (distance 6, link: GPRS or FreeS/WAN)
192.165.38.73:22003 - Windows XP Pro SP1, 2000 SP3 (NAT!)
-> 217.8.32.51:80 (distance 6, link: GPRS or FreeS/WAN)
>> Masquerade at 192.165.38.73: indicators at 69%.
That was quite evident.
194.68.64.2:49030 - Windows 2000 SP2+, XP SP1
-> 217.8.32.51:80 (distance 10, link: ethernet/modem)
194.68.64.2:52942 - Windows 2000 SP4, XP SP1, patched 98
-> 217.8.32.51:80 (distance 12, link: ethernet/modem)
>> Masquerade at 194.68.64.2: indicators at 43%.
The host has a name of gateway.vlt.se, so once again, a good hit.
Verbose output looks like this:
>> Masquerade at 216.88.158.142/crawlers.looksmart.com: indicators at 26%.
Flags: OS -far
In this case, we have two different OSes (OS), but the time between two
occurrences is long enough to lower the score (-far). All -V flags are:
OS - different OS genres
VER - different OS versions
LINK - link type difference
DIST - distance differences
xNAT - NAT flags differ (same OS match)
xFW - FW flags differ (same OS match)
NAT1, NAT2 - NAT flags set (different OSes)
FW1, FW2 - FW flags set (different OSes)
FAST - timestamp delta too high
TNEG - timestamp delta negative
-time - timestamp delta within the norm
-far - distant occurrences
Because the score is cumulative, it is possible to have mutually exclusive
flags set (e.g. xNAT and NAT1) whenever more than two signatures were taken
into account when calculating the score.
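If you want to post-process these reports, a minimal Python sketch of a parser could look like this. It assumes the report lines look exactly like the samples shown above (a ">> Masquerade at" line, optionally followed by a "Flags:" line in verbose mode); the function name and the returned dictionary keys are made up for illustration, and other output options may change the layout.

```python
import re

# Matches the report line as it appears in the samples above; the host part
# is either "ip" or "ip/name".
MASQ_RE = re.compile(
    r"^>> Masquerade at (?P<host>\S+): indicators at (?P<pct>-?\d+)%")


def parse_masquerade(lines):
    """Collect masquerade reports and any -V flags that follow them."""
    hits = []
    for raw in lines:
        line = raw.strip()
        m = MASQ_RE.match(line)
        if m:
            addr, _, name = m.group("host").partition("/")
            hits.append({"addr": addr,
                         "name": name or None,
                         "percent": int(m.group("pct")),
                         "flags": []})
        elif line.startswith("Flags:") and hits:
            # Attach verbose indicators to the most recent report.
            hits[-1]["flags"] = line.split()[1:]
    return hits
```

Fed the verbose sample above, this returns one entry for 216.88.158.142 with a score of 26 and the flags OS and -far.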
Masquerade status and flags can also be retrieved via the query interface,
as noted in the section above.
The functionality depends on keeping the fingerprint database clean and
prefixing non-OS fingerprints (nmap, other scanner tools,
application-induced TCP/IP stack behavior) with - prefix. Those
fingerprints, as well as all the UNKNOWNs, are not used for masquerade
detection.
Note that a single host can be reported many times. The system reports
immediately, but later on, the host might score higher once new data
arrives, and p0f will post a "correction" with a new, higher ranking.
Use the highest result for a specific host, but also observe the
consistency of subsequent results.
The solution uses the same cyclic buffer as -Q mode (its size is set with
the -c parameter). Set the value so that the cache holds no more than an
hour of traffic, and no less than a minute. Work out the average number
of connections you see over the interval you wish to cache, then pass
that value to p0f with -c.
Setting -c too high will result in false positives for dial-up nodes or
multiboot systems (of course, you sometimes want to detect the latter,
too). Setting it too low may miss some cases.
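The -c arithmetic described above is simple enough to sketch in Python.
The function name cache_size and the sample traffic rate are assumptions
made for illustration only; the one-minute and one-hour bounds come from
the recommendation above.

```python
import math


def cache_size(connections_per_second, cache_seconds):
    """Estimate a value for -c: connections expected in the cached window.

    Rejects windows outside the one minute to one hour range suggested
    in this README.
    """
    if not 60 <= cache_seconds <= 3600:
        raise ValueError("cache window should be between 60 and 3600 seconds")
    return max(1, math.ceil(connections_per_second * cache_seconds))


# Hypothetical site: 2.5 new connections per second, 10 minute window.
print(cache_size(2.5, 600))
```

The result would then be passed to p0f as the argument of -c, alongside -M.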
The code detects NAT devices that do not rewrite packets (almost
all packet firewalls). Ones that do rewrite packets (proxy firewalls)
can, on the other hand, be detected by their own signatures.
Masquerade detection will fail if all systems masqueraded have an
identical configuration and network setup, uptimes and network usage
(which is very unlikely, even in a homogeneous environment). A
prerequisite for detection is that the systems are used at (roughly) the
same time, within the cache time frame.
NOTE: The detector is most reliable and sensitive in the default (SYN) mode,
and scores are adjusted to work well there; in other fingerprinting modes,
your mileage may vary. You can try to combine -M with -A (masquerade
detection on systems you connect to), which is only really useful for
detecting load balancers and other setups that map a single address to
several servers; or with -R, which can be used both for detecting load
balancers (RST) and normal incoming masquerade detection (RST+ACK),
although it's naturally less reliable and sensitive. Using -M with -O is
weird, but regrettably not prosecuted.
----------------------------------------
8. Fingerprinting accuracy and precision
----------------------------------------
Version 2 uses some more interesting TCP/IP packet metrics, and should
be inherently more accurate and precise. We also try to use common sense
when adding and importing signatures, which should be a great
reliability boost. More obscure modes, such as RST+ or stray ACK, may
and will be inherently less accurate or reliable - see section 11 for
more details - but are still far more sane than p0f v1.
Link type identification is not particularly reliable, as some users tend
to mess with their default MTUs for better (or worse ;-) performance.
For most systems, it will be accurate, but if you see an unlikely value
reported, just deal with it.
Uptime detection is also mostly of amusement value. Some newly released
systems tend to multiply timestamp data by 10 or use other clocking
algorithms. The current version of p0f does not account for those
differences across the entire database. I will try to fix it; until then,
those boxes will show an artificially high uptime.
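To see why a faster timestamp clock inflates the reported uptime, here is
a small Python sketch. It assumes uptime is estimated by dividing the TCP
timestamp value by a fixed tick rate of 100 per second; that baseline rate
and the function name uptime_hours are assumptions for illustration, not a
description of the actual p0f code.

```python
def uptime_hours(tsval, ticks_per_second=100):
    """Estimate uptime in whole hours from a TCP timestamp value.

    Assumes the timestamp counter started at zero at boot and advances
    at ticks_per_second.
    """
    return tsval // (ticks_per_second * 3600)


# A box that has been up for 48 hours, but ticks 1000 times per second.
tsval = 48 * 3600 * 1000

print(uptime_hours(tsval))        # estimate with the assumed 100 Hz rate
print(uptime_hours(tsval, 1000))  # estimate with the real tick rate
```

With the wrong rate, the estimate comes out ten times too high, which is
the effect described above.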
NAT detection is merely an indication of MSS being tweaked at some point.
Most likely, the reason for this is indeed a NATing router, but there
are some other explanations. Linux, for example, tends to mix up MTUs
from different interfaces in certain scenarios (when, I'm not sure, but
it's common and is probably a bug), and if you see a Linux box tagged as
"NAT", it does not have to be NATed - it might simply have two network
interfaces. P0f can still be a useful NAT detection tool (you can examine
changing distances and OS matches for a specific host, too), simply don't
rely on this flag alone.
If you see link type identified as unknown-XXXX, try to Google for
"mtu XXXX". If you find something reasonable, you might want to update
mtu.h and recompile p0f, and submit this information to me. Keep in
mind some MTU settings are just arbitrary and do not have to mean a
thing.
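As a rough illustration of where the XXXX comes from, the Python sketch
below derives an MTU from the MSS and looks it up in a table. It assumes
the MTU is the MSS plus 40 bytes (20 for the IPv4 header, 20 for the TCP
header), and the table holds only the one entry visible in the sample
output in this README; the real list lives in mtu.h and is not reproduced
here. The name link_type is hypothetical.

```python
# Assumption: MSS excludes a 20-byte IPv4 header and a 20-byte TCP header.
HEADER_OVERHEAD = 40

# Only this entry is taken from the sample output in this README
# ("link: ethernet/modem"); everything else is left out on purpose.
KNOWN_LINKS = {1500: "ethernet/modem"}


def link_type(mss):
    """Return a link label for an MSS, or unknown-XXXX for unlisted MTUs."""
    mtu = mss + HEADER_OVERHEAD
    return KNOWN_LINKS.get(mtu, "unknown-%d" % mtu)


print(link_type(1460))
print(link_type(1400))
```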
P0f also tries to recognize some less popular combinations of precedence
bits, type of service and the so-called "must be zero" bit in IP headers to
detect certain origin ISPs. Many DSL and cable operators, particularly
in Europe, tend to configure their routers in fairly unique ways in
this regard. This, again, is purely of amusement value. See tos.h
for more information.
P0f will never be as precise as NMAP, simply because it has to rely
on what the host sends by itself, and can't check how it responds to
"invalid" or tweaked packets. On the other hand, in the times of
omnipresent personal and not quite personal firewalls and such,
p0f can often help where NMAP is confused.
Just like with any fingerprinting utility, active or passive, it is
possible to change TCP/IP stack settings to either avoid identification,
or appear as some other system - although some of the changes might
require kernel-space hacking. There are no publicly available
anti-p0f tools yet, although I expect them to appear at some point.
--------------------
9. Adding signatures
--------------------
To avoid decreasing reliability of the database, you MUST read the
information provided at the beginning of p0f.fp carefully before touching
it in any way! If you are fiddling with p0fa.fp, p0fr.fp or p0fo.fp, read
all comments in those files IN ADDITION to the contents of p0f.fp. Those
files provide a good technical primer, and document the format and
subtleties of all the fingerprints.
If you stumble upon a new signature, do consider submitting it to
lcamtuf@coredump.cx, wstearns@pobox.com, or connecting from the system to
http://lcamtuf.coredump.cx/p0f-help/. We will be happy to incorporate
this signature in the official release, and can help you make your
signature more accurate. The less popular the system is, the more valuable
the signature; we have the mainstream covered quite well.
Be sure to run p0f -C after making any additions. This will run a collision
checker and warn about shadowed or possibly incorrect signatures. This
happens more often than you'd think. The same applies to p0fa.fp, p0fr.fp
and p0fo.fp files. You need to run p0f -A -C, p0f -R -C or p0f -O -C to
verify their contents.
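If you edit several fingerprint files at once, the checks above can be
scripted. This Python sketch only pairs each file with the command line
given above; the helper names are hypothetical, it assumes p0f is in your
PATH, and it makes no claim about what exit code p0f -C returns, so read
the printed warnings yourself.

```python
import subprocess

# File to collision-check command, exactly as listed in this README.
CHECKS = {
    "p0f.fp": ["p0f", "-C"],
    "p0fa.fp": ["p0f", "-A", "-C"],
    "p0fr.fp": ["p0f", "-R", "-C"],
    "p0fo.fp": ["p0f", "-O", "-C"],
}


def check_command(fp_file):
    """Return the collision-check command line for a fingerprint file."""
    return list(CHECKS[fp_file])


def run_check(fp_file):
    """Run the collision checker; its output goes straight to the terminal."""
    return subprocess.run(check_command(fp_file), check=False).returncode


print(" ".join(check_command("p0fr.fp")))
```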
Rest assured, you will sooner or later find something really surprising. You
can look at tmp/ to see a current list of mysteries I've stumbled upon. The
museum at http://lcamtuf.coredump.cx/mobp/ lists some other funky cases.
By all means, I'd like to hear about other UFO sightings!
------------
10. Security
------------
Running p0f as a daemon should pose a fairly low risk, compared to tcpdump
or other elaborate packet parsers (Ettercap, Ethereal, etc). P0f does not
attempt anything stupid, such as parsing tricky high-level protocols. There
is a slight risk I screwed up something with the option parser or such, but
this code should be very easy to audit. If you do not feel too comfortable,
you can always use the -u option, which should mitigate the risk.
General security precautions for operating p0f:
- Do not make p0f setuid, setgid or otherwise privileged when the caller
isn't. Running it via sudo for users you do not trust entirely is also a
so-so idea.
- Do not use -r option unless absolutely necessary, and only for short
and supervised runs. The option pulls in bloated, potentially flawed
libc DNS handling code, and has a DoS potential.
- When running in -Q mode, you need to make sure, either by setting umask or
calling chmod/chown after launching p0f, to set correct permissions on the
query socket - that is, unless you don't see a problem with your users
querying p0f, which isn't a great threat to humanity.
- Do not use world-writable directories for keeping the socket. Do not
use world-writable directories for output files or configuration. Come
to think of it, don't use world-writable directories for any purpose.
- Don't panic.
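The socket-related precautions above can be checked from a small wrapper
script. The following Python sketch assumes a POSIX system; the function
names and the 0600 mode are illustrative choices, not something p0f does
or requires.

```python
import os
import stat


def restrict_socket(path, mode=0o600):
    """chmod the query socket after p0f has created it; return the new mode."""
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)


def is_world_writable(directory):
    """True if anyone on the system may create files in the directory."""
    return bool(os.stat(directory).st_mode & stat.S_IWOTH)
```

You would call is_world_writable() on the directory you plan to keep the
socket in before starting p0f, and restrict_socket() on the socket path
once p0f -Q is running. Setting umask before launching p0f is the
alternative mentioned above, and avoids the short window in which the
socket exists with default permissions.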
---------------
11. Limitations
---------------
There are several generic and some specific limitations as to what
passive fingerprinting and p0f can achieve.
Proxy firewalls and other high-level proxy devices are not transparent
to any TCP-level fingerprinting software. The device itself will be
fingerprinted, not actual source hosts. There is some software that
lets you perform application fingerprinting; this isn't it.
Some packet firewalls configured to normalize outgoing traffic (OpenBSD pf
with "scrub" enabled, for example) will, well, normalize packets. Those
signatures will not correspond to the originating system, and probably not
quite to the firewall either. Checkpoint firewall, in a fairly lame attempt
to defeat OS fingerprinting, tweaks IP ID and TTL on outgoing packets; if
you want to work around this problem, run p0f with -F option.
In default mode, in order to obtain the information required for
fingerprinting, you have to receive at least one SYN packet initiating a
TCP connection to your machine or network. Note: you don't have to respond
to this particular SYN, and it's perfectly fine to respond with RST.
For SYN+ACK fingerprinting, you must be able to connect to at least one
open port on the target machine to actually get a SYN+ACK packet. You
do not need any other ports, or the ability to send awkward, multiple
or otherwise suspicious packets to the remote host (unlike with NMAP).
Also note that SYN+ACK fingerprints are somewhat affected by the initial
SYN on some systems.
If you cannot establish a connection, but the remote party at least
sends you RST+ACK back ("Connection refused"), you can use RST+ mode of
p0f (-R option), but be aware this mode is inherently less accurate
and reliable, mostly because systems usually don't bother with
putting any options in those packets, and they all look very similar.
SYN+ACK fingerprinting is considered (by me) to be less accurate and
sometimes dependent on the system that initiates the connection. Same goes
for (again, experimental!) stray ACK fingerprinting. RST+ fingerprinting
mode, on the other hand, is fairly reliable, but far less precise.
This is why I put stress on developing the SYN fingerprinting capability -
but SYN+ACK, RST+ and stray ACK database contributions and tricks are
of course very welcome.
Fingerprinting on a fully established (existing) TCP connection is now
supported by p0f (since version 2.0.5), but the database contains very
few entries, and the accuracy and applicability of this mode are not yet
well established. Be prepared for this mode to produce excessive amounts
of logs.
What I'll be trying to do is to integrate a number of fingerprinting
techniques, currently completely separate (SYN, SYN+ACK, ACK, FIN,
RST, retransmission timing, etc) into a single solution for very
high accuracy. But this is perhaps p0f 3.0.
-------------------------------------
12. Is it better than other software?
-------------------------------------
Depends on what you need. As I said before, p0f is fast, lightweight and
low-profile. It can be integrated with other services. It has clean and
simple code, runs as a single thread and uses very little CPU power, works
on a number of systems (Linux, BSD, Solaris and probably others), and has a
pretty detailed and accurate fingerprint database. Quite frankly, I
doubt there is a program that offers better overall functionality or
accuracy when it comes to passive fingerprinting, but I would not be
surprised to be proved wrong one day. In other words, feel free to
explore alternatives.
Of the ones I know... is it better than Siphon? Yes. Ettercap? Yes,
version 2 is better than v1-derived fingerprinting in Ettercap. Besides,
it's simply different, and intended for a different range of applications.
Version 1 of p0f did implement many novel fingerprinting metrics that were
later incorporated in other software, but so did version 2 - and others are
yet to catch up.
As to other "current" utilities, you can use masqdet by Wojtek Kaniewski
as an alternative to p0f -M mode. On the web, you can also stumble upon
"n0t" and "natdet" utilities authored by a guy going by the nickname
r3b00t, but these are just dumbed-down and inherently less reliable rip-offs
closely inspired by p0f code. Your mileage may vary, but I recommend
that you avoid them: they won't work any better.
--------------------
13. Program no work!
--------------------
Whoops. We apologize. P0f requires the following to compile and
run fine:
- libpcap 0.4 or newer
- GNU C Compiler 2.7.x or newer
- GNU make 3.7x or newer, or BSD make
- GNU bash / awk / grep / sed / textutils (for p0frep only)
For the Windows port requirements and instructions, please read
INSTALL.Win32 file.
Not every platform is supported by p0f, and compilation problems do
happen. Please let us know if you have any problems (or, better yet,
have managed to find a solution).
If you find a system that is either not recognized, or is fingerprinted
incorrectly, please do not downplay this; let us know.
Platforms known to be working fine (regression tests not done on
a regular basis, though):
- NetBSD
- FreeBSD
- OpenBSD
- MacOS X
- Linux (2.0 and up)
- Solaris (2.6 and up)
- Windows (see INSTALL.Win32)
- AIX (you need precompiled BULL libpcap)
If p0f compiles and runs, but displays "unknown datalink" or
"bad header_len" warnings, it is likely that your network interface type
is not (yet) recognized. Let us know; it is easy to fix that once and
for all users.
----------------------------------------
14. Links to OS fingerprinting resources
----------------------------------------
Recommended RFC reading:
http://www.faqs.org/rfcs/rfc793.html - TCP specification
http://www.faqs.org/rfcs/rfc1122.html - Internet host requirements
http://www.faqs.org/rfcs/rfc1323.html - TCP performance extensions
http://www.faqs.org/rfcs/rfc1644.html - T/TCP extensions
http://www.faqs.org/rfcs/rfc2018.html - TCP selective ACK
Practical information:
Active ICMP fingerprinting:
http://www.sys-security.com/html/papers.html
Passive OS fingerprinting basics:
http://project.honeynet.org/papers/finger/
http://www.linuxjournal.com/article.php?sid=4750
THC Amap, application fingerprinting:
http://www.thc.org/releases.php
Hmap, web server fingerprinting:
http://wwwcsif.cs.ucdavis.edu/~leed/hmap/
Fyodor's NMAP, the active fingerprinter:
http://www.nmap.org
User-Agent information:
http://www.siteware.ch/webresources/useragents/db.html