Measurement and Diagnosis of Address Misconfigured P2P Traffic
Zhichun Li, Anup Goyal, Yan Chen and Aleksandar Kuzmanovic
Lab for Internet and Security Technology (LIST), Northwestern Univ.

What is P2P address misconfiguration?
• Thousands of peers send P2P file download requests to a “random” target on the Internet, even one that is not in the P2P system
[Figure: peers sending address-misconfigured P2P traffic to a “random” target on the Internet]

Motivations
• P2P file sharing accounted for > 60% of traffic in the USA and > 80% in Asia
• P2P software DC++ has already been exploited by attackers for DoS
  – Directing gigabits of “junk” data per second at a victim host from more than 150,000 peers
• End user perspective
  – Innocent users are unknowingly involved in DDoS attacks
  – Anti-P2P arms race
  – Download performance
• ISP perspective
  – Reduce unwanted traffic for a “green” Internet
  – Contacted by an ISP in Canada
• P2P developer perspective
  – Identify the buggy software among a large number of variants
  – Help design more robust P2P software

Outline
• Motivation
• Passive measurement results
• P2PScope system design
• Root cause diagnosis and analysis
• Conclusion

Passive Measurement
• Honeynet/honeyfarm datasets

              LBL          NU           GQ
  Sensor      5 /24        10 /24       4 /16
  Traces      901GB        916GB        49GB
  Duration    47 months    16 months    26 days

• Events: # of unique sources > 100 in 6 hours
  – Scan traffic removal
  – Target identification
  – Event time window extraction

Measurement Results
• Event characteristics:
  – Usually involve thousands of peers on average
  – Duration: a few hours up to a month

  System        LBL    NU
  eMule         143    416
  BitTorrent     74    211
  Gnutella        4      3
  Soribada        6      0
  Xunlei         12      0
  VAgaa           1      1

Popularity
• 39% of Internet background radiation!
• Growing trend:
[Figure: #connections (M) per year, 2004 to 2007. The total number of connections that match the P2P signatures.]
• IP space: observed in three sensors in five different /8 IP prefixes

Further Diagnosis
• Problems with passive measurement on archived data
  – Events are already over
  – Hard to backtrack the propagation
  – Root cause?
• Need a real-time backtracking and diagnosis system!

Design of P2PScope System
• Components: P2P-enabled Honeynet, Backtracking system, Root cause inference

Design of P2PScope System: P2P-enabled Honeynet
• P2P payload signature based responder
• Event identification
• Protocol parsing for metadata (e.g., infohash; ‘abc.avi’)

Design of P2PScope System: Backtracking system
• Index server (tracker) crawling: BT: top 100, eMule: 185
• Peer Exchange Protocol crawling
• DHT crawling
[Figure: a local crawler querying multiple servers]

Design of P2PScope System: Root cause inference
• Track the information flow for suspicious P2P software
• Track how honeynet IPs propagated in P2P systems
• Peer routability checking
• Anti-P2P analysis
• Hypothesis formulation and testing
• About 7000 lines of Python, Perl and Bro in total

Diagnosis & Analysis
• Questions
  – What is the root cause?
  – Which peers spread misconfiguration?
  – How is misconfiguration disseminated?
  – How badly are individual clients affected?
• Results
  – Data plane traffic radiation
  – Detailed results focus on eMule and BitTorrent

Data Plane Traffic Radiation
[Figure: a peer asks “Who has avatar.avi?”; the resource mapping (index server, peer exchange, DHT) returns the address 1.2.3.4]

eMule – Root Cause
[Figure: address labels 4.3.2.1 and 1.2.3.4, illustrating a byte-reversed peer address]
• Byte ordering is the problem!
  – 61% of the peers at the byte-reversed honeynet addresses are indeed running eMule on the reported port number
  – Among backtracked peers in the unroutable IP space, the byte-reversed IP runs eMule for 69.6% of them
• Locate bugs in source code
  – At least aMule 2.1.0 (a popular eMule alternative) has the byte order bug

eMule – Peers & Dissemination
• Which peers spread misconfiguration?
  – 99.24% of misconfigured peers are normal peers
• How is the misconfiguration disseminated?
  – Index server? No
  – Peer exchange? Yes
  – DHT? No
• Percentage of bogus peers in the eMule network?
  – [12.7%, 25.0%] with a total of 37,079 backtracked peers

BitTorrent – Root Cause I
• Anti-P2P companies deliberately inject bogus peers!
  – 20% of the traffic we observed is related to anti-P2P peers
  – They return only bogus peers or anti-P2P peers
  – They use the UTorrent peer exchange protocol to disseminate
  – We found a particular peer farm
    • One /24 network, each IP runs hundreds of peers
    • Runs Azureus 2.5.0.0, and the IPs also run VMware
    • Returns peers even for non-existent file hashes

BitTorrent – Root Cause II
• KTorrent also has a byte-order bug
  – Discovered using information flow tracking on KTorrent, UTorrent and Azureus
  – Identified the actual bug and reported it to the KTorrent developers, who confirmed it
• Misconfiguration propagation
  – [fully] KTorrent: all peers exchanged from others
  – [partial] UTorrent: all peers that respond to TCP handshaking
  – [almost not] Azureus: all peers that respond to BitTorrent handshaking

Conclusions
• The first study to measure and diagnose large-scale address misconfigured P2P traffic
• 39% of Internet background radiation is caused by address misconfiguration
  – Popular in various P2P systems, increasing 100% each year for four years, and scattered across the IPv4 space
• For eMule, we found it is caused by a network byte order problem
• For BitTorrent
  – Anti-P2P companies deliberately inject bogus peers
  – KTorrent has a byte order bug
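
Illustration: how a byte-order bug turns 4.3.2.1 into 1.2.3.4
The eMule and KTorrent root causes above both come down to reading a 4-byte IPv4 address in the wrong byte order. The following is a minimal Python sketch of that effect, not the actual aMule or KTorrent code; the function names `reverse_ipv4` and `buggy_decode` are hypothetical and exist only for this illustration.

```python
import socket
import struct


def reverse_ipv4(addr: str) -> str:
    """Return the dotted-quad address with its four bytes reversed."""
    packed = socket.inet_aton(addr)        # 4 bytes in network (big-endian) order
    return socket.inet_ntoa(packed[::-1])  # same bytes, opposite order


def buggy_decode(wire_bytes: bytes) -> str:
    """Hypothetical buggy reader (assumption, for illustration only).

    It interprets network-order bytes as a little-endian integer,
    then formats that integer as an address.
    """
    (value,) = struct.unpack("<I", wire_bytes)   # bug: should be "!I"
    return socket.inet_ntoa(struct.pack("!I", value))


if __name__ == "__main__":
    real_peer = "4.3.2.1"
    wire = socket.inet_aton(real_peer)
    # The buggy reader records the byte-reversed address instead of the peer's.
    print(buggy_decode(wire))          # 1.2.3.4
    # Reversing an observed target gives the candidate real peer to probe.
    print(reverse_ipv4("1.2.3.4"))     # 4.3.2.1
```

Reversing a target address in this way is also the kind of check behind the "byte-reversed honeynet addresses" figures: if the reversed IP is found running the P2P client on the reported port, a byte-order bug is a plausible explanation.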