RECONSTRUCTION OF APPLICATION LAYER MESSAGE SEQUENCES BY NETWORK MONITORING Jaspal Subhlok University of Houston Houston, TX Amitoj Singh Fermi National Accelerator Laboratory Batavia, IL.
Introduction
• Reconstruct application layer message sequences by analyzing transport layer traffic.
[Figure: nodes N1, N2, and N3 exchanging TCP segments; labels: messages sent, messages received]

Purpose (why bother?)
• The application message exchange pattern is a fundamental program property
  – e.g., it determines application performance in different conditions
• Network traffic due to an application can be monitored non-intrusively, but discovering the application message sequence is hard
  – need access to source code or a profiling library
Hence this method to construct application messages from TCP monitoring.

Particular Motivation
[Figure: an application made of the components Data, Model, Sim 1, Sim 2, Pre, Stream, and Vis, to be mapped ("?") onto a network]
The size and pattern of message exchanges is a key component of an application profile used to select good network nodes to execute on.

Key Principle
• An application message is typically fragmented into a consecutive sequence of TCP segments in which every segment except the last is of size MSS (Maximum Segment Size).
[Figure: an application message at the application layer mapped onto TCP segments at the TCP layer; 1 unit = MSS; labels: Application message, TCP segment, Last TCP segment]

Message Reconstruction Procedure
Phases
1. Separate TCP streams.
2. Sanitize a TCP stream.
3. Reconstruct application layer messages.
4. Error minimization by “best-of-three” technique.
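The key principle can be illustrated with a little arithmetic. The sketch below is only an idealized model, not the authors' code: the function name `segment_sizes` is hypothetical, and it assumes the sender always fills each segment up to the MSS before starting the next one.

```python
MSS = 1448  # maximum segment size in bytes, the value used in these slides


def segment_sizes(message_bytes, mss=MSS):
    """Return idealized TCP segment sizes for one application message.

    Assumption: every segment is filled to MSS except possibly the last.
    """
    if message_bytes <= 0:
        raise ValueError("message size must be positive")
    full, rest = divmod(message_bytes, mss)
    sizes = [mss] * full
    if rest:
        sizes.append(rest)  # the short last segment marks the end of the message
    return sizes


sizes = segment_sizes(12574)
print(len(sizes), sizes[-1])
```

Under this model a message whose size is an exact multiple of the MSS ends with a full-size segment, so its boundary is not visible from segment sizes alone.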
Separating TCP streams
• A communication link transports multiple TCP streams
• A TCP stream spans a unique series of sequence numbers
MSS = 1448 bytes
[Figure: segments as captured on the link, by byte range: 1 : 1448, 1449 : 2896, 2897 : 4344, 431376 : 432823, 4345 : 5792, 5793 : 7240, 432824 : 433610, 7241 : 8688, 8689 : 10136. After separation, one stream holds 431376 : 432823 and 432824 : 433610; the other holds 1 : 1448, 1449 : 2896, 2897 : 4344, 4345 : 5792, 5793 : 7240, 7241 : 8688, and 8689 : 10136.]
Separate red and black streams of TCP segments (not foolproof but adequate)

Sanitizing TCP streams
• Insert TCP segments not recorded (assume it is rare)
• Filter out retransmissions
[Figure: a captured stream 1 : 1448, 1449 : 2896, 2897 : 4344, 4345 : 5792, 7241 : 8688, 8689 : 10136, 10137 : 11584, 10137 : 11584, followed by a final short segment. The missing TCP segment 5793 : 7240 is inserted and the duplicate TCP segment 10137 : 11584 is removed.]

Reconstruct application messages
• A TCP segment of size smaller than MSS (= 1448) indicates the end of an application message.
[Figure: TCP segments of 1448 bytes (eight in a row), 997 bytes, 1448 bytes, and 800 bytes, grouped into application messages from start of message to end of message.
 Message 1: 1 : 1448 + 1449 : 2896 + 2897 : 4344 + 4345 : 5792 + 5793 : 7240 + 7241 : 8688 + 8689 : 10136 + 10137 : 11584 + 11585 : 12574 = 12,574 bytes
 Message 2: 12575 : 14022 + 14023 : 15022 = 2,248 bytes]

Best-of-three
• Reconstruction heuristic is not perfect
1. A TCP segment smaller than MSS may be sent before the entire application message is finished.
2. Two short application messages may be packed into the same TCP segment.
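The sanitizing and reconstruction phases above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `sanitize` and `reconstruct` are hypothetical, segments are assumed to be inclusive (first byte, last byte) pairs from a single, already separated stream, a gap is assumed to consist of MSS-sized segments, and the byte ranges in the example are illustrative.

```python
MSS = 1448  # maximum segment size in bytes, as in the slides


def sanitize(segments, mss=MSS):
    """Drop retransmissions and insert segments missing from the capture."""
    clean = []
    expected = None  # next byte we expect to see in the stream
    for first, last in sorted(set(segments)):
        if expected is None:
            expected = first
        if last < expected:
            continue  # retransmission of data already accounted for
        while expected < first:  # fill a gap (assumed to be MSS-sized pieces)
            gap_last = min(expected + mss - 1, first - 1)
            clean.append((expected, gap_last))
            expected = gap_last + 1
        clean.append((max(first, expected), last))  # trim any partial overlap
        expected = last + 1
    return clean


def reconstruct(segments, mss=MSS):
    """Group segments into messages; a segment shorter than MSS ends a message."""
    messages, current = [], 0
    for first, last in segments:
        size = last - first + 1
        current += size
        if size < mss:
            messages.append(current)
            current = 0
    if current:
        messages.append(current)  # trailing bytes with no short segment seen yet
    return messages


captured = [
    (1, 1448), (1449, 2896), (2897, 4344), (4345, 5792),
    # 5793 : 7240 was not recorded
    (7241, 8688), (8689, 10136), (10137, 11584),
    (10137, 11584),  # retransmission
    (11585, 12574), (12575, 14022), (14023, 14822),
]
print(reconstruct(sanitize(captured)))
```

Both failure modes listed above are visible in this sketch: an early short segment would close a message prematurely, and two short messages sharing one segment would be counted as one.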
[Figure legend: Application message, TCP segment]

Best-of-three
Basic idea: the reconstruction heuristic is unlikely to fail in exactly the same way in multiple identical runs.
Solution: make 3 runs and select the majority view at every stage.
[Figure: message sequences reconstructed in Run 1, Run 2, and Run 3, shown beside the correct message sequence A, B, C, D; one run merges A and B into A+B, and another merges C and D into C+D]

Experimental setup
• NAS parallel benchmark suite programs run on a cluster of 4 workstations
• tcpdump utility used to capture TCP segments
• The reconstructed application layer message sequence is compared with the true sequence obtained with profiling

Message Reconstruction Results
[Figure: stacked bar chart, 0% to 100%, showing the share of EXACT MATCH, APPROX MATCH, and NO MATCH messages for each NAS benchmark: BT, CG, IS, LU, MG, SP]
• APPROX MATCH: includes reconstructed messages off by up to 100 bytes and/or combined with one other application message.
• Perfect for large messages (IS), approximate for small messages (LU)

Conclusions
• Majority of messages reconstructed accurately, almost all detected approximately
• Accuracy is low for a large number of small messages
• Procedure is based entirely on network measurements, hence can be applied to any code
• Accuracy is sufficient for resource selection in Network/Grid environments.

Dominant communication pattern of the NAS benchmarks

Experimental Setup
• 100 Mbps Ethernet switch
• 500 MHz dual processor Pentium Linux workstations
• tcpdump: capturing outgoing TCP packets
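The slides do not spell out how the majority view is selected in the best-of-three step, so the following is only one possible interpretation, not the authors' method. It assumes the three runs send identical byte streams, represents each run as a list of reconstructed message sizes, and keeps a message boundary only if at least two runs agree on its byte offset. The function name `best_of_three` and the sample sizes are hypothetical.

```python
from collections import Counter
from itertools import accumulate


def best_of_three(runs):
    """Majority vote over message boundaries (cumulative byte offsets)."""
    votes = Counter()
    for sizes in runs:
        # Each run votes once for every boundary offset it reports.
        votes.update(set(accumulate(sizes)))
    boundaries = sorted(b for b, n in votes.items() if n >= 2)
    # Turn the agreed boundaries back into message sizes.
    return [b - a for a, b in zip([0] + boundaries, boundaries)]


# Illustrative sizes for messages A, B, C, D.
run1 = [3500, 12574, 2248]       # A and B merged into A+B
run2 = [3000, 500, 12574, 2248]  # reconstructed correctly
run3 = [3000, 500, 14822]        # C and D merged into C+D
print(best_of_three([run1, run2, run3]))
```

Voting on offsets rather than on list positions keeps the runs aligned even after one of them has merged or split a message.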