
Windows Performance Troubleshooting and Analysis

Daniel Pearson, David Solomon Expert Seminars

Daniel Pearson
• Started working with Windows NT 3.51
• Three years at Digital Equipment Corporation
  • Supporting Intel and Alpha systems running Windows NT
• Seven years at Microsoft
  • Senior Escalation Lead in Windows base team
  • Worked in the Mobile Internet sustained engineering team
• Instructor for David Solomon, co-author of the Windows Internals book series

Agenda
• Components of performance analysis
• Understanding the tools for troubleshooting and analyzing performance issues
• Troubleshooting CPU and memory issues using various Windows tools

* Portions of this session are based on material developed by Mark Russinovich and David Solomon

Components of Performance Analysis
• Event Tracing for Windows
  • Core component of the operating system
• Kernel mode data structures
  • Used to store information about the system and system objects that can be read by various tools
  • e.g. dt nt!_KTHREAD KernelTime
• CPU performance monitoring events
  • Refer to the Intel 64 and IA-32 Architectures Software Developer’s Manual

Event Tracing for Windows
• Built into the system
• High performance, low overhead and scalable
  • 2.5% CPU usage for a sustained rate of 10,000 events/sec on a 2 GHz CPU [1]
• Operations throughout the system that are of interest to performance are fully instrumented
  • e.g. process and thread activity, registry I/O, disk I/O

[1] Milirud, Michael. 2008. Windows Performance Analysis: Using Windows Performance Tools. Presented at Microsoft's WinHEC conference, November 5-7, Los Angeles, CA.
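To put the overhead figure in perspective, the short calculation below converts it into an approximate per-event cycle budget. It is only a back-of-the-envelope sketch: it assumes "2 GHz" means 2,000,000,000 cycles per second on a single processor, and the function name `cycles_per_event` is made up for this illustration.

```python
from fractions import Fraction

def cycles_per_event(cpu_hz, cpu_share, events_per_sec):
    """Average CPU cycles spent per logged event at the given overhead share."""
    return cpu_hz * cpu_share / events_per_sec

# 2.5% of a 2 GHz CPU at a sustained 10,000 events/sec (figures from the slide).
# Fraction keeps the arithmetic exact instead of relying on floating point.
budget = cycles_per_event(2_000_000_000, Fraction(25, 1000), 10_000)
print(budget)  # 5000
```

Under these assumptions, each logged event costs on the order of 5,000 cycles.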

Event Tracing for Windows
• Uses a buffering and logging mechanism implemented in the kernel
• Per-processor buffers that are written to disk by an asynchronous writer thread
• Ability to enable and disable tracing dynamically
• Supports a managed code provider

Sysinternals Utilities

Sysinternals Utilities
• Process Explorer
  • Useful for displaying which files, registry keys and other objects processes have open and which DLLs they have loaded
• Process Monitor
  • Useful for showing real-time file system, registry and process & thread activity
• Available for download from the TechNet site

Resource Monitor

Resource Monitor
• Included with Windows Vista and greatly enhanced in Windows 7 and Windows Server 2008 R2
• Allows the viewing of CPU, memory, disk and network resources as well as handles and modules in real time
• Ability to [...] processes as well as to [...] Windows services
• Useful for identifying the highest resource consumers by individual resource type, e.g. CPU
• Able to list the wait chain tree of a process to determine if a process is waiting on another

Using Resource Monitor

Performance Monitor

Performance Monitor
• Queries performance counters that measure system state or activity
• Current values are read at specific intervals
• Performance counters are included in the operating system and can be included as part of applications
• Able to collect event trace data from trace providers that report actions or events
• Can combine multiple trace providers into a single session
• Configuration information can be collected from registry keys at a specific time or interval

Using Performance Monitor

Windows Performance Analyzer

Windows Performance Analyzer
• Part of the Windows Performance Toolkit
• Support for x86, x64, and IA64 architectures
• Consists of three primary programs
  • xperf.exe
    • Used for controlling tracing and processing trace data
  • xbootmgr.exe
    • Automates on and off state transitions and captures traces during those transitions
  • xperfview.exe
    • A graphical trace visualization tool to represent data in the form of interactive graphs and summary tables

Windows Performance Analyzer
• Primarily uses the Event Tracing for Windows infrastructure built into the system
• Can be enabled or disabled at any time without requiring a system or process restart
• Supports symbol decoding, sample profiling, and recording of call stacks on kernel events
• Designed to be used during automation
• All the functions of the tools are available via the command line tool xperf.exe

Support for Earlier Systems
• The Windows Performance Toolkit will fail to install on Windows XP and on Windows Server 2003 although data collection is supported
  • Copy xperf.exe and perfctrl.dll
• Trace analysis is only supported on Windows Vista and later systems

Capturing a Performance Trace
• Kernel options are divided into two parts
  • Kernel Flags
    • Identified by the use of uppercase characters
    • e.g. PROC_THREAD, LOADER, PROFILE
  • Kernel Groups
    • Identified by the use of title case characters
    • e.g. Base, Diag, Latency, FileIO
• Kernel Groups are made up of a collection of Kernel Flags
  • e.g. SysProf = PROC_THREAD+LOADER+PROFILE
• Flags and groups are separated by the ‘+’ token
  • e.g. xperf.exe -on FileIO+DISK_IO_INIT
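The naming convention and the ‘+’ syntax above can be sketched in a few lines of Python. This is an illustrative helper, not part of the toolkit: the function names are hypothetical, and the group table holds only the SysProf definition quoted on the slide (a real system defines more groups, which are not reproduced here).

```python
# Only SysProf comes from the slide; other groups are deliberately omitted.
KERNEL_GROUPS = {"SysProf": ["PROC_THREAD", "LOADER", "PROFILE"]}

def is_kernel_flag(token):
    """Kernel flags are written in uppercase, e.g. PROC_THREAD."""
    return token.isupper()

def expand_groups(tokens):
    """Replace known groups with their flags, dropping duplicates in order.

    Tokens that are not in KERNEL_GROUPS are passed through unchanged.
    """
    expanded = []
    for token in tokens:
        for flag in KERNEL_GROUPS.get(token, [token]):
            if flag not in expanded:
                expanded.append(flag)
    return expanded

def start_command(tokens):
    """Build the argument list for starting a kernel trace with xperf -on."""
    return ["xperf.exe", "-on", "+".join(tokens)]

print(start_command(["FileIO", "DISK_IO_INIT"]))
print(expand_groups(["SysProf", "LOADER", "FileIO"]))
```

Building the command as a list rather than a single string keeps it ready for `subprocess` without shell quoting concerns.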

Merging of Performance Trace Data
• Traces can be copied to another system for analysis
• The trace file should be “merged” on the collection system before analysis to include additional system information
  • xperf -d trace.etl

[Diagram labels: System and symbol information; Trace; Kernel trace; XPerf; Merged trace]
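A capture-then-merge session can be scripted along the following lines. This is a minimal sketch, assuming xperf.exe is on the PATH and that `-d` both stops the kernel session and writes the merged file; the `capture_trace` function and its `run` parameter are hypothetical names introduced here so the sequence can be dry-run without the toolkit installed.

```python
import subprocess

def capture_trace(kernel_options, output_etl, workload, run=subprocess.check_call):
    """Start kernel tracing, run the workload, then stop and merge the trace."""
    run(["xperf.exe", "-on", "+".join(kernel_options)])
    try:
        workload()
    finally:
        # Always stop and merge, even if the workload raises, so that
        # tracing is not left enabled on the collection system.
        run(["xperf.exe", "-d", output_etl])

# Dry run: record the commands instead of executing them.
issued = []
capture_trace(["FileIO", "DISK_IO_INIT"], "trace.etl", lambda: None, run=issued.append)
for command in issued:
    print(" ".join(command))
```

Dropping the `run=` argument executes the commands for real; the merged trace.etl can then be copied to another system for analysis.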

Using the Windows Performance Toolkit

Understanding CPU Activity
• Windows uses 32 priority levels
• The system implements a preemptive, priority driven scheduler
• Priority adjustments can be applied to threads in the “dynamic” range
• At least one runnable thread with the highest priority will be running

[Diagram: priority scale from 31 down to 0, with levels 31 to 16 labeled Real time and the lower levels labeled Dynamic]
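The priority-driven selection rule can be illustrated with a toy model. This is not how the kernel dispatcher is implemented; it is a sketch in which `priority_range` and `pick_next` are hypothetical names, every level below 16 is treated as dynamic as in the diagram, and ties are broken by arrival order.

```python
def priority_range(priority):
    """Classify a priority level using the split shown in the diagram."""
    if not 0 <= priority <= 31:
        raise ValueError("Windows uses priority levels 0 through 31")
    return "real time" if priority >= 16 else "dynamic"

def pick_next(ready):
    """Return the (name, priority) entry with the highest priority.

    `ready` is a list in arrival order; max() returns the first of any
    equal-priority entries, so ties go to the thread that arrived first.
    """
    return max(ready, key=lambda thread: thread[1])

ready_threads = [("worker", 8), ("audio", 24), ("capture", 24)]
name, priority = pick_next(ready_threads)
print(name, priority, priority_range(priority))
```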

Context Switching
• A switch from one thread to another is known as a context switch
• Switching involves saving the hardware state of a thread and restoring the state of another
• When a thread is scheduled, that thread’s context switch count is also incremented
• The context switch count represents how often a thread begins running, not how long it ran

Time Accounting Quirks
• Looking at total CPU time for each process may not reveal where the system has spent its time
• CPU time accounting is driven by an interrupt timer which is set by the Hardware Abstraction Layer
  • Usually at either 10 or 15 msec intervals
• Thread execution and context switches that happen between clock intervals are not accounted for
  • e.g. a thread runs and enters a wait before the clock fires
  • Thus threads may run but never get charged
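The effect is easy to reproduce in a small simulation. The sketch below is a simplified model, not kernel code: it assumes a 15 msec clock, charges one full interval to whichever thread is running at the instant the clock fires, and uses a made-up schedule in which T1 runs for 2 msec shortly after every tick and then waits.

```python
def tick_accounting(run_intervals, tick_ms, duration_ms):
    """Charge one clock interval to the thread running when each tick fires.

    run_intervals: list of (thread, start_ms, end_ms), end exclusive.
    """
    charged = {}
    for tick in range(tick_ms, duration_ms + 1, tick_ms):
        for name, start, end in run_intervals:
            if start <= tick < end:
                charged[name] = charged.get(name, 0) + tick_ms
    return charged

def actual_time(run_intervals):
    """Sum the time each thread really spent running."""
    totals = {}
    for name, start, end in run_intervals:
        totals[name] = totals.get(name, 0) + (end - start)
    return totals

# T1 runs briefly between ticks; T2 happens to be running at every tick.
schedule = [
    ("T1", 1, 3), ("T2", 3, 16),
    ("T1", 16, 18), ("T2", 18, 31),
    ("T1", 31, 33), ("T2", 33, 46),
    ("T1", 46, 48), ("T2", 48, 61),
]
print(tick_accounting(schedule, tick_ms=15, duration_ms=60))  # {'T2': 60}
print(actual_time(schedule))  # {'T1': 8, 'T2': 52}
```

In this schedule T1 runs for 8 msec yet is never charged, while T2 is charged 60 msec for 52 msec of execution.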

Time Accounting Prior to Windows Vista
• Windows accounted for CPU time based on the interval clock timer
• Thread quantum expiration was not always fair
  • A thread might get almost no turn
• Threads were also charged for interrupts that occurred while they were running

[Timeline diagram: Idle, Idle, T1, T2, T2]

Time Accounting Since Windows Vista
• Windows Vista and later read the Time Stamp Counter during every context switch
• The actual CPU cycles consumed are charged to a thread
• Any interrupt time is not charged to the interrupted thread
• Allows for more accurate quantum accounting
  • A thread gets at least one turn and at most will be given one turn plus an additional tick

[Timeline diagram: Idle, Idle, T1, T1, T2]
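Cycle-based charging can be modeled as follows. This is a conceptual sketch rather than the kernel's implementation: the event names ("switch", "irq_enter", "irq_exit"), the `cycle_accounting` function, and the counter values are all invented for the illustration.

```python
def cycle_accounting(events):
    """Charge elapsed cycles between events to the running thread.

    events: chronological list of (tsc, kind, thread) where kind is
    "switch" (thread starts running), "irq_enter" or "irq_exit".
    Cycles spent inside an interrupt are tallied separately and are
    not charged to the interrupted thread.
    """
    charged, interrupt_cycles = {}, 0
    current, in_interrupt, last_tsc = None, False, None
    for tsc, kind, thread in events:
        if last_tsc is not None:
            delta = tsc - last_tsc
            if in_interrupt:
                interrupt_cycles += delta
            elif current is not None:
                charged[current] = charged.get(current, 0) + delta
        if kind == "switch":
            current = thread
        elif kind == "irq_enter":
            in_interrupt = True
        elif kind == "irq_exit":
            in_interrupt = False
        last_tsc = tsc
    return charged, interrupt_cycles

trace = [
    (0, "switch", "T1"),
    (1000, "irq_enter", None),
    (1300, "irq_exit", None),
    (2000, "switch", "T2"),
    (2500, "switch", "Idle"),
]
print(cycle_accounting(trace))  # ({'T1': 1700, 'T2': 500}, 300)
```

T1 was on the processor for 2,000 counter ticks, but the 300 spent servicing the interrupt are kept out of its total.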

Troubleshooting High CPU Utilization

Understanding Memory Management
• Windows provides two system memory pools
  • Nonpaged Pool and Paged Pool
  • Used for system wide persistent data
• Prior to Windows Vista, pool sizes were a function of memory size and whether or not the system was configured as a server or a workstation
• Windows Vista introduced the concept of a dynamic system address space

Dynamic System Address Space
• In 32-bit Windows Vista and later, virtual memory is assigned as needed
  • Permits larger paged, nonpaged, and session pools
  • Components still cannot exceed 2 GB on 32-bit systems
• On 64-bit systems, address space regions are configured to their current maximum limits for all memory sizes

Troubleshooting Memory Leaks

Additional Information
• Windows Internals, 5th edition
• Windows Performance Analysis Developer Center
• Windows Server Performance Team Blog
• Ask the Performance Team Blog

Additional Information
• David Solomon Expert Seminars offers training on Windows Internals both as public and private workshops and public webinars via the Internet
• Currently scheduled upcoming classes
  • Public workshop in London, April 12th – April 16th
  • Public webinar, April 26th & April 28th
  • Public workshop in New York, May 3rd – May 7th
  • Public workshop in San Francisco, November 8th – November 12th
• Visit the David Solomon Expert Seminars website for further course descriptions and up-to-date information