PCI Express In Depth For Windows Vista And Beyond Allen Marshall Lead Program Manager Vinod Mamtani Software Development Engineer Core Platform Architecture Microsoft Corporation.

Agenda
  PCI Express support in Windows Vista
  PCI Express firmware support
  Enabling native PCI Express support
  Enabling flexible resource assignment
  Optimal PCI device resources
  Active state power management
  MSI support

PCI Express Features
Supported in Windows Vista
  Memory mapped CFG space access, extended CFG space access, and segment support
  Active State Power Management (ASPM)
  Native Power Management Events (PME)
  Message-Signaled Interrupts (MSI/MSI-X)
  Native Hot Plug
  Advanced Error Reporting (AER)
  Multilevel resource rebalance

PCI Express Features
Supported in Windows Vista
  Miscellaneous base features, including
    Capability version field checking
    PCI Express hardware ID and new compatible device ID identification and matching
    Updated device class code parsing
    Phantom functions
    Device serial numbers
    PCI Express tree hierarchy checking
    Setting of Max Payload Size and Max Request Size fields in the Device Control register to match root port settings
    Transactions pending support
  Clock power management (CLK_REQ)

PCI Express Features
Not supported
  Virtual channels
    Other than channel 0
  Isochronous transfers
  Slot power budgeting
    Vista will not change the BIOS configuration for a device
    Vista will save and restore the configuration across sleep transitions

PCI Express Features
Non-snoop I/O
  Windows Vista will disable non-snoop I/O by default
    Windows Vista DMA model assumes cache coherent DMA
    Except for devices on VGA path
  Windows Vista clears the Enable No Snoop bit in the Device Control register
  Device drivers may enable this bit
    Update this register during Start handling
    Windows Vista will thereafter preserve this value across power state transitions

PCI Driver Updates
Architectural changes in PCI
  Multilevel resource rebalance
  Reduced I/O space support
  Optimal default PCI bridge resource
  window sizes
  Subtractive decode PCI bridges
  Support for 64-bit resources
  Greater integration of PCMCIA support in PCI
  Re-structure of legacy R2 card interrupt detection

PCI Express Firmware
Enabling native PCI Express support
  By default, Windows Vista starts in PCI compatibility mode
    No PCI Express features assumed or enabled
  _OSC is used by firmware and the operating system to
    Report OS capabilities to the platform
    Report platform capabilities to the OS
    Transfer control of PCI Express features from firmware to the operating system

PCI Express Firmware
Reporting Windows Vista capabilities
  Windows Vista reports support for the following capabilities to the firmware via _OSC method
    Extended PCI config space
    Message-signaled interrupts
    Clock power management
    PCI segment groups

PCI Express Firmware
Negotiating control of native features
  Dependencies exist between PCI Express features
  Windows Vista requires the platform to grant control to the OS over all of
    AER
    Native PME
    Hot plug
    Express capability
  Otherwise, Windows Vista will not assume control of any of these features

PCI Express Firmware
Negotiating control of native features
  Control negotiated via _OSC method
  When granting control of native features, firmware should grant control of unimplemented features
    This signals Windows Vista that it is safe to assume control of the implemented features

PCI Device Resources
Address space constraints
  Physical address space below 4 GB continues to face increased demand
    MCFG introduces large memory hole
    Larger amounts of system RAM
    Greater numbers of PCI or PCI Express devices
    Increasing device resource requirements
    Devices limited to 32-bit DMA support

PCI Device Resources
Address space constraints
  This problem is mitigated by
    PCI Express 64-bit prefetchable BARs
      Required by WHQL logo program
      Allows Windows to assign resources above 4 GB
    Emergence of mainstream 64-bit platforms

PCI Device Resources
PCI Resource Arbitration
  Windows configures and starts a PCI bridge before
  scanning the secondary side of the bridge for PCI devices
  All devices on the bridge are arbitrated with resources that fall inside the bridge’s resource window
  Subtractive bridges that also do positive decoding have resources arbitrated from the bridge window
  Legacy and non-PCI devices will be arbitrated with resources outside the bridge window

PCI Device Resources
Bridge window configuration
  Windows XP and Windows Server 2003 do not reconfigure the bridge windows based on the requirements of a device behind the bridge
    May lead to a PCI device not starting due to lack of resources
    Even though enough device resources are available to the system
  In some cases, boot configuration of PCI devices by firmware works best for Windows versions prior to Windows Vista
    Mobile PCs that don’t expose PCI expansion slots
    Server PCs that support device hot plug
    Large server configurations with extensive I/O
  Platform has better visibility into specific resource requirements than the OS can ascertain during boot

PCI Device Resources
Bridge window configuration
  Windows Vista supports multi-level resource rebalance
    Allows Vista to dynamically reconfigure resource assignments across multiple hierarchical levels in a device tree
  Windows Vista default bridge resource window sizes are optimized for deep PCI Express hierarchies
    I/O windows default to 4 KB
    Memory windows default to 1 MB
  If a PCI device’s resource requirement cannot be arbitrated inside the current bridge resource window, Vista reconfigures the PCI bridge with a new set of resources to accommodate the PCI device requirements
  Avoiding boot configuration of all PCI devices works best on Vista
    Platform must boot configure required boot devices
    If devices requiring boot configuration are behind a bridge, you must boot configure all devices behind the bridge

PCI Device Resources
Properly define MMCFG space
  Windows Vista parses MCFG table for memory mapped config access
    This memory is marked off-limits for device resource assignment
    Bus
    number range must match bus range for PCI root bus
  Earlier versions of Windows should have an ACPI motherboard resource which claims the exact same MM config region
    Place motherboard resource at correct location in namespace

PCI Device Resources
Properly define MMCFG space
  _SEG method must be defined for PCI Root Bus that matches segment in MCFG
    Segment number encoded in bus number range for module devices
  Avoid common pitfalls
    Define memory region accurately, don’t overlap other devices (like local APIC)!
    Ensure regions are the same for Windows Vista and earlier versions of Windows
    Implement SAL revision >= 3.2 for Itanium platforms for extended config access

PCI Device Resources
Device resources above 4 GB
  Devices with boot configurations above 4 GB are handled differently across Windows versions
  Windows Vista always respects boot configuration of devices above 4 GB
    If the processor and operating system version support accessing addresses > 4 GB
  Windows XP and Windows Server 2003 ignore boot configurations above 4 GB
    If resources cannot be allocated below 4 GB, a range above 4 GB will be assigned
    Regardless of the processor or Windows addressing capability, which may leave the device inoperable

PCI Device Resources
Firmware resource allocation
  The different behaviors of Windows versions present conflicting requirements to platform firmware
  This necessitates a flexible approach to firmware development
  See whitepaper for details on how to enable optimal resource assignment for Windows Vista and earlier versions of Windows

PCI Device Resources
Need ability to ignore boot config
  A mechanism is needed for the platform to indicate to the OS that boot configurations can be ignored for a device hierarchy
    Enables Windows Vista to ignore boot configured device resources
    Provides for greater resource allocation flexibility
    Allows firmware to boot configure devices for best compatibility with Windows XP and Windows Server 2003
  This allows backward compatibility and smooth
  transition to future operating systems

PCI Device Resources
_DSM for Ignoring Boot Config
  _DSM is an optional ACPI control method that enables devices to provide device-specific control functions
  _DSM usage for PCI is defined in the PCI Firmware Specification, Rev. 3.0
  The _DSM method for PCI devices is optional on Windows Vista and is not evaluated on Windows XP and Windows Server 2003
  Microsoft has proposed an ECR to the PCI Firmware Specification to add an additional function definition to ignore boot configuration of PCI devices

PCI Device Resources
Platform usage of _DSM
  Assign root bridge resources spanning 4 GB boundary
  All devices in the path must support 64-bit prefetchable BARs
  Apply boot configurations to all devices for compatibility with earlier versions of Windows
  Implement _DSM allowing boot configuration to be ignored
  Windows Vista will place all resources above 4 GB

Active State Power Management
Overview
  ASPM is required for PCI Express devices
  Serial links remain active to maintain synchronization
  ASPM offers significant power savings
    ≈ 1 W to 3 W, depending on device or lane width
    This is especially important on mobile PCs
  Roadmap includes enabling ASPM on as many devices as possible

Active State Power Management
ASPM in Windows Vista
  Enabling ASPM in Windows Vista is based on
    Hardware capabilities
      L0s is required as per the PCI Express Base specification
      L1 is required for ExpressCard
    Exit time latencies
    System power policy
    System-level controls
    Device-level ASPM controls

Active State Power Management
ASPM hardware capabilities
  Hardware must be capable of ASPM as reported in its Link Capabilities register
  Windows Vista checks this register for all PCI Express devices in the hierarchy, including
    Root Ports
    Switch Ports
    PCI Express-to-PCI or -PCI-X Bridges
    Device Endpoints

Active State Power Management
ASPM exit time latencies
  L0s or L1 is always enabled for a Switch or Root Port when a device is present on the link
  To enable ASPM for endpoints
  Windows Vista must first calculate exit latencies
    To ensure that the overall hierarchy latency is within the link’s requirements for an endpoint
  Windows Vista calculates exit time latency in accordance with the PCI Express Base Specification

Active State Power Management
ASPM exit time latencies
  Windows Vista first calculates latencies separately for both L0s and L1
  L0s is managed independently for both Root-facing and Endpoint-facing Links
  Windows Vista computes overall latency starting at the Root Port, progressing to the Endpoint, and then returning from the Endpoint back up to the Root Port
  IHVs must reflect the L1 exit latency timing accurately in the Link Capabilities register
    So that Windows Vista can calculate exit latencies as precisely as possible

Active State Power Management
System power policy
  ASPM settings are linked to overall system power policy settings in the operating system
  Windows Vista power policy settings allow for negotiation of these ASPM states
    Off (L0)
    Moderate Power Savings (L0s)
    Maximum Power Savings (L0s or L1)

Active State Power Management
System power policy
  Windows Vista default ASPM power policy settings

                  System power policy
  Power source    High performance    Balanced                        Power saver
  AC              Off                 Moderate power savings (L0s)    Maximum power savings (L0s/L1)
  DC              Off                 Maximum power savings (L0s/L1)  Maximum power savings (L0s/L1)

Active State Power Management
System-level controls
  Windows Vista enables ASPM based on two mechanisms
    The version of PCI Express Base Specification with which PCI Express devices in the system comply
    Platform firmware override mechanisms

Active State Power Management
System-level controls
  Microsoft encountered some devices that comply with PCI Express Base Specification 1.0 but did not implement ASPM correctly
    Too many broken devices to enable ASPM by default on all PCI Express devices
  Device PCI Express Base Specification Revision Compliance determines whether ASPM is enabled by default
    Devices which support revision 1.1
    have ASPM enabled by default
    Role-based Error Reporting capability bit in the Device Capabilities register is used to determine 1.1 revision compliance

Active State Power Management
Firmware override mechanisms
  System BIOS may also control ASPM operation on Windows Vista
  The BIOS may enable ASPM on pre-1.1 devices via boot configuration
    Windows Vista may override this setting
    Based on the result of the device version and device .inf file opt-in/opt-out directives
  The BIOS may disable ASPM system-wide
    Microsoft has proposed a new ACPI flag to allow firmware to convey to OSPM that ASPM should not be enabled on a non-compliant platform

Active State Power Management
Device level ASPM controls
  For systems with pre-1.1 hardware, an “opt-in” flag has been defined to allow bypassing the pre-1.1 check
    Skipping this check allows specific devices known to work to use ASPM
  A device “opt-out” mechanism allows a driver to specify that a device it controls does not properly support ASPM
    This targets post-1.1 devices

Active State Power Management
Device level ASPM controls
  This value can be populated by the device driver INF file at install time
  The machine.inf file contains a section to set the value
  Device INFs need only include this section by using Include and Needs directives to get the desired behavior

    [PciASPMOptIn]
    AddReg=PciASPMOptIn.RegHW

    [PciASPMOptIn.RegHW]
    HKR,e5b3b5ac-9725-4f78-963f-03dfb1d828c7,ASPMOptIn,0x10001,1

    [PciASPMOptOut]
    AddReg=PciASPMOptOut.RegHW

    [PciASPMOptOut.RegHW]
    HKR,e5b3b5ac-9725-4f78-963f-03dfb1d828c7,ASPMOptOut,0x10001,1

Active State Power Management
Device level ASPM controls
  Enabling ASPM on Pre-1.1 Devices
    The driver developer must place the following entry in the device driver’s INF file

    [DDInstall.HW]
    Include=machine.inf
    Needs=PciASPMOptIn

Active State Power Management
Device level ASPM controls
  Disabling ASPM
    The driver developer must place the following entry in the device driver’s INF file

    [DDInstall.HW]
    Include=machine.inf
    Needs=PciASPMOptOut
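
The ASPM enablement rules described in the slides above (firmware override, hardware capability, 1.1 compliance, and INF opt-in/opt-out) can be summarized as a small decision function. The following is only an illustrative sketch in Python, not Windows Vista's actual implementation: the `Device` class, its field names, and the function name are hypothetical, and the order in which the checks are applied is an assumption, since the slides do not state a precedence. The bit positions for Role-Based Error Reporting (Device Capabilities, bit 15) and ASPM Support (Link Capabilities, bits 11:10) follow the PCI Express Base Specification; the firmware flag uses bit offset 4 of the proposed IAPC_BOOT_ARCH field listed in the backup slides.

```python
from dataclasses import dataclass

# Bit positions from the PCI Express Base Specification.
DEVCAP_ROLE_BASED_ERROR_REPORTING = 1 << 15  # Device Capabilities, bit 15
LINKCAP_ASPM_SHIFT = 10                      # Link Capabilities, bits 11:10
LINKCAP_ASPM_MASK = 0x3

# Proposed IAPC_BOOT_ARCH "PCIe ASPM Controls" flag (bit offset 4).
BOOT_ARCH_NO_ASPM = 1 << 4


@dataclass
class Device:
    """Hypothetical snapshot of the inputs the slides mention."""
    device_capabilities: int    # raw Device Capabilities register value
    link_capabilities: int      # raw Link Capabilities register value
    aspm_opt_in: bool = False   # ASPMOptIn value written by the driver INF
    aspm_opt_out: bool = False  # ASPMOptOut value written by the driver INF


def aspm_enabled_by_default(dev: Device, boot_arch_flags: int) -> bool:
    """Return True if this sketch would allow ASPM on the device."""
    # Firmware may state that ASPM must not be enabled on this platform.
    if boot_arch_flags & BOOT_ARCH_NO_ASPM:
        return False

    # The hardware must report ASPM support in Link Capabilities.
    aspm_support = (dev.link_capabilities >> LINKCAP_ASPM_SHIFT) & LINKCAP_ASPM_MASK
    if aspm_support == 0:
        return False

    # Assumption: a driver opt-out is honored before any other device check.
    if dev.aspm_opt_out:
        return False

    # Role-Based Error Reporting identifies a 1.1-compliant device,
    # which has ASPM enabled by default.
    if dev.device_capabilities & DEVCAP_ROLE_BASED_ERROR_REPORTING:
        return True

    # Pre-1.1 devices get ASPM only when the driver INF opted in.
    return dev.aspm_opt_in
```

For example, a device whose Device Capabilities register has bit 15 clear is treated as pre-1.1 here and is refused ASPM unless `aspm_opt_in` is set, which mirrors the `Needs=PciASPMOptIn` directive. The system power policy and exit latency checks that the slides also describe are deliberately left out of this sketch.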
Message Signaled Interrupts
MSI/MSI-X overview
  Device drivers should opt in to get MSI and MSI-X messages assigned to the device
  Device drivers must use the IoConnectInterruptEx call to attach a message service routine (MSR) for these messages
  Windows Vista allows up to 16 MSI messages and up to 2048 MSI-X messages per device
  It is possible to attach a separate MSR for each message
  See whitepaper for detailed samples

Message Signaled Interrupts
When fewer messages are available
  PCI driver programs all assigned MSI-X messages into the device’s MSI-X table
  When a device is assigned fewer messages than requested, the PCI driver fills the table with duplicate messages
    Single MSI-X message programmed for each entry
    All assigned messages are programmed in sequence, then the remaining entries are filled with the first message
  Driver can change the MSI-X table entries and disable/enable them as it deems appropriate

Call To Action
  Consider PCI resource allocation when designing firmware for platforms that are running Windows operating systems
  Implement and validate ASPM in your PCI Express devices
  Test and validate native PCI Express features and firmware with Windows Vista
  Implement MSI-aware device drivers

Additional Resources
  Web Resources
    Whitepapers: http://www.microsoft.com/whdc/system/bus/pci/default.mspx
  Related sessions
    CPA002 ACPI in Windows Vista
    CPA060 Kernel Plug and Play in Windows Vista
  PCI support
    Send e-mail to Microsoft PCI Express Support at pciesup@microsoft.com

Backup

PCI Express Firmware
ACPI _OSC method
  Required for host bridges that originate a PCI Express hierarchy
  _OSC must be present in order for Windows Vista to enable PCI Express features
  If _OSC is not present, Windows Vista will not assume control of any PCI Express features
    Essentially runs in Windows XP mode
    In Device Manager, PCI Express root ports will show up as PCI to PCI bridges

PCI Device Resources
Update PCI _DSM to ignore boot config
  UUID E5C937D0-3553-4d7a-9117-EA4D19C3434D

  Revision  Function  Description
  1         1         PCI Express Slot Information
  1         2         PCI Express Slot Number
  1         3         Vendor-specific Token ID
  1         4         PCI Bus Capabilities
  1         5         Ignore PCI Boot Configuration

PCI Device Resources
Firmware recommendations
  Reserve non-conflicting resources above 4 GB in the _CRS method of a PCI root bus
    Only one PCI root bus may have resources that span the 4 GB boundary
  Use a QWORD memory descriptor in the _CRS method of a PCI root bus to define a memory range
    This range is then available as a PCI device memory resource to the entire hierarchy that emanates from the root bus
  Windows Vista uses this range
    If the processor and operating system version allow it

PCI Device Resources
Firmware recommendations
  Assign boot configurations for PCI devices below 4 GB to provide compatibility with Windows XP and Windows Server 2003
  Implement the new _DSM method to allow Windows Vista to ignore PCI device boot configurations
    This ensures the most flexible resource allocation on Windows Vista

Active State Power Management
Firmware override mechanisms
  Proposed IAPC_BOOT_ARCH flag

  BOOT_ARCH field: LEGACY_DEVICES (bit length 1, bit offset 0)
    If set, indicates that the motherboard supports user-visible devices on the LPC or ISA bus. User-visible devices are devices that have end-user accessible connectors (for example, LPT port), or devices for which the OS must load a device driver so that an end-user application can use a device. If clear, the OS may assume there are no such devices and that all devices in the system can be detected exclusively via industry standard device enumeration mechanisms (including the ACPI namespace).
  BOOT_ARCH field: 8042 (bit length 1, bit offset 1)
    If set, indicates that the motherboard contains support for a port 60 and 64 based keyboard controller, usually implemented as an 8042 or equivalent micro-controller.
  BOOT_ARCH field: VGA Not Present (bit length 1, bit offset 2)
    If set, indicates to OSPM that it must not blindly probe the VGA hardware (that responds to MMIO addresses A0000h-BFFFFh and IO ports 3B0h-3BBh and 3C0h-3DFh) that may cause machine check on this system. If clear, indicates to OSPM that it is safe to probe the VGA hardware.
  BOOT_ARCH field: MSI Not Supported (bit length 1, bit offset 3)
    If set, indicates to OSPM that it must not enable Message Signaled Interrupts (MSI) on this platform.
  BOOT_ARCH field: PCIe ASPM Controls (bit length 1, bit offset 4)
    If set, indicates to OSPM that it must not enable ASPM on this platform.
  BOOT_ARCH field: Reserved (bit length 11, bit offset 5)
    Must be 0.

© 2006 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.