Nova Scheduler - OpenStack China Community
Nova Scheduler
Shane Wang(王庆), Intel Open Source Technology Center
WeChat ID: qq559382
Agenda
What is the current situation?
How scheduler works in Juno and Kilo
Resource Tracking
Filters and Weight
Utilization Based Scheduling (UBS)
What is the next plan?
Gantt
Dynamic Resource Scheduling (DRS)
How scheduler works in Juno and Kilo
Components: API, Conductor, Scheduler, Compute
1. User request, with scheduler hints to include the scheduling policy
2. Submit new task
3. Request hosts that match the request_spec and filter_properties
4. Return selected hosts
5. Call the selected compute
6. Reschedule after a resource claim fails or on other failure
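The six steps above can be read as a schedule, call, reschedule loop. The following is a toy sketch of that loop only, not Nova's code: `select_host`, `claim`, `build_instance`, `ClaimFailed` and `MAX_ATTEMPTS` are hypothetical names, and hosts are reduced to a free-RAM number.

```python
# Toy model of the boot flow: schedule, call compute, reschedule on failure.
# All names here are illustrative assumptions, not Nova's real interfaces.

MAX_ATTEMPTS = 3  # hypothetical retry budget


class ClaimFailed(Exception):
    """Raised by a compute host that cannot fit the instance."""


def select_host(hosts, request_spec, filter_properties):
    """Steps 3-4: return the first host not already tried, or None."""
    tried = filter_properties["retry"]["hosts"]
    for name in sorted(hosts):
        if name not in tried:
            return name
    return None


def claim(hosts, name, request_spec):
    """Step 5: the compute host validates and consumes the resource."""
    if hosts[name] < request_spec["memory_mb"]:
        raise ClaimFailed(name)
    hosts[name] -= request_spec["memory_mb"]


def build_instance(hosts, request_spec):
    """Step 2 onward: the conductor drives scheduling and rescheduling."""
    filter_properties = {"retry": {"hosts": []}}
    for _ in range(MAX_ATTEMPTS):
        name = select_host(hosts, request_spec, filter_properties)
        if name is None:
            break
        # Remember the host so a reschedule does not pick it again.
        filter_properties["retry"]["hosts"].append(name)
        try:
            claim(hosts, name, request_spec)
            return name
        except ClaimFailed:
            continue  # step 6: reschedule on another host
    raise RuntimeError("No valid host found")


free_ram = {"node1": 512, "node2": 4096}
placed_on = build_instance(free_ram, {"memory_mb": 2048})
```

In this sketch the first pick (node1) fails its claim, so the request is rescheduled and lands on node2.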
Resource usage Tracking
Scheduler:
1) Fetch newest compute node stats for each call
2) Filter and weigh the hosts
3) Consume the resource for the selected host
Resource Claiming (Compute):
1) Validate the resource usage
2) Update the resource usage
3) Update to DB
Periodic update of the node resources at a 60-second interval (Compute):
1) Get hypervisor resource
2) Consume the resource
3) Update to DB
Flow steps shown in the diagram:
2. Submit new task
3. Request hosts that match the request_spec and filter_properties
4. Return selected hosts
5. Call the selected compute
6. Reschedule after a resource claim fails or on other failure
Components: Conductor, Scheduler, Compute, Hypervisor (x3), DB
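The three resource claiming steps (validate, update usage, update to DB) can be sketched as below. This is a minimal illustration under stated assumptions: `ComputeNodeUsage` is a hypothetical class, and a plain dict stands in for the database record.

```python
# Minimal sketch of claim-then-update on a compute node.
# ComputeNodeUsage is a hypothetical name, not Nova's resource tracker.

class ComputeNodeUsage:
    def __init__(self, memory_mb, vcpus):
        self.total = {"memory_mb": memory_mb, "vcpus": vcpus}
        self.used = {"memory_mb": 0, "vcpus": 0}
        self.db_row = dict(self.used)  # stand-in for the DB record

    def claim(self, request):
        # 1) Validate the resource usage.
        for key, amount in request.items():
            if self.used[key] + amount > self.total[key]:
                return False  # claim fails; the caller may reschedule
        # 2) Update the resource usage.
        for key, amount in request.items():
            self.used[key] += amount
        # 3) Update to DB (here: copy into the stand-in record).
        self.db_row = dict(self.used)
        return True


node = ComputeNodeUsage(memory_mb=4096, vcpus=4)
first = node.claim({"memory_mb": 2048, "vcpus": 2})   # fits
second = node.claim({"memory_mb": 4096, "vcpus": 1})  # would exceed memory
```

A failed claim leaves the usage untouched, which is what lets the request be rescheduled elsewhere.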
Filters and weight hosts
scheduler_available_filters='nova.scheduler.filters.all_filters'
scheduler_default_filters=[……]
scheduler_weight_classes=nova.scheduler.weights.all_weighers
scheduler_host_subset_size=1
Request Spec:
Image
Instance_properties
Instance_type
Filter_properties:
Scheduler-hints
Assist parameter: retry
nova boot --flavor 1 --image …… --hint group='sg1'
--hint <key=value>
Send arbitrary key/value pairs to the scheduler for custom use.
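How the filters, the weighers and scheduler_host_subset_size fit together can be sketched as follows. This is a simplified illustration, not the Nova implementation: `ram_filter`, `ram_weigher` and `schedule` are hypothetical helpers, and hosts are plain dicts.

```python
import random

# Simplified filter -> weigh -> pick pipeline. All names are illustrative.


def ram_filter(host, spec):
    """Pass only hosts with enough free RAM for the request."""
    return host["free_ram_mb"] >= spec["memory_mb"]


def ram_weigher(host):
    """Prefer hosts with more free RAM."""
    return host["free_ram_mb"]


def schedule(hosts, spec, filters, weigher, subset_size=1, rng=random):
    # Keep hosts that pass every filter.
    passing = [h for h in hosts if all(f(h, spec) for f in filters)]
    if not passing:
        return None
    # Rank the survivors, best weight first.
    ranked = sorted(passing, key=weigher, reverse=True)
    # Choose at random among the best subset_size hosts.
    subset = ranked[:max(1, subset_size)]
    return rng.choice(subset)


hosts = [
    {"name": "a", "free_ram_mb": 512},
    {"name": "b", "free_ram_mb": 2048},
    {"name": "c", "free_ram_mb": 8192},
]
chosen = schedule(hosts, {"memory_mb": 1024}, [ram_filter], ram_weigher)
```

With a subset size of 1 the best-weighted host always wins; a larger value spreads concurrent requests across the top few hosts at the cost of a less optimal placement.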
Filters
Resource:
CoreFilter AggregateCoreFilter: cpu_allocation_ratio=16.0
RamFilter AggregateRamFilter: ram_allocation_ratio=1.5
DiskFilter AggregateDiskFilter: disk_allocation_ratio=1.0
IoOpsFilter AggregateIoOpsFilter: max_io_ops_per_host=8. IoOps means resize, building,
image snapshot, migration, rescue, unshelve, backup
PciPassthroughFilter: Generic PCI device or SR-IOV assignment
NUMATopologyFilter: NUMA in Juno; CPU pinning and huge pages in Kilo
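The allocation ratios above are overcommit factors: the limit a filter checks against is the physical capacity multiplied by the ratio. A small sketch of that arithmetic, with hypothetical function names and the ratio values from this slide as defaults:

```python
# Overcommit arithmetic behind the core and RAM checks.
# Function names are illustrative; ratios are the values quoted on the slide.


def core_filter_passes(total_vcpus, used_vcpus, requested_vcpus,
                       cpu_allocation_ratio=16.0):
    """A host with 8 cores and ratio 16.0 can hold up to 128 vCPUs."""
    limit = total_vcpus * cpu_allocation_ratio
    return used_vcpus + requested_vcpus <= limit


def ram_filter_passes(total_mb, used_mb, requested_mb,
                      ram_allocation_ratio=1.5):
    """A host with 1024 MB and ratio 1.5 can hold up to 1536 MB."""
    limit = total_mb * ram_allocation_ratio
    return used_mb + requested_mb <= limit


fits_cpu = core_filter_passes(total_vcpus=8, used_vcpus=126, requested_vcpus=2)
fits_ram = ram_filter_passes(total_mb=1024, used_mb=1000, requested_mb=600)
```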
Filters
Affinity:
DifferentHostFilter, SameHostFilter: scheduler_hints: different_host / same_host = ['instance uuid', …]
ServerGroupAffinityFilter, ServerGroupAntiAffinityFilter:
nova server-group-create
Create a new server group with the specified details.
nova server-group-delete
Delete specific server group(s).
nova server-group-get
Get a specific server group.
nova server-group-list
Print a list of all server groups.
nova boot with scheduler hint group=<uuid>: boot a new instance into the server group
SimpleCIDRAffinityFilter: scheduler_hints: cidr, build_near_host_ip
TypeAffinityFilter, AggregateTypeAffinityFilter: instance_type
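The server group policies reduce to a membership test against the hosts that already run members of the group. A minimal sketch, assuming the set of group hosts is already known; the function names are hypothetical:

```python
# Membership tests behind affinity and anti-affinity policies.
# group_hosts: hosts that already run an instance of the server group.


def affinity_passes(host, group_hosts):
    """Affinity: an empty group accepts any host, otherwise stay together."""
    return not group_hosts or host in group_hosts


def anti_affinity_passes(host, group_hosts):
    """Anti-affinity: reject hosts already used by the group."""
    return host not in group_hosts


group_hosts = {"node1"}
candidates = ["node1", "node2", "node3"]
spread = [h for h in candidates if anti_affinity_passes(h, group_hosts)]
packed = [h for h in candidates if affinity_passes(h, group_hosts)]
```

Because the check runs against the group membership known at scheduling time, two concurrent requests can both pass it for the same host.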
Filters
Topology:
AggregateImagePropertiesIsolation: image properties match aggregate metadata
IsolatedHostsFilter: isolated_hosts, isolated_images, restrict_isolated_hosts_to_isolated_images
AggregateInstanceExtraSpecsFilter: flavor’s extra specs match aggregate metadata
AggregateMultiTenancyIsolation: filter_tenant_id
AvailabilityZoneFilter
Filters
Others:
ComputeCapabilitiesFilter: works with instance type extra_specs prefixed with ‘capabilities:’
ComputeFilter: whether the compute node is up and enabled
ImagePropertiesFilter: architecture, hypervisor type, vm_mode, hypervisor_version_requires
JsonFilter: scheduler_hints: query
NumInstancesFilter, AggregateNumInstancesFilter: max_instances_per_host
RetryFilter
TrustedFilter
Weight
IoOpsWeigher
MetricsWeigher
RAMWeigher
Utilization Based Scheduling
• CPU Utilization data
• Memory Utilization data
• Network Bandwidth data
• etc
Utilization Based Scheduling
Scheduler:
1) Fetch newest compute node stats for each call
2) Filter and weigh the hosts
3) Consume the resource for the selected host
Flow steps shown in the diagram:
2. Submit new task
3. Request hosts that match the request_spec and filter_properties
4. Return selected hosts
5. Call the selected compute
6. Reschedule after a resource claim fails or on other failure
Monitors on Compute (updated at a 60-second interval): CPU Monitor, NetworkBandWidth, MemoryCache Monitor
Other components: Conductor, DB, Notification Bus (AMQP), Hypervisor (x3)
Utilization Based Scheduling
MetricsWeigher:
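weight_multiplier: Multiplier used for weighing metrics.
weight_setting: How the metrics are going to be weighed.
required: If true, a metric missing on a host is an error; use the MetricsFilter to exclude such hosts.
weight_of_unavailable: The weight returned when a metric is unavailable and required is false.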
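The options above combine into a weighted sum of host metrics. The sketch below assumes weight_setting is a comma-separated list of name=ratio pairs; the metric names, the helper names and the fallback value are illustrative assumptions, not a copy of the Nova code.

```python
# Weighted-sum sketch of metric-based weighing.
# Helper names, metric names and the fallback value are assumptions.


def parse_weight_setting(setting):
    """Turn 'name1=1.0, name2=-1.0' into {'name1': 1.0, 'name2': -1.0}."""
    ratios = {}
    for item in setting.split(","):
        name, _, ratio = item.partition("=")
        ratios[name.strip()] = float(ratio)
    return ratios


def metrics_weight(host_metrics, ratios, required=True,
                   weight_of_unavailable=-10000.0):
    total = 0.0
    for name, ratio in ratios.items():
        if name not in host_metrics:
            if required:
                # With required set, a missing metric is an error.
                raise KeyError(name)
            # Otherwise the host gets the fallback weight.
            return weight_of_unavailable
        total += ratio * host_metrics[name]
    return total


ratios = parse_weight_setting("cpu.percent=-1.0, mem.free=0.5")
busy = metrics_weight({"cpu.percent": 80.0, "mem.free": 100.0}, ratios)
idle = metrics_weight({"cpu.percent": 20.0, "mem.free": 100.0}, ratios)
```

A negative ratio penalises a metric, so in this example the idle host ends up with the higher weight.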
How does scheduler strategy affect performance?
Benchmark Accuracy
Smart Scheduling
Efficiency
QoS meet SLA contract
What is monitored now?
Table columns: OpenStack Service, Type, Metrics (e.g.)
Static capabilities
• CPU features
• hypervisor version
Dynamic Resources
• free memory/disk
• vCPU #
• PCI devices
• # of NIC virtual functions
Resources creation/deletion
• VM
• network/subnet/port
• image
• ……
Resources usage data
• CPU usage in VM
• memory usage in VM
• network usage in VM
• storage usage stats
• ……
Not Enough (Nova, Ceilometer)
• CPU usage stats of host
• Network usage stats of host
• Intel Node Manager Power data
• Cache QoS Monitoring (CQM) data
• ……
Nova: not easy to add; how to use?
Ceilometer: no hardware pollsters
What are missing?
Policy management
Break policy into QoS parameter
Mapping QoS parameter to metrics
Actions
Live migration
Resource reallocation
Enforcement
… …
Knowledge model to evaluate complex policy situations (e.g. predict future VM workload)
Dynamic Resource Scheduling
[Architecture diagram. Legend: existing components; to be implemented.]
Labels in the diagram: admins; Policy API; Parser; Analyzer; Evaluator (evaluating); Knowledge model; Historic metrics data; Logging; Alarming (set alarm, alarm trigger); Pluggable Collectors (Ceilometer collector API, Nova collector API, other collectors); Pluggable Executors; Enforcement; Ceilometer; Nova live migration; De-virtualizing; resource reallocation; Benchmarking API; other agents; other actions
Next: Gantt
Scheduler-as-a-Service project
Split from Nova first, then for other projects
Plan to begin the split in L
Gantt in Kilo: Refactor, Refactor, Refactor….
The Scheduler before Juno
[Diagram components: API, Scheduler, Compute]
The scheduler in Kilo
Components: API, Conductor, Scheduler, Compute
Scheduler API:
select_destinations
update_resource_stats
1. User request, with scheduler hints to include the scheduling policy
2. Submit new task
3. Request hosts that match the request_spec and filter_properties
4. Return selected hosts
5. Call the selected compute
6. Reschedule after a resource claim fails or on other failure
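The point of the narrow scheduler API is that callers only need two entry points: one to ask for hosts and one to report resource stats. A toy in-memory stand-in is sketched below; the class name, the argument lists and the `ignore_hosts` key are assumptions for illustration, and Nova's real calls take additional arguments.

```python
# In-memory stand-in for a scheduler exposing only two calls.
# Signatures are simplified assumptions, not Nova's actual ones.


class InMemoryScheduler:
    def __init__(self):
        self.stats = {}  # host name -> latest reported stats

    def update_resource_stats(self, name, stats):
        """Compute nodes push their current resource view."""
        self.stats[name] = dict(stats)

    def select_destinations(self, request_spec, filter_properties):
        """Return fitting host names, most free RAM first."""
        ignore = set(filter_properties.get("ignore_hosts", []))
        need = request_spec["memory_mb"]
        fits = [
            name for name, s in self.stats.items()
            if name not in ignore and s["free_ram_mb"] >= need
        ]
        return sorted(fits, key=lambda n: self.stats[n]["free_ram_mb"],
                      reverse=True)


scheduler = InMemoryScheduler()
scheduler.update_resource_stats("node1", {"free_ram_mb": 1024})
scheduler.update_resource_stats("node2", {"free_ram_mb": 4096})
scheduler.update_resource_stats("node3", {"free_ram_mb": 2048})
destinations = scheduler.select_destinations({"memory_mb": 2048}, {})
```

Since nothing else is shared between caller and scheduler in this sketch, the scheduler could be swapped for a separate service without changing the callers.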
Refactor
https://blueprints.launchpad.net/nova/+spec/make-resource-tracker-use-objects
https://blueprints.launchpad.net/nova/+spec/detach-service-from-computenode
https://blueprints.launchpad.net/nova/+spec/resource-objects
https://blueprints.launchpad.net/nova/+spec/request-spec-object
https://blueprints.launchpad.net/nova/+spec/sched-select-destinations-use-request-spec-object
https://blueprints.launchpad.net/nova/+spec/isolate-scheduler-db
Thanks
Backup
The problem of current Nova scheduler
Server Group
Can’t add or remove an active server to/from a server group
https://review.openstack.org/136487
https://review.openstack.org/139272
An affinity policy means you can’t evacuate
Ignore down hosts when populating the instance: https://review.openstack.org/#/c/135607/
Remove the instance from the server group: https://review.openstack.org/136487, but it won’t land in K, maybe L. It also won’t work for
something like automatic HA
https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/soft-affinity-for-server-group,n,z
Anti-affinity policy race problem, which may trigger extra rescheduling
Race for migration
Support for unshelve, rebuild, live-migration, migration and resize in K, but it does not resolve the anti-affinity policy problem.
Unshelve: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bug/1400015,n,z
Rebuild: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:rebuild_schedule,n,z
Migration/live-migration ongoing…
The problem of current Nova scheduler
Missing resource claiming and retry for migration
Unshelve: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bug/1400015,n,z
Rebuild: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:rebuild_schedule,n,z
Migration/live-migration ongoing…
Scheduling hints can’t persist
You can only specify your scheduling policy at the beginning
The policy may be violated after migration
https://review.openstack.org/88983 is blocked in K, maybe L
Race Problem
the bug link https://bugs.launchpad.net/nova/+bug/1341420
scheduler_host_subset_size=N
Ironic integration
https://bugs.launchpad.net/nova/+bug/1402658
Any more problems for the scheduler?
Only does initial placement!
Each project has its own scheduler
DRS in OpenStack
Gantt
Tetris https://docs.google.com/document/d/1DMsnGxQ3POwZCF3uxaUeEFaKX8LqUqmmgQ_7EVK7Y8/edit
Purview (Tetris) will provide a framework to quickly implement and enforce different kinds of policies. Policies can be of different types. Here
are a few examples of policies in clouds: Availability Policies, Performance Policies, Load Balancing Policies, User Defined Policies.
Congress https://wiki.openstack.org/wiki/Congress
Congress is a policy-based management framework for the cloud. It is designed to work with any cloud software that reasonably fits
within the relational data model. It automatically prevents policy violations when possible and corrects them when not, and it
enables administrators to control the extent to which enforcement is automatic
Tetris is a domain-specific policy system
Congress is a domain-independent policy system
Domain-independent and domain-specific policy systems are highly complementary