boinc.berkeley.edu

BOINC
The Year in Review
David P. Anderson
Space Sciences Laboratory
U.C. Berkeley
22 Oct 2009
Volunteer computing
•
Throughput is now 10 PetaFLOPS
–
mostly Folding@home
•
Volunteer population is constant
–
330K BOINC, 200K F@h
•
Volunteer computing still unknown in
–
HPC world
–
scientific computing world
–
general public
ExaFLOPS
•
Current PetaFLOPS breakdown:
•
Potential: ExaFLOPS by 2010
–
4M GPUs * 1 TFLOPS * 0.25 availability
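The ExaFLOPS figure is a straight product of the three numbers on the slide. A quick check in Python, using only those numbers (the variable names are illustrative):

```python
# Back-of-the-envelope check of the slide's estimate.
gpus = 4_000_000        # 4M GPUs
flops_per_gpu = 1e12    # 1 TFLOPS per GPU
availability = 0.25     # fraction of time a GPU is available

potential_flops = gpus * flops_per_gpu * availability
potential_exaflops = potential_flops / 1e18  # 1 ExaFLOPS = 1e18 FLOPS

print(f"{potential_exaflops:.1f} ExaFLOPS")
```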
Projects
•
No significant new academic projects
–
but signs of life in Asia
•
No new umbrella projects
•
AQUA@home: D-Wave Systems
•
Several hobbyist projects
BOINC funding
•
Funded into 2011
•
New NSF proposal
Facebook apps
•
Progress thru Processors (Intel/GridRepublic)
–
Web-only registration process
–
lots of fans, not so many participants
•
BOINC Milestones
•
IBM WCG
Research
•
Host characterization
•
Scheduling policy analysis
–
EmBOINC: project emulator
•
Distributed applications
–
Volpex
•
Apps in VMs
•
Volunteer motivation study
Fundamental changes
•
App versions now have dynamically-determined
processor usage attributes (#CPUs, #GPUs)
•
Server can have multiple app versions per (app,
platform) pair
•
Client can have multiple versions per app
•
An issued job is linked to an app version
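The relationships above can be sketched as a small data model. This is a minimal illustration, not BOINC's actual schema: the class names, the `variant` label, and the example app and version numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AppVersion:
    app: str
    platform: str
    version: int
    variant: str   # hypothetical label distinguishing versions of one app
    ncpus: float   # processor usage, determined at run time rather than fixed
    ngpus: float


@dataclass
class Job:
    name: str
    app_version: AppVersion  # an issued job is linked to one app version


# Server side: several app versions for the same (app, platform) pair.
versions: List[AppVersion] = [
    AppVersion("example_app", "windows_x86", 601, "cpu", ncpus=1.0, ngpus=0.0),
    AppVersion("example_app", "windows_x86", 602, "gpu", ncpus=0.1, ngpus=1.0),
]


def versions_for(app: str, platform: str) -> List[AppVersion]:
    """Return every app version available for one (app, platform) pair."""
    return [v for v in versions if v.app == app and v.platform == platform]


candidates = versions_for("example_app", "windows_x86")
job = Job("job_1", app_version=candidates[1])
print(len(candidates), job.app_version.variant)
```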
Scheduler request
•
Old (CPU only)
–
requested # seconds
–
current queue length
•
New: for each resource type (CPU, NVIDIA, ...)
–
requested # seconds
–
current high-priority queue length
–
# of idle instances
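The difference between the old and new request can be sketched as two record types. The field names below are assumptions made for illustration (the actual request is not shown on the slide), and queue lengths are assumed to be in seconds.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OldRequest:
    """Old request: CPU only."""
    req_seconds: float    # requested # seconds
    queue_seconds: float  # current queue length (unit assumed)


@dataclass
class ResourceRequest:
    """New request: one of these per resource type."""
    req_seconds: float
    high_priority_queue_seconds: float
    idle_instances: int


@dataclass
class NewRequest:
    resources: Dict[str, ResourceRequest] = field(default_factory=dict)


def resources_wanting_work(req: NewRequest) -> List[str]:
    """List resource types that ask for time or have idle instances."""
    return [
        name
        for name, r in req.resources.items()
        if r.req_seconds > 0 or r.idle_instances > 0
    ]


request = NewRequest({
    "CPU": ResourceRequest(0.0, high_priority_queue_seconds=3600.0, idle_instances=0),
    "NVIDIA": ResourceRequest(7200.0, high_priority_queue_seconds=0.0, idle_instances=1),
})
print(resources_wanting_work(request))
```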
Scheduler reply
•
Application versions include
–
resource usage (# CPUs, # GPUs)
–
FLOPS estimate
•
Jobs specify an app version
•
A given reply can include both CPU and GPU
jobs for a given application
Client: work fetch policy
•
When? From which project? How much?
•
Goals
–
maintain enough work
–
minimize scheduler requests
–
honor resource shares
[Figure: CPU 0, CPU 1, CPU 2, CPU 3; labels "min" and "max"]
•
per-project “debt”
Work fetch for GPUs: goals
•
Queue work separately for different resource
types
•
Resource shares apply to aggregate
Example: projects A, B have same resource share
A has CPU and GPU jobs, B has only GPU jobs
[Figure: GPU and CPU usage by projects A and B]
Work fetch for GPUs
•
For each resource type
–
per-project backoff
–
per-project debt
•
accumulate only while not backed off
•
A project’s overall debt is weighted average of
resource debts
•
Get work from project with highest overall debt
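The policy above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the client's actual code: debt here only grows in proportion to resource share (charging projects for work already done is omitted), and the per-resource weights are hypothetical, since the slide does not say how the average is weighted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Project:
    name: str
    resource_share: float
    debt: Dict[str, float] = field(default_factory=dict)        # per resource type
    backed_off: Dict[str, bool] = field(default_factory=dict)   # per resource type


def accumulate_debt(projects: List[Project], resource: str, dt: float) -> None:
    """Split dt seconds of one resource among projects that are not backed off."""
    eligible = [p for p in projects if not p.backed_off.get(resource, False)]
    total_share = sum(p.resource_share for p in eligible)
    if total_share <= 0:
        return
    for p in eligible:
        p.debt[resource] = p.debt.get(resource, 0.0) + dt * p.resource_share / total_share


def overall_debt(project: Project, weights: Dict[str, float]) -> float:
    """Weighted average of a project's per-resource debts (weights are assumed)."""
    total_weight = sum(weights.values())
    return sum(w * project.debt.get(r, 0.0) for r, w in weights.items()) / total_weight


def choose_project(projects: List[Project], weights: Dict[str, float]) -> Optional[Project]:
    """Fetch work from the project with the highest overall debt."""
    return max(projects, key=lambda p: overall_debt(p, weights), default=None)


a = Project("A", resource_share=100.0)
b = Project("B", resource_share=100.0, backed_off={"NVIDIA": True})
projects = [a, b]
weights = {"CPU": 1.0, "NVIDIA": 1.0}

accumulate_debt(projects, "CPU", dt=100.0)
accumulate_debt(projects, "NVIDIA", dt=100.0)
best = choose_project(projects, weights)
print(best.name, overall_debt(a, weights), overall_debt(b, weights))
```

In this run B is backed off for NVIDIA, so only A accumulates NVIDIA debt and A is chosen for the next fetch.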
Client: job scheduling
•
GPU job scheduling
–
client allocates GPUs
–
GPU prefs
•
Multi-thread job scheduling
–
handle a mix of single-, multi-thread jobs
–
don’t overcommit CPUs
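One way to picture the multi-thread constraint is a greedy pass over jobs in priority order that never lets the total thread count exceed the number of CPUs. The greedy rule and the job names are assumptions for illustration; the slide does not describe the client's actual algorithm.

```python
from typing import List, Tuple


def pick_jobs(jobs: List[Tuple[str, int]], ncpus: int) -> Tuple[List[str], int]:
    """Choose jobs in the given priority order without overcommitting CPUs.

    Each job is (name, number of threads). A job is skipped if running it
    would push the total thread count above ncpus.
    """
    chosen: List[str] = []
    used = 0
    for name, threads in jobs:
        if used + threads <= ncpus:
            chosen.append(name)
            used += threads
    return chosen, used


# A mix of single-thread and multi-thread jobs on a 6-CPU host.
jobs = [("mt_job", 4), ("st_1", 1), ("st_2", 1), ("mt_big", 4), ("st_3", 1)]
chosen, used = pick_jobs(jobs, ncpus=6)
print(chosen, used)
```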
GPU odds and ends
•
Default install is non-service
•
Dealing with sporadic usability
–
e.g. Remote Desktop
•
Multiple non-identical GPUs
•
GPUs and anonymous platform
Other client changes
•
Proxy auto-detection
•
Exclusive app feature
•
Don’t write state file on each checkpoint
Screensaver
•
Screensaver coordinator
–
configurable
•
New default screensaver
•
Intel screensaver
Scheduler/feeder
•
Handle multiple app versions per platform
•
Handle requests for multiple resources
–
app selection
–
completion estimate, deadline check
•
Show specific messages to users
–
“no work because you need driver version N”
•
Project-customized job check
–
jobs need different # of GPU processors
•
Mixed locality and non-locality scheduling
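The driver-version message mentioned above could come from a check like the following. This is a hypothetical sketch: the function name, the integer encoding of driver versions, and the example numbers are assumptions, not the scheduler's actual interface.

```python
from typing import Tuple


def check_gpu_job(host_driver_version: int, min_driver_version: int) -> Tuple[bool, str]:
    """Return (can_send, message_for_user) for a GPU job.

    Driver versions are assumed to be comparable integers.
    """
    if host_driver_version < min_driver_version:
        return False, f"no work because you need driver version {min_driver_version}"
    return True, ""


ok, message = check_gpu_job(host_driver_version=18000, min_driver_version=19038)
print(ok, message)
```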
Server
•
Automated DB update
•
Protect admin web interface
Manager
•
Terms of use feature
•
Show only projects supporting platform
–
need to extend for GPUs
•
Advanced view is keyboard navigable
•
Manager can read cookies (Firefox, IE)
–
web-only install
Apps
•
Enhanced wrapper
–
checkpointing, fraction done
•
PyMW: master/worker Python system
Community contributions
•
Pootle-based translation system
–
projects can use this
•
Testing
–
alpha test project
•
Packaging
–
Linux client, server packages
•
Programming
–
lots of flames, little code
What didn’t get done
•
Replace runtime system
•
Installer: deal with “standby after X minutes”
•
Global shutdown switch
Things on hold
•
BOINC on mobile devices
•
Replace Simple GUI
Important things to do
•
New system for credit and runtime estimation
–
we have a design!
•
Keep track of GPU availability separately
•
Steer computers with GPUs towards projects
with GPU apps
•
Sample CUDA app
BOINC development
•
Let us know if you want something
•
If you make changes of general utility:
–
document them
–
add them to trunk