Transcript Scope of Work - NetApp Community
Use Case: Extracting Performance data from OnCommand using APIs
Arda Oral - Professional Services Engineer
Agenda
Scope of Work
Environment
Performance Collection
Implementation – The Theory
Implementation – The Practice (Demonstration)
SLA Thresholds
Dashboard
Scope of Work
The customer wants to retrieve and store performance data from all storage controllers (NetApp and other vendors) in a common “performance database”.
The customer defines SLAs for the performance values.
SLA violations are imported into the database.
A dashboard presents the SLA violations.
Scope of Work (cont.)
OnCommand “Performance Advisor” is responsible for data collection.
Performance data is stored in an internal Sybase database.
NMSDK APIs are used to access the OnCommand performance data.
Environment
~30 NetApp storage systems
OnCommand 5 on a Windows 2008 server
Oracle 10 database on AIX 5 (performance DB)
Environment (cont.)
[Diagram]
AIX 5 server: Oracle performance DB, NMSDK 4.1
Windows 2008 server: OnCommand 5
Communication between the two servers: http, https
Performance Collection
NetApp performance data is collected by the CounterManager (CM), which resides on the storage controller.
CM groups data into objects, instances and counters.
Data can be retrieved on the storage controller with the “stats” command.
Performance Collection (cont.)
stats list objects
  (aggregate, cifs, disk, lun, volume …)
stats list instances
  object name: aggregate, instance: aggr1
  object name: system, instance: system
  object name: volume, instance: vol0
stats list counters
  object name: aggregate, counter: user_reads
  object name: system, counter: cpu_busy
  object name: lun, counter: avg_latency
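The object / instance / counter grouping can be modeled as a simple triple. The following is a minimal Python sketch, not part of the original slides; the class and function names are hypothetical, and the colon-separated form is assumed from the `stats show` example used later in this deck (`system:*:cpu_busy`).

```python
from typing import NamedTuple


class CounterSpec(NamedTuple):
    """One CounterManager counter, addressed as object:instance:counter."""
    object_name: str   # e.g. "aggregate", "system", "lun"
    instance: str      # e.g. "aggr1", "vol0", or "*" for all instances
    counter: str       # e.g. "user_reads", "cpu_busy", "avg_latency"

    def __str__(self) -> str:
        return f"{self.object_name}:{self.instance}:{self.counter}"


def parse_counter_spec(spec: str) -> CounterSpec:
    """Split an 'object:instance:counter' string into its three parts.

    Assumption: none of the three parts contains a colon itself.
    """
    parts = spec.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected object:instance:counter, got {spec!r}")
    return CounterSpec(*parts)
```

For example, `parse_counter_spec("aggregate:aggr1:user_reads").instance` yields `"aggr1"`.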
Implementation – The Theory
Install NMSDK 4.1 on the AIX 5 server
Install the required Perl modules (SSL, LWP…)
Check the NMSDK examples (basic, advanced):
../netapp-manageability-sdk-4.1/src/sample/DataFabric_Manager/API_Sample_Code/advanced/Perl/perf_counters/
Find the appropriate API: perf-get-counter-data
../netapp-manageability-sdk-4.1/doc/WebHelp/index.htm
NetApp Confidential - Internal Use Only
Implementation – The Theory (cont. 1)
perf-get-counter-data
  start-time
  end-time
  sample-rate
  time-consolidation-method
  instance-counter-info
    object-name-or-id
    counter-info
      perf-object-counter
        object-type
        counter-name
Legend (color coding not preserved in this transcript): API, object, string/int
Implementation – The Theory (cont. 2)
Object/Instance/Counter       Value
start-time                    6 hours before now
end-time                      now
sample-rate                   5 minutes
object-name-or-id             storage controller
counter-name                  cpu_busy
object-type                   system
time-consolidation-method     average
Command on storage system: stats show -i 1 system:*:cpu_busy
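To make the parameter set concrete, the request can be assembled as an XML element tree that mirrors the perf-get-counter-data hierarchy from the slides. This is a minimal Python sketch, not the NMSDK Perl code used in the project. It rests on several assumptions: times are passed as epoch seconds, the sample rate is given in seconds, the nesting follows the slide layout, and the function name and the controller name `filer01` are hypothetical. Sending the request (which NMSDK handles) is deliberately left out.

```python
import time
import xml.etree.ElementTree as ET


def build_perf_request(object_name_or_id, object_type, counter_name,
                       hours_back=6, sample_rate_s=300,
                       consolidation="average", now=None):
    """Build a perf-get-counter-data element (sketch, see assumptions above)."""
    end = int(time.time() if now is None else now)
    start = end - hours_back * 3600          # "6 hours before now"

    api = ET.Element("perf-get-counter-data")
    for tag, value in (("start-time", start),
                       ("end-time", end),
                       ("sample-rate", sample_rate_s),   # 5 minutes = 300 s
                       ("time-consolidation-method", consolidation)):
        ET.SubElement(api, tag).text = str(value)

    # instance-counter-info > counter-info > perf-object-counter
    info = ET.SubElement(api, "instance-counter-info")
    ET.SubElement(info, "object-name-or-id").text = str(object_name_or_id)
    counters = ET.SubElement(info, "counter-info")
    poc = ET.SubElement(counters, "perf-object-counter")
    ET.SubElement(poc, "object-type").text = object_type
    ET.SubElement(poc, "counter-name").text = counter_name
    return api


if __name__ == "__main__":
    # "filer01" is a made-up storage controller name.
    request = build_perf_request("filer01", "system", "cpu_busy")
    print(ET.tostring(request, encoding="unicode"))
```

With a 6-hour window and a 5-minute sample rate, a request like this would cover 72 sample intervals per counter.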
SLA Thresholds
CPU_BUSY > 90% = SLA violation
Disk_BUSY > 90% = SLA violation
LUN Latency > 20 ms = SLA violation
TARGET Queue Full = SLA violation
If 10% of the collected counter data exceed the SLA threshold, the storage system counter is flagged yellow.
If 20% of the collected counter data exceed the SLA threshold, the storage system counter is flagged red.
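The flagging rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's implementation: the function name is hypothetical, "green" is an assumed label for a counter that is not flagged, the 10% and 20% limits are treated as inclusive, and an empty sample list is treated as unflagged.

```python
def classify_counter(samples, threshold, yellow_pct=10, red_pct=20):
    """Flag a counter by the share of samples that exceed the SLA threshold."""
    if not samples:
        return "green"                       # assumption: no data, no flag
    violations = sum(1 for value in samples if value > threshold)
    # Integer comparison avoids floating-point rounding at the boundaries.
    if violations * 100 >= red_pct * len(samples):
        return "red"
    if violations * 100 >= yellow_pct * len(samples):
        return "yellow"
    return "green"


if __name__ == "__main__":
    # Ten cpu_busy samples (percent), two of them above the 90% SLA threshold.
    cpu_busy = [95, 97, 40, 55, 60, 35, 70, 20, 45, 50]
    print(classify_counter(cpu_busy, threshold=90))
```

The same function would cover the disk and LUN latency thresholds by passing `threshold=90` or `threshold=20` with the matching samples.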
Dashboard (sample output)