Scope of Work - NetApp Community


Use Case: Extracting Performance data from OnCommand using APIs

Arda Oral - Professional Services Engineer



• Scope of Work
• Environment
• Performance Collection
• Implementation – The Theory
• Implementation – The Praxis (Demonstration)
• SLA Thresholds
• Dashboard

Scope of Work

• The customer wants to retrieve and store performance data of all storage controllers (NetApp and other vendors) in their common "performance database"
• The customer defines SLAs for the performance values; SLA violations are to be imported into the database
• A dashboard presents the SLA violations


Scope of Work (cont.)

• OnCommand "Performance Advisor" is responsible for data collection
• Performance data is stored in an internal Sybase database
• NMSDK APIs are used to access the OnCommand performance data



Environment

• ~30 NetApp storage systems
• OnCommand 5 on a Windows 2008 server
• Oracle 10 database on AIX 5 (performance DB)



[Diagram: AIX 5 server (Oracle performance DB, NMSDK 4.1) and Windows 2008 server (OnCommand 5), connected via http/https]


Performance Collection

• NetApp performance data is collected by the CounterManager (CM) residing on the storage controller
• CM groups data into objects, instances and counters
• Data can be retrieved on the storage controller with the "stats" command

Performance Collection (cont.)

stats list objects

(aggregate, cifs, disk, lun, volume …)

stats list instances

object name: aggregate, instance: aggr1
object name: system, instance: system
object name: volume, instance: vol0

stats list counters

object name: aggregate, counter: user_reads
object name: system, counter: cpu_busy
object name: lun, counter: avg_latency
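To illustrate the object / instance / counter grouping, here is a minimal Python sketch that parses listing lines in the format shown above and groups the names by object. The function name `parse_stats_listing` is hypothetical, and the format is taken only from this slide; real "stats list" output may be laid out differently.

```python
import re

# One "object name: X, instance|counter: Y" pair, as printed on the slide.
# Assumption: names contain no whitespace.
PAIR = re.compile(r"object name:\s*(\S+),\s*(instance|counter):\s*(\S+)")


def parse_stats_listing(text):
    """Group instance or counter names by their object name."""
    grouped = {}
    for obj, _kind, name in PAIR.findall(text):
        grouped.setdefault(obj, []).append(name)
    return grouped


sample = (
    "object name: aggregate, counter: user_reads\n"
    "object name: system, counter: cpu_busy\n"
    "object name: lun, counter: avg_latency\n"
)
print(parse_stats_listing(sample))
```

A mapping like this could be used to decide which object-type and counter-name pairs to request later.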

Implementation – The Theory

• Install NMSDK 4.1 on the AIX 5 server
• Install the required Perl modules (SSL, LWP…)
• Check the NMSDK examples (basic, advanced) ..
  /netapp-manageability-sdk 4.1/src/sample/DataFabric_Manager/API_Sample_Code/advanced/Perl/perf_counters/
• Find the appropriate API: perf-get-counter-data ..



NetApp Confidential - Internal Use Only

Implementation – The Theory (cont. 1)

perf-get-counter-data (API)
    start-time (string/int)
    end-time (string/int)
    sample-rate (string/int)
    time-consolidation-method (string/int)
    instance-counter-info (object)
        object-name-or-id (string/int)
        counter-info (object)
            perf-object-counter (object)
                object-type (string/int)
                counter-name (string/int)
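The request structure can be sketched as a nested XML element tree. The following Python example uses only the standard library and only builds the tree; it does not talk to OnCommand and is not the NMSDK itself. It assumes the nesting instance-counter-info > counter-info > perf-object-counter; the function name `build_request`, the host name "filer01" and the value encodings (seconds) are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def build_request(start_time, end_time, sample_rate, method,
                  object_name_or_id, object_type, counter_name):
    """Build a perf-get-counter-data element tree (structure only)."""
    root = ET.Element("perf-get-counter-data")
    # Simple string/int inputs directly below the API element.
    for tag, value in (("start-time", start_time),
                       ("end-time", end_time),
                       ("sample-rate", sample_rate),
                       ("time-consolidation-method", method)):
        ET.SubElement(root, tag).text = str(value)
    # Nested objects: which instance and which counter to fetch.
    info = ET.SubElement(root, "instance-counter-info")
    ET.SubElement(info, "object-name-or-id").text = str(object_name_or_id)
    counter_info = ET.SubElement(info, "counter-info")
    counter = ET.SubElement(counter_info, "perf-object-counter")
    ET.SubElement(counter, "object-type").text = object_type
    ET.SubElement(counter, "counter-name").text = counter_name
    return root


# Hypothetical values: times and sample rate in seconds, made-up host name.
request = build_request(0, 21600, 300, "average",
                        "filer01", "system", "cpu_busy")
print(ET.tostring(request, encoding="unicode"))
```

In the setup described here, the NMSDK Perl samples would build and send the equivalent request; this sketch only makes the nesting visible.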


Implementation – The Theory (cont. 2) Object/Instance/Counter

start-time                 = 6h before now
end-time                   = now
sample-rate                = 5 minutes
object-name-or-id          = storage controller
counter-name               = cpu_busy
object-type                = system
time-consolidation-method  = average

Command on storage system: stats show -i 1 system:*:cpu_busy
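The time parameters can be derived as in the following Python sketch. It assumes, without confirmation from the slides, that start-time and end-time are passed as Unix epoch seconds and sample-rate in seconds; the helper name `collection_window` and the fixed example date are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def collection_window(now, hours_back=6, sample_minutes=5):
    """Return (start, end, sample_rate, expected_samples) for one query.

    Assumption: times are epoch seconds and sample_rate is in seconds.
    """
    start = now - timedelta(hours=hours_back)
    sample_rate = sample_minutes * 60
    expected = int((now - start).total_seconds()) // sample_rate
    return int(start.timestamp()), int(now.timestamp()), sample_rate, expected


# Fixed timestamp for a reproducible example; in practice use
# datetime.now(timezone.utc).
now = datetime(2012, 1, 1, 12, 0, tzinfo=timezone.utc)
start_ts, end_ts, rate, samples = collection_window(now)
print(end_ts - start_ts, rate, samples)  # 21600 300 72
```

With a 6 hour window and a 5 minute sample rate, one query covers 72 averaged samples per counter.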


SLA Thresholds

• CPU_BUSY > 90% = SLA violation
• Disk_BUSY > 90% = SLA violation
• LUN Latency > 20ms = SLA violation
• TARGET Queue Full = SLA violation
• If 10% of collected counter data exceed the SLA threshold, the storage system counter is flagged yellow
• If 20% of collected counter data exceed the SLA threshold, the storage system counter is flagged red
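The flagging rule can be sketched in Python as follows. The function names and the "green" result for counters below 10% are assumptions (the slide names only yellow and red), and "10%" / "20%" are read here as "at least 10%" / "at least 20%" of the collected samples.

```python
def violation_ratio(samples, threshold):
    """Fraction of samples strictly above the SLA threshold."""
    if not samples:
        return 0.0
    return sum(1 for value in samples if value > threshold) / len(samples)


def flag(samples, threshold, yellow=0.10, red=0.20):
    """Map the violation ratio to a dashboard colour."""
    ratio = violation_ratio(samples, threshold)
    if ratio >= red:
        return "red"
    if ratio >= yellow:
        return "yellow"
    return "green"  # assumption: no flag is shown as green


# Ten hypothetical cpu_busy samples, one of them above the 90% threshold.
cpu_busy = [50, 95, 60, 70, 80, 40, 30, 20, 10, 55]
print(flag(cpu_busy, 90))  # yellow
```

The same function would apply to Disk_BUSY (threshold 90) or LUN latency (threshold 20, in ms) by changing the threshold argument.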


Dashboard (sample output)