Novell Corporate Presentation Template 2007

Monitoring Your Data Center
Using Apache and Ganglia
Brad Nicholes
Sr. Software Engineer/Consultant, Novell
Member Apache Software Foundation
Agenda Ganglia Monitoring
• Introduction and Overview
• Ganglia Architecture
• Apache Web Front End
• Gmond & Gmetad
• Extending Ganglia
– GMetrics
– Gmond Module Development
• What’s New and What’s Coming
© Novell Inc. All rights reserved
Introduction and Overview
• Scalable Distributed Monitoring System
• Targeted at monitoring clusters and grids
• Multicast-based listen/announce protocol
• Depends on open standards
– XML
– XDR compact portable data transport
– RRDTool – Round Robin Database
– APR – Apache Portable Runtime
– Apache HTTPD Server
– PHP based web interface
• Ganglia version 3.1.0 release July 2008
• http://ganglia.sourceforge.net or http://www.ganglia.info
Ganglia Architecture
• Gmond – Metric gathering agent installed on individual servers
• Gmetad – Metric aggregation agent installed on one or more
specific task oriented servers
• Apache Web Front End – Metric presentation and analysis server
• Characteristics
– Multicast – All gmond nodes are capable of listening to and reporting
on the status of the entire cluster
– Failover – Gmetad has the ability to switch which cluster node it polls
for metric data
– Lightweight and low overhead metric gathering and transport
• Ported to multiple platforms (Linux, FreeBSD, Solaris, others)
Ganglia Architecture
[Diagram: Web Client, Apache Web Frontend, two GMETAD instances, and GMOND nodes grouped into Cluster 1, Cluster 2 and Cluster 3; arrows labeled Poll and Failover connect the GMETAD instances to the clusters.]
Ganglia Web Front End
• Built around Apache HTTPD server using mod_php
• Uses presentation templates so that the web site “look and feel” can be easily customized
• Presents an overview of all nodes within a grid vs all nodes in a cluster
• Ability to drill down into individual nodes
• Presents both textual and graphical views
Ganglia Customized Web Front End
Deploying Ganglia Monitoring
• Install Gmond on all monitored nodes
– Edit the configuration file
> Add cluster and host information
> Configure network udp_send_channel, udp_recv_channel, tcp_accept_channel
> Start gmond
• Installing Gmetad on an aggregation node
– Edit the configuration file
> Add data and failover sources
> Add grid name
> Start gmetad
• Installing the web front end
– Install Apache httpd server with mod_php
– Copy Ganglia web pages and PHP code to appropriate location
– Add appropriate authentication configuration for access control
• See http://ganglia.wiki.sourceforge.net/ganglia_gmond_configuration
Gmond Gathering & Gmetad Aggregation Agents
Gmond – Metric Gathering Agent
• Standard Metric Modules
– CPU, Network I/O, Disk I/O, Memory and System
• Extensible
– Loadable modules capable of gathering multiple metrics or using advanced metric gathering APIs
– Gmetric – Out-of-process utility capable of invoking command line based metric gathering scripts
• Built on the Apache Portable Runtime
– Supports Linux, FreeBSD, Solaris and more…
Gmond – Metric Gathering Agent
• Automatic discovery of nodes
– Adding a node does not require configuration file changes
– Each node is configured independently
– Each node has the ability to listen to and/or talk on the multicast channel
– Can be configured for unicast connections if desired
– Heartbeat metric determines the up/down status
• Interfaces
– Collection functions – Capable of running specialized functions for gathering metric data
– Multicast I/O – Listen/Send metric data from/to other nodes in the same cluster
– Data export listeners – Listen for client requests for cluster metric data
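The data export listener can be exercised from a short script. The sketch below assumes that gmond writes one XML document to any client that connects to its tcp_accept_channel port (8649 in the configuration example in this deck) and then closes the connection. The function names and the sample document are hypothetical; the sample is a simplified stand-in for real gmond output, and the NAME and VAL attribute names are assumptions about that output.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read everything gmond sends on its TCP export port until it closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def parse_metrics(xml_bytes):
    """Return {host name: {metric name: value string}} from a cluster dump."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for host in root.iter("HOST"):
        metrics = {m.get("NAME"): m.get("VAL") for m in host.iter("METRIC")}
        result[host.get("NAME")] = metrics
    return result


# Hypothetical, simplified sample used instead of a live gmond.
SAMPLE = b"""<GANGLIA_XML VERSION="3.1.0" SOURCE="gmond">
  <CLUSTER NAME="My Cluster">
    <HOST NAME="node1">
      <METRIC NAME="load_one" VAL="0.25"/>
      <METRIC NAME="cpu_num" VAL="4"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""

if __name__ == "__main__":
    print(parse_metrics(SAMPLE))
```

Against a running gmond, `parse_metrics(fetch_gmond_xml("some-node"))` would be the equivalent call, where the host name is whatever node you choose to query.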
Gmond – Global Configuration
• Daemonize - When “yes”, gmond will daemonize
• Setuid - When “yes”, gmond will set its effective UID to the uid of the user specified by the user attribute
• Debug_level - When set to zero (0), gmond will run normally. Greater than zero, gmond runs in the foreground and outputs debugging information
• Mute - When “yes”, gmond will not send data
• Deaf - When “yes”, gmond will not receive data
• Host_dmax - When set to zero (0), gmond will not delete a host from its list. If set to a positive number, gmond will flush a host after it has not heard from it for N seconds
• Cleanup_threshold - Minimum amount of time before gmond will clean up expired data
• Send_metadata_interval - Establishes an interval in which gmond will send or resend the metadata
Gmond – Cluster Configuration
• Name - Specifies the name of the cluster of machines
• Owner - Specifies the administrators of the cluster
• Latlong - Latitude and longitude GPS coordinates of this cluster on earth
• Url - Additional information about the cluster
Gmond – Network Configuration
• Udp_send_channel
– mcast_join, mcast_if – Multicast address and interface
– Host – Unicast host
– Port – Multicast or Unicast port
• Udp_recv_channel
– mcast_join, mcast_if, Port – Multicast address, interface and port
– Bind – Bind a particular local address
– Family – Protocol family
• Tcp_accept_channel
– Bind, Port, Interface – Bind a particular local address, listen port and interface
– Family – Protocol family
– Timeout – Request timeout
Gmond – Configuration Example
globals {
  daemonize = yes
  setuid = yes
  user = nobody
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  host_dmax = 0 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no
  send_metadata_interval = 0
  module_dir = /usr/lib/ganglia
}
cluster {
  name = "My Cluster"
  owner = "Administrator"
  latlong = "N37.37 W122.23"
  url = "http://www.moreinfo.org"
}
udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8649
  ttl = 1
}
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
}
tcp_accept_channel {
  port = 8649
}
Gmond – Access Control
• Configured in udp_recv_channel or tcp_accept_channel sections
• Examples:
– “Deny all” with exceptions:
acl {
  default = "deny"
  access {
    ip = 192.168.0.4
    mask = 32
    action = "allow"
  }
}
– “Allow all” with IPv4 & IPv6 exceptions:
acl {
  default = "allow"
  access {
    ip = 192.168.0.0
    mask = 24
    action = "deny"
  }
  access {
    ip = ::ff:1.2.3.0
    mask = 120
    action = "deny"
  }
}
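To make the effect of the two acl blocks concrete, here is a minimal Python sketch of how a default action combined with ip/mask exceptions could be evaluated. It is an illustration, not gmond's implementation: the dictionary layout and the `acl_action` function are hypothetical, and the rule that the first matching access entry wins is an assumption.

```python
import ipaddress


def acl_action(acl, address):
    """Return "allow" or "deny" for an address under an acl description.

    Assumption: the first access entry whose ip/mask contains the address
    decides the outcome; otherwise the default applies.
    """
    addr = ipaddress.ip_address(address)
    for entry in acl["access"]:
        network = ipaddress.ip_network(
            "%s/%d" % (entry["ip"], entry["mask"]), strict=False
        )
        # An IPv4 address is never "in" an IPv6 network and vice versa.
        if addr in network:
            return entry["action"]
    return acl["default"]


# The two examples from the slide, expressed as plain dictionaries.
DENY_ALL = {
    "default": "deny",
    "access": [{"ip": "192.168.0.4", "mask": 32, "action": "allow"}],
}
ALLOW_ALL = {
    "default": "allow",
    "access": [
        {"ip": "192.168.0.0", "mask": 24, "action": "deny"},
        {"ip": "::ff:1.2.3.0", "mask": 120, "action": "deny"},
    ],
}

if __name__ == "__main__":
    for acl, addr in [(DENY_ALL, "192.168.0.4"), (ALLOW_ALL, "192.168.0.77")]:
        print(addr, acl_action(acl, addr))
```

With a mask of 32 only the single host 192.168.0.4 is let through the first acl, while a mask of 24 in the second acl blocks every address from 192.168.0.0 to 192.168.0.255.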
Gmond – Metric Collection Groups
• Specify as many collection groups as you like
• Each collection group must contain at least one metric section
• List available metrics by invoking “gmond -m”
• Collection_group section:
– Collect_once – Specifies that the group contains static metrics, which are collected only once
– Collect_every – Collection interval (only valid for non-static)
– Time_threshold – Max data send interval
• Metric section:
– Name – Metric name (see “gmond -m”)
– Value_threshold – Metric variance threshold (send if exceeded)
– Title – Optional user friendly title displayed in the web interface
Gmond – Configuration Example
collection_group {
  collect_once = yes
  time_threshold = 20
  metric {
    name = "heartbeat"
  }
}
collection_group {
  collect_once = yes
  time_threshold = 1200
  metric {
    name = "cpu_num"
    title = "CPU Count"
  }
  metric {
    name = "cpu_speed"
    title = "CPU Speed"
  }
  metric {
    name = "mem_total"
    title = "Memory Total"
  }
  metric {
    name = "swap_total"
    title = "Swap Total"
  }
  …
}
collection_group {
  collect_every = 20
  time_threshold = 90
  metric {
    name = "load_one"
    value_threshold = "1.0"
    title = "One Minute Load Average"
  }
  metric {
    name = "load_five"
    value_threshold = "1.0"
    title = "Five Minute Load Average"
  }
  …
}
collection_group {
  collect_every = 80
  time_threshold = 950
  metric {
    name = "proc_run"
    value_threshold = "1.0"
    title = "Running Processes"
  }
  metric {
    name = "proc_total"
    value_threshold = "1.0"
    title = "Total Processes"
  }
}
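The interplay of collect_every, time_threshold and value_threshold can be sketched in a few lines of Python. This is a simplified model based only on the slide descriptions (a value is sent when it changes by more than value_threshold, or when time_threshold seconds have passed since the last send); it is not gmond's actual scheduler, and both function names are hypothetical.

```python
def should_send(now, last_sent, last_value, value,
                time_threshold, value_threshold=None):
    """Decide whether a freshly collected value should be sent."""
    if last_sent is None:
        return True  # nothing has been sent yet
    if now - last_sent >= time_threshold:
        return True  # maximum send interval reached
    if value_threshold is not None and abs(value - last_value) > value_threshold:
        return True  # value moved by more than the variance threshold
    return False


def simulate(samples, collect_every, time_threshold, value_threshold):
    """Collect one sample every collect_every seconds; return what is sent."""
    sent = []
    last_sent = last_value = None
    for i, value in enumerate(samples):
        now = i * collect_every
        if should_send(now, last_sent, last_value, value,
                       time_threshold, value_threshold):
            sent.append((now, value))
            last_sent, last_value = now, value
    return sent


if __name__ == "__main__":
    # Settings of the load_one metric in the configuration example.
    readings = [0.5, 0.6, 0.7, 2.5, 2.6, 2.4, 2.5, 2.6, 2.5]
    print(simulate(readings, collect_every=20, time_threshold=90,
                   value_threshold=1.0))
```

In this model nine collected samples result in only three sends: the first sample, the jump from 0.5 to 2.5, and one refresh once 90 seconds have elapsed without a send.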
Gmetad – Metric Aggregation Agent
• Polls a designated cluster node for the status of the entire cluster
– Data collection thread per cluster
– Ability to poll gmond or another gmetad for metric data
• Failover capability
• RRDTool – Storage and trend graphing tool
– Defines fixed size databases that hold data of various granularity
– Capable of rendering trending graphs from the smallest granularity to the largest (e.g. last hour vs. last year)
– Never grows larger than the predetermined fixed size
– Database granularity is configurable through gmetad.conf
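The failover idea can be illustrated with a small Python sketch: given the list of nodes configured for a cluster, try each one until a poll succeeds. The function name, the `fetch` callback and the host names are hypothetical, and trying the nodes strictly in listed order is an assumption about gmetad's behavior rather than something stated on the slide.

```python
def poll_with_failover(sources, fetch):
    """Try each (host, port) in turn and return the first successful poll.

    `fetch` is any callable taking (host, port) and returning the polled
    data; it should raise OSError when the node cannot be reached.
    """
    errors = []
    for host, port in sources:
        try:
            return (host, port), fetch(host, port)
        except OSError as exc:
            errors.append(((host, port), exc))  # remember and fail over
    raise ConnectionError("no source responded: %r" % (errors,))


def fake_fetch(host, port):
    """Stand-in for a real network poll: only node-b is reachable."""
    if host != "node-b":
        raise OSError("%s:%d is down" % (host, port))
    return "<GANGLIA_XML/>"


if __name__ == "__main__":
    nodes = [("node-a", 8649), ("node-b", 8649), ("node-c", 8649)]
    print(poll_with_failover(nodes, fake_fetch))
```

Because every gmond node can report on the whole cluster, any node that answers is sufficient, which is why the first successful poll is returned immediately.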
Gmetad - Configuration
• Data source and failover designations
– data_source "my cluster" [polling interval] address1:port address2:port ...
• RRD database storage definition
– RRAs "RRA:AVERAGE:0.5:1:244" "RRA:AVERAGE:0.5:24:244" "RRA:AVERAGE:0.5:168:244" "RRA:AVERAGE:0.5:672:244" "RRA:AVERAGE:0.5:5760:374"
• Access control
– Trusted_hosts address1 address2 … DN1 DN2 …
– All_trusted OFF/on
• RRD files location
– rrd_rootdir "/var/lib/ganglia/rrds"
• Network
– xml_port 8651
– Interactive_port 8652
Gmetad – Configuration Example
data_source "my cluster" 10 localhost my.machine.edu:8649 1.2.3.5:8655
data_source "my grid" 50 1.3.4.7:8655 grid.org:8651 grid-backup.org:8651
data_source "another source" 1.3.4.7:8655 1.3.4.8
trusted_hosts 127.0.0.1 169.229.50.165 my.gmetad.org
xml_port 8651
interactive_port 8652
rrd_rootdir "/var/lib/ganglia/rrds"
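The data_source lines above pack a name, an optional polling interval and a list of addresses into one line. The Python sketch below shows one way to read that layout. It is not gmetad's parser; `parse_data_source` is a hypothetical helper, and the defaults used when the interval or a port is omitted (15 seconds and port 8649) are assumptions to be checked against your gmetad.conf.

```python
import shlex

DEFAULT_INTERVAL = 15   # assumed default polling interval in seconds
DEFAULT_PORT = 8649     # assumed default gmond port


def parse_data_source(line):
    """Split a data_source line into (name, interval, [(host, port), ...])."""
    tokens = shlex.split(line)  # honors the quoted source name
    if len(tokens) < 3 or tokens[0] != "data_source":
        raise ValueError("not a data_source line: %r" % line)
    name, rest = tokens[1], tokens[2:]
    interval = DEFAULT_INTERVAL
    if rest[0].isdigit():       # optional polling interval
        interval, rest = int(rest[0]), rest[1:]
    hosts = []
    for token in rest:
        host, sep, port = token.partition(":")
        hosts.append((host, int(port) if sep else DEFAULT_PORT))
    if not hosts:
        raise ValueError("data_source needs at least one address")
    return name, interval, hosts


if __name__ == "__main__":
    print(parse_data_source(
        'data_source "my cluster" 10 localhost my.machine.edu:8649 1.2.3.5:8655'))
    print(parse_data_source('data_source "another source" 1.3.4.7:8655 1.3.4.8'))
```

Addresses after the first one act as the failover sources for that cluster or grid.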
Round-Robin Database Storage
Round-Robin Database (RRD)
• High performance data logging and graphing system for time series data
• Automatic data consolidation over time
– Define various Round-Robin Archives (RRA) which hold data points at decreasing levels of granularity
– Multiple data points from a more granular RRA are automatically consolidated and added to a coarser RRA
• Constant and predictable data storage size
– Old data is eliminated as new data is added to the RRD file
– Amount of storage required is defined at the time the RRD file is created
• RRDTool Web site: http://oss.oetiker.ch/rrdtool/
Ganglia Default RRD Definition
• Definition of the Round-Robin Database format is determined at database creation time
• Default Ganglia RRA definitions:
– RRA #1 – 15 second average for 61 minutes
– RRA #2 – 6 minute average for 24.4 hours
– RRA #3 – 42 minute average for 7.1 days
– RRA #4 – 2.8 hour average for 28.5 days
– RRA #5 – 24 hour average for 374 days
• Default largest retrievable time series, ~1 year
• Configurable to whatever you want
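The five durations follow directly from the default RRAs strings in gmetad.conf. An RRA definition has the form RRA:CF:xff:steps:rows, so each row covers steps × step seconds and the archive spans rows × that resolution. The Python sketch below redoes the arithmetic, assuming a base step of 15 seconds (inferred from the "15 second average" of RRA #1); `describe_rra` is a hypothetical helper name.

```python
STEP_SECONDS = 15  # assumed base step, inferred from RRA #1

DEFAULT_RRAS = [
    "RRA:AVERAGE:0.5:1:244",
    "RRA:AVERAGE:0.5:24:244",
    "RRA:AVERAGE:0.5:168:244",
    "RRA:AVERAGE:0.5:672:244",
    "RRA:AVERAGE:0.5:5760:374",
]


def describe_rra(rra, step_seconds=STEP_SECONDS):
    """Return (consolidation function, seconds per row, total seconds kept)."""
    _tag, cf, _xff, steps, rows = rra.split(":")
    resolution = int(steps) * step_seconds  # time covered by one row
    span = resolution * int(rows)           # time covered by the archive
    return cf, resolution, span


if __name__ == "__main__":
    for number, rra in enumerate(DEFAULT_RRAS, start=1):
        cf, resolution, span = describe_rra(rra)
        print("RRA #%d: %s over %d s per row, %.1f hours (%.1f days) kept"
              % (number, cf, resolution, span / 3600.0, span / 86400.0))
```

For example, RRA #3 consolidates 168 × 15 s = 2520 s (42 minutes) per row and keeps 244 rows, which is 614,880 s or about 7.1 days.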
Retrieving Data, Generating Graphs and Interacting with RRD Files
• RRDFetch – Retrieve time series data from an RRD file for a specific time period
• RRDInfo – Print header data from an RRD file in a parsing friendly format
• RRDGraph – Creates a graphical representation of the specified time series data
• RRDUpdate – Feed new data values into an RRD file
• Other APIs – RRDCreate, RRDDump, RRDFirst, RRDLast, RRDLastupdate, RRDResize, …
Extending the Ganglia Monitoring System
Gmetric Service Level Metrics Utility
• Extends the available metrics that can be produced through Gmond
• Ability to run specialized metric gathering scripts
• Pushes metric data back through Gmond
• Must be scheduled through cron rather than Gmond
• Gmetric repository on Ganglia project site
– http://ganglia.sourceforge.net/gmetric/
Gmetric Command Line
gmetric --conf=./custom.conf -n "wow" -v "it works" -t "string"
Usage: gmetric [OPTIONS]...
  -h, --help          Print help and exit
  -V, --version       Print version and exit
  -c, --conf=STRING   The configuration file to use for finding send channels
                      (default=`/etc/gmond.conf')
  -n, --name=STRING   Name of the metric
  -v, --value=STRING  Value of the metric
  -t, --type=STRING   Either
                      string|int8|uint8|int16|uint16|int32|uint32|float|double
  -u, --units=STRING  Unit of measure for the value e.g. Kilobytes, Celcius
                      (default=`')
  -s, --slope=STRING  Either zero|positive|negative|both (default=`both')
  -x, --tmax=INT      The maximum time in seconds between gmetric calls
                      (default=`60')
  -d, --dmax=INT      The lifetime in seconds of this metric (default=`0')
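A script run from cron can wrap the gmetric call. The Python sketch below builds the command line from the long options listed in the usage text and hands it to `subprocess`. The metric name `queue_depth`, its value and the helper names are hypothetical, and it assumes the gmetric binary is on the PATH of the cron job.

```python
import subprocess


def gmetric_command(name, value, metric_type, units="", conf=None):
    """Build the argument list for one gmetric invocation."""
    cmd = [
        "gmetric",
        "--name=%s" % name,
        "--value=%s" % value,
        "--type=%s" % metric_type,
    ]
    if units:
        cmd.append("--units=%s" % units)
    if conf:
        cmd.append("--conf=%s" % conf)
    return cmd


def publish(name, value, metric_type, units="", conf=None):
    """Run gmetric; raises CalledProcessError if it exits non-zero."""
    subprocess.run(gmetric_command(name, value, metric_type, units, conf),
                   check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # on machines without Ganglia installed.
    print(gmetric_command("queue_depth", 42, "uint32", units="jobs"))
```

Passing the arguments as a list avoids shell quoting problems when a metric value contains spaces, as in the "it works" example above.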
Gmond Pluggable Metric Modules
• Extends the available metrics that can be gathered by Gmond
• Implemented as dynamically loadable modules
• Configured through gmond.conf
• Scheduled through Gmond rather than an external scheduler
• Module structure is similar to an Apache module
• Able to produce multiple metrics from a single module
Gmond Module Development
• Three callback interfaces
– Init
int (*ex_metric_init)(apr_pool_t *p);
– Clean up
void (*ex_metric_cleanup)(void);
– Handler
g_val_t (*ex_metric_handler)(int metric_index);
• Metric definition structure
mmodule example_module =
{
    STD_MMODULE_STUFF,  // Internal module definition
    ex_metric_init,     // Metric init callback function
    ex_metric_cleanup,  // Metric cleanup callback function
    ex_metric_info,     // Metric info data structure
    ex_metric_handler,  // Metric handler
};
Gmond Example Module
#include <stdlib.h>
#include <time.h>
#include <gm_metric.h>

/* Forward declaration so the callbacks can refer to the module */
mmodule example_module;

static int ex_metric_init(apr_pool_t *p)
{
    /* Parameters configured for this module in gmond.conf (unused here) */
    apr_array_header_t *list_params =
        example_module.module_params_list;

    srand(time(NULL) % 99);
    return 0;
}

static void ex_metric_cleanup(void)
{
}

/* metric_index is the position of the metric in ex_metric_info[] */
static g_val_t ex_metric_handler(int metric_index)
{
    g_val_t val;

    switch (metric_index) {
    case 0:
        val.uint32 = rand() % 99;
        return val;
    case 1:
        val.uint32 = 50;
        return val;
    }
    /* default case */
    val.uint32 = 0;
    return val;
}

static const Ganglia_25metric ex_metric_info[] =
{
    {0, "Random_Numbers", 90,
     GANGLIA_VALUE_UNSIGNED_INT, "s", "both",
     "%u", UDP_HEADER_SIZE+8,
     "Example module metric (random numbers)"},
    {0, "Constant_Number", 90,
     GANGLIA_VALUE_UNSIGNED_INT, "Num", "zero",
     "%u", UDP_HEADER_SIZE+8,
     "Example module metric (constant number)"},
    {0, NULL}
};

mmodule example_module =
{
    STD_MMODULE_STUFF,
    ex_metric_init,
    ex_metric_cleanup,
    ex_metric_info,
    ex_metric_handler,
};
Gmond Example Module Configuration
modules {
  module {
    name = "example_module"
    path = "/usr/lib/ganglia/modexample.so"
    Param RandomMax {
      Value = 75
    }
    Param ConstantValue {
      Value = 25
    }
  }
}
/* Define Collection Groups */
collection_group {
  collect_every = 10
  time_threshold = 50
  metric {
    name = "Random_Numbers"
    title = "Random Number Metric"
    value_threshold = 30.0
  }
}
collection_group {
  collect_once = yes
  time_threshold = 20
  metric {
    name = "Constant_Number"
    title = "Constant Number Metric"
  }
}
Gmond Python Module Development
• Extends the available metrics that can be gathered by Gmond
• Configured through the Gmond configuration file
• Python module interface is similar to the C module interface
• Ability to save state within the script vs. a persistent data store
• Larger footprint but easier to implement new metrics
Gmond Python Module Development
• Three mandatory functions
– metric_init(params)
> Called once at module initialization time
> Must return a metric description dictionary or list of dictionaries
> Any other module initialization can also take place here
– metric_handler(name) – may have multiple handlers
> Metric gathering handler
> Must return a single data value of the same type as specified in the metric description dictionary returned by metric_init() function
– metric_cleanup()
> Called once at module termination time
> Does not return a value
Gmond Python Module Development
• Metric definition data dictionary
d = {'name': '<your_metric_name>',
     'call_back': <call_back function>,
     'time_max': int(<your_time_max>),
     'value_type': '<string | uint | float | double>',
     'units': '<your_units>',
     'slope': '<zero | positive | negative | both>',
     'format': '<your_format>',
     'description': '<your_description>',
     'groups': '<group names>'}
• Can be a single dictionary or a list of dictionaries
• Must be returned from the metric_init() function
Gmond Python Module Development
# Module state is kept in globals between handler calls
Curve_Max = 15
v = int(1)
inc = int(1)
count = 0

def metric_init(params):
    global Curve_Max
    # Optional parameter supplied through the gmond configuration
    if 'CurveMax' in params:
        Curve_Max = int(params['CurveMax'])
    d = {'name': 'Curve_Metric',
         'call_back': curve_handler,
         'time_max': int(60),
         'value_type': 'uint',
         'units': 'Seconds',
         'slope': 'both',
         'format': '%u',
         'description': 'Shows a uniform curve',
         'groups': 'Examples'}
    return d

def curve_handler(name):
    global v, count, inc, Curve_Max
    v += inc
    count += 1
    # Reverse direction after more than Curve_Max steps
    if count > Curve_Max:
        count = 0
        inc = -inc
    return int(v)

def metric_cleanup():
    pass
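Before deploying a module it can help to call its three functions by hand, outside gmond. The Python sketch below is a hypothetical test harness, not part of Ganglia: `exercise_module` calls metric_init(), invokes each call_back a few times and then calls metric_cleanup(). The stand-in module with a single constant metric is also made up for the example; a real module file would be imported instead.

```python
import types


def exercise_module(module, params, samples=3):
    """Call a metric module the way the slides describe gmond doing it."""
    descriptors = module.metric_init(params)
    if isinstance(descriptors, dict):   # a single dictionary is allowed
        descriptors = [descriptors]
    readings = {}
    for d in descriptors:
        handler = d["call_back"]
        readings[d["name"]] = [handler(d["name"]) for _ in range(samples)]
    module.metric_cleanup()
    return readings


# Hypothetical stand-in for an imported module file.
def _init(params):
    value = int(params.get("ConstantValue", 50))
    return {"name": "Constant_Number",
            "call_back": lambda name: value,
            "time_max": 90,
            "value_type": "uint",
            "units": "Num",
            "slope": "zero",
            "format": "%u",
            "description": "Constant number metric",
            "groups": "Examples"}


demo_module = types.SimpleNamespace(metric_init=_init,
                                    metric_cleanup=lambda: None)

if __name__ == "__main__":
    print(exercise_module(demo_module, {"ConstantValue": "25"}))
```

A check like this catches missing dictionary keys or handlers that raise exceptions before the module is loaded into a running gmond.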
Gmond Python Module Deployment
• Copy the .py file to the specified directory
– The python modules directory is defined in the gmond.conf file
• Start Gmond using the -m parameter
– Shows a list of all available metrics known to Gmond
– The python based metric should be in the list
• Add the new python metric to a collection group just like any other metric
• Restart Gmond
Configuring Gmond for Python
• Must load the modpython.so pluggable module
modules {
  module {
    name = "python_module"
    path = "/usr/lib/ganglia/modpython.so"
    params = "/usr/lib/ganglia/python_modules"
  }
}
• Must specify a python module path
– The 'params' directive specifies the python module path
– Mod_python will automatically load any .py module found in the specified path
• Recommend including the python metric module .pyconf files from within the same .conf file that loads the python support module
– include ('/etc/ganglia/conf.d/*.pyconf')
What’s New and What’s Coming
• New metric modules
– Track individual CPUs
– Track individual logical disks
– Track TCP connections and status
• Python version of Gmetad
– Provides a pluggable module interface
– Modules can modify how metrics are stored
– Modules can be written to analyze metrics and produce events
• Ability to enable/disable modules using a configuration directive
• Pluggable web views
• Spoofing modules – modules that can report metrics on behalf of another host
Questions
General Disclaimer
This document is not to be construed as a promise by any participating company to develop,
deliver, or market a product. It is not a commitment to deliver any material, code, or functionality,
and should not be relied upon in making purchasing decisions. Novell, Inc. makes no
representations or warranties with respect to the contents
of this document, and specifically disclaims any express or implied warranties of merchantability
or fitness for any particular purpose. The development, release, and timing of features or
functionality described for Novell products remains at the sole discretion of Novell. Further,
Novell, Inc. reserves the right to revise this document and to make changes to its content, at any
time, without obligation to notify any person or entity of such revisions or changes. All Novell
marks referenced in this presentation are trademarks or registered trademarks of Novell, Inc. in
the United States and other countries. All third-party trademarks are the property of their
respective owners.