Transcript Document

Rational Configuration Design
To Prevent Irrational Problem Solving
John Murphy
Introduction
Basic: Contacts, Hosts, Services
Advanced: Parents and dependencies, Managing exceptions, Automation
2012
Our Scenario
Contacts
Contact: Contact address for support. Email, SMS, ticketing, etc.
User: Login account for an actual user. No contact information.
Contacts
Contact Definition
define contact {
    contact_name     cu-contact
    contactgroups    cg-main
    email            [email protected]
    use              contact-user
}
define contact {
    name                             contact-user
    host_notifications_enabled       1
    service_notifications_enabled    1
    host_notification_period         24x7
    service_notification_period      24x7
    host_notification_options        d,u
    service_notification_options     c
    host_notification_commands       notify-h-email
    service_notification_commands    notify-s-email
    register                         0
}
define contactgroup {
    contactgroup_name       cg-main
    alias                   Kmart Contact
    contactgroup_members    vg-team
}
Contacts
User Definition
define contact {
    contact_name     vu-jsmurphy
    contactgroups    vg-team
    use              read-contact
}
define contactgroup {
    contactgroup_name    vg-team
    alias                Kmart Team
}
define contact {
    name                             read-contact
    host_notifications_enabled       0
    service_notifications_enabled    0
    host_notification_period         none
    service_notification_period      none
    host_notification_options        n
    service_notification_options     n
    host_notification_commands       check_none
    service_notification_commands    check_none
    register                         0
}
define contactgroup {
    contactgroup_name       cg-main
    alias                   Kmart Contact
    contactgroup_members    vg-team
}
Contacts
LDAP/AD For Nagios Core
ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"
<Directory "/usr/local/nagios/sbin">
SetEnv TZ "Australia/Melbourne"
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Core"
AuthType Basic
# AuthUserFile /usr/local/nagios/etc/htpasswd.users
# Require valid-user
AuthBasicProvider ldap
AuthName "Nagios server"
AuthzLDAPAuthoritative off
AuthLDAPBindDN "CN=bindAccount,OU=User,DC=domain,DC=com"
AuthLDAPBindPassword xxxxxxxxx
AuthLDAPURL ldaps://domain.com/OU=User,DC=Domain,DC=com?sAMAccountName?sub?(objectClass=user)
AuthLDAPGroupAttribute member
AuthLDAPGroupAttributeIsDN on
Require ldap-group CN=NagiosAccessGroup,OU=Groups,DC=domain,DC=com
</Directory>
Contacts Summary
Distinguish between your users and your contacts.
Use an existing authentication source for your user logins.
Consider the end-user experience… try to ensure it’s easy for users to get the information they need.
Hosts
Focus on minimizing host configuration to make automation easier.
Use templates to assign user view information.
Create host groups based on shared monitoring profiles.
Hosts
Host Definitions
define host {
    host_name     exchange01
    use           srv-template
    alias         Exchange server
    address       exchange01
    parents       switch001,switch002
    hostgroups    srv-exchange,srv-windows
    icon_image    exchange.png
    register      1
}
define host {
    name                     srv-template
    alias                    Server host template
    check_command            check_icmp!250.0,60%!500.0,80%
    max_check_attempts       3
    check_interval           10
    retry_interval           2
    check_period             24x7
    contact_groups           cg-main
    notification_interval    60
    notification_period      24x7
    notification_options     d,f
    notifications_enabled    1
    register                 0
}
define hostgroup {
    hostgroup_name    srv-windows
    alias             Windows group
}
Hosts Summary
Minimize configuration in host objects to make automation easier.
Hostnames allow for easier maintenance than IP addresses.
Create logical host groupings that make service assignment easier, e.g. by OS type, location, or the applications a host serves.
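One way to see why minimal host objects help automation: when everything shared lives in a template, a host can be generated from a handful of values. The sketch below is only an illustration under that assumption. The function name `render_host` and its parameters are hypothetical, and it reuses the `srv-template` template and the example host from the Host Definitions slide.

```python
def render_host(host_name, hostgroups, parents=(), template="srv-template"):
    """Render a minimal Nagios host definition as text (hypothetical helper)."""
    fields = [
        ("host_name", host_name),
        ("use", template),
        # Use the hostname as the address, since hostnames are easier to maintain.
        ("address", host_name),
        ("hostgroups", ",".join(hostgroups)),
    ]
    if parents:
        fields.append(("parents", ",".join(parents)))
    body = "\n".join(f"    {key:<12}{value}" for key, value in fields)
    return "define host {\n" + body + "\n}\n"


# Example: the Exchange server from the Host Definitions slide.
print(render_host("exchange01", ["srv-exchange", "srv-windows"],
                  ["switch001", "switch002"]))
```

Everything else (check command, intervals, contact groups) stays in the template, so a change to the baseline never touches the generated host objects.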
Services
Keep services as generic as possible to prevent the need for duplicate services.
Minimizing service templates allows for easier management and baseline changes.
Use service groups for applications.
Services
Service Definitions
define service {
    service_description    Windows C: usage
    use                    main-service-template
    hostgroup_name         srv-windows,srv-v-windows
    check_command          check_nt!USEDDISKSPACE!-w 80 -c 90
    contact_groups         cg-main,cg-main-SMS
    register               1
}
define service {
    name                     main-service-template
    service_description      main service template
    max_check_attempts       3
    check_interval           10
    retry_interval           2
    check_period             24x7
    notification_interval    60
    notification_period      24x7
    notification_options     c
    register                 0
}
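To make the template relationship concrete, the sketch below merges a service with the template named in its `use` directive, so the effective settings can be inspected. It is a simplified model and not how Nagios itself is implemented: it assumes a single template per `use` (Nagios also accepts a comma-separated list) and ignores additive `+` values. The function name `resolve` and the dictionary representation are hypothetical.

```python
def resolve(obj, templates):
    """Merge an object's template chain into one flat dict (simplified model).

    Values set on the object win over values inherited through `use`.
    """
    merged = {}
    parent = obj.get("use")
    if parent is not None:
        merged.update(resolve(templates[parent], templates))
    merged.update(obj)
    # `use` and `name` describe the definition itself and are not inherited.
    for key in ("use", "name"):
        merged.pop(key, None)
    # `register` is never inherited; an object is registered unless it says otherwise.
    merged["register"] = obj.get("register", "1")
    return merged


templates = {
    "main-service-template": {
        "name": "main-service-template",
        "service_description": "main service template",
        "max_check_attempts": "3",
        "check_interval": "10",
        "retry_interval": "2",
        "check_period": "24x7",
        "notification_interval": "60",
        "notification_period": "24x7",
        "notification_options": "c",
        "register": "0",
    }
}
service = {
    "service_description": "Windows C: usage",
    "use": "main-service-template",
    "hostgroup_name": "srv-windows,srv-v-windows",
    "check_command": "check_nt!USEDDISKSPACE!-w 80 -c 90",
    "contact_groups": "cg-main,cg-main-SMS",
    "register": "1",
}
effective = resolve(service, templates)
```

Here `effective` keeps the service's own description and check command while picking up the intervals and notification settings from the template, which is why a baseline change only needs to be made in one place.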
The puzzle completed
Services Summary
Strike a balance between your service templates and your service definitions.
Service groups are a very useful feature when used appropriately; used inappropriately, they are an administrative burden.
Device life-cycle happens; ensure your configuration isn’t burdened by over-complexity.
Advanced
Good Parenting (or how to not get woken up 20 times at ~3am)
Parenting: Use host parenting. Use host parenting. Use host parenting.
Service Dependencies: Parent indirectly monitored services with service dependencies.
Indirect Services
…And the art of dependencies
A typical ESX monitoring setup…
Q. But what happens when the vSphere server fails?
A. Something like this
define service {
    host_name              vSphereServer
    service_description    Ping dependency
    use                    main-service-template
    check_command          check_ping!100,80%!200,90%
    register               1
}
define service {
    service_description    CPU Usage
    use                    main-service-template
    hostgroup_name         srv-v-windows
    check_command          check_esx!CPU
    contact_groups         cg-main
    register               1
}
define servicedependency {
    dependent_hostgroup_name         srv-v-windows
    dependent_service_description    CPU Usage
    host_name                        vSphereServer
    service_description              Ping dependency
    inherits_parent                  1
    execution_failure_criteria       w,u,c,p
    notification_failure_criteria    w,u,c
    dependency_period                24x7
}
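The failure criteria in the dependency above decide when the dependent "CPU Usage" checks and notifications are held back. The sketch below models just that one rule: the dependent service is blocked when the master service ("Ping dependency" on vSphereServer) is in one of the listed states. It is a deliberately simplified illustration that ignores hard versus soft states, `inherits_parent`, and `dependency_period`. The names `STATE_CODES` and `blocked_by_master` are hypothetical.

```python
# Letters used in failure criteria: o = OK, w = WARNING, u = UNKNOWN,
# c = CRITICAL, p = PENDING.
STATE_CODES = {"OK": "o", "WARNING": "w", "UNKNOWN": "u",
               "CRITICAL": "c", "PENDING": "p"}


def blocked_by_master(master_state, failure_criteria):
    """Return True if the master service's state matches the failure criteria."""
    options = {opt.strip() for opt in failure_criteria.split(",")}
    return STATE_CODES[master_state] in options


# With notification_failure_criteria w,u,c a failed ping to the vSphere
# server suppresses CPU Usage notifications, while a pending one does not.
suppress_when_critical = blocked_by_master("CRITICAL", "w,u,c")
suppress_when_pending = blocked_by_master("PENDING", "w,u,c")
# With execution_failure_criteria w,u,c,p the check is also skipped while pending.
skip_check_when_pending = blocked_by_master("PENDING", "w,u,c,p")
```

This is what keeps a single vSphere outage from turning into one alert per virtual machine.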
Managing Exceptions
Clearly label exceptions in your config.
Make sure you can use the same solution again if necessary.
Image by Mike Bade: http://robotseatingpies.blogspot.com.au/2011/06/robotsdont-have-feelings_16.html
Automation (or intrapreneurship ideas for the lazy)
Every piece of infrastructure is a potential data source… make use of it!
AD/LDAP servers.
Virtual infrastructure APIs.
Patching systems.
Asset databases.
Network management platforms.
Network LLDP/CDP tables.
SNMP-enabled servers.
Help, I’m running out of space!
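As one example of treating infrastructure as a data source, neighbour tables such as those from LLDP/CDP could be used to work out host `parents` automatically. The sketch below assumes the neighbour data has already been collected into simple pairs of device names. That input format, the function name `derive_parents`, and the `nagios` node name are all hypothetical. A host's parents are taken to be its neighbours that sit one hop closer to the monitoring server.

```python
from collections import deque


def derive_parents(links, monitor):
    """Return {host: sorted parent list} from undirected neighbour links."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Breadth-first search gives each reachable host its hop distance
    # from the monitoring server.
    distance = {monitor: 0}
    queue = deque([monitor])
    while queue:
        node = queue.popleft()
        for peer in neighbours.get(node, ()):
            if peer not in distance:
                distance[peer] = distance[node] + 1
                queue.append(peer)

    # Parents are the neighbours exactly one hop closer to the monitor.
    return {
        host: sorted(p for p in neighbours[host]
                     if distance[p] == distance[host] - 1)
        for host in distance
        if host != monitor
    }


links = [("nagios", "switch001"), ("nagios", "switch002"),
         ("switch001", "exchange01"), ("switch002", "exchange01")]
parents = derive_parents(links, "nagios")
```

With these example links, `exchange01` ends up with both switches as parents, matching the host definition shown earlier. Hosts that are not reachable from the monitoring server are left out rather than guessed at.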
Q&A
Thanks For Listening!