TNG Alert- Ingres LogFile Full


TNG Alert- Ingres LogFile Full !
Dennis Adams & Keith McLennan
BNP Paribas
Or...
How to build a simple
messaging interface to TNG
with emphasis on “simple” !
Health Warning
 This is not intended to be a “build your own
TNG Agent” course
 It is not a replacement for TNG Agents or
monitoring tools.
 No graphics or trend analysis
 Uses “quick and dirty” approach.
 Need to have some understanding of Unix
Shell Scripting
Why we started this
 Collecting Data from Ingres already using
Monitoring Tools
 Already using Unicenter/TNG as the Alert
Console for Operators
 Didn’t want “yet another console” on the
Operators Bridge.
 Why not send messages to TNG instead ?
 KISS !
Task Breakdown
 Ingres
– Custom script to monitor the Ingres Log file
– Has a threshold been exceeded ?
– Should we send a message to TNG ?
– Format Message and send to the TNG console
 Unicenter
– Configure TNG Console to receive messages
– Incoming Messages create icons in “Unispace”
– Define Message Actions to display icons
Ingres Setup
Monitoring Ingres Log File
 Use Standard Ingres utilities piped through
grep and awk.
 Be aware of Ingres version differences...
# The field that holds the percentage differs between Ingres versions
if [ -f "$II_SYSTEM/ingres/bin/cbf" ]
then
    PERCENT=`logstat | grep Percent | awk '{print $9}'`
else
    PERCENT=`logstat | grep Percent | awk '{print $7}'`
fi
echo "Log Usage is $PERCENT"
Monitoring Ingres Log File (2)
 Using the IMA (Ingres II only)
 then parse the output !
sql -u'$ingres' imadb
execute procedure ima_set_vnode_domain \g
select * from ima_lgmo_lfb\g
\quit
Scheduling Monitoring
 Crontab scheduling as a last resort.
 Ensure environment is set up OK.
– II_SYSTEM=
– PATH=
– LD_LIBRARY_PATH=
– export II_SYSTEM LD_LIBRARY_PATH PATH
 Run a crontab under Ingres, or use a valid
ingres user
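Cron starts jobs with a very small environment, so a wrapper script can check the Ingres variables before logstat is called. The following is a minimal sketch only: require_env is a hypothetical helper (not part of Ingres or TNG), and the script path and schedule in the comment are assumptions.

```shell
# Hypothetical crontab entry for the ingres user (path and schedule assumed):
#   0,15,30,45 * * * * /usr/local/bin/ingres_log_check.sh

# require_env NAME...: succeed only if every named variable is set and non-empty
require_env() {
    for name in "$@"
    do
        # Indirect lookup of the variable whose name is held in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]
        then
            echo "missing environment variable: $name" >&2
            return 1
        fi
    done
    return 0
}
```

A wrapper would set and export the three variables, then call `require_env II_SYSTEM PATH LD_LIBRARY_PATH || exit 1` before running any Ingres utility.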
Privileges
 To use trace points like DM420
– ensure user has correct privs in “accessdb”.
 Ingres II
– edit config.dat to allow user to run infodb etc.
– Add the line:
ii.MACHINE_NAME.privileges.user.MONI_USER:SERVER_CONTROL,NET_ADMIN,MONITOR,TRUSTED
Checking for Thresholds
 Standard Shell logic..
if [ "$PERCENT" -ge 60 ]
then
    echo "Houston, we might have a problem"
fi
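If logstat fails (for example because Ingres is down), PERCENT may be empty or non-numeric and the test above reports a shell error instead of a result. A minimal sketch of a guarded comparison follows; is_whole_number and over_threshold are hypothetical helper names, and the three return codes are a design choice for this illustration.

```shell
# is_whole_number VALUE: true only for a non-empty string of digits
is_whole_number() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *)           return 0 ;;
    esac
}

# over_threshold VALUE LIMIT
#   returns 0 if VALUE >= LIMIT, 1 if VALUE < LIMIT, 2 if VALUE is unusable
over_threshold() {
    is_whole_number "$1" || return 2
    [ "$1" -ge "$2" ]
}
```

A return code of 2 can be treated as its own alert condition, since it usually means the value could not be collected at all.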
Minimising Message Traffic
 Advisable to reduce SNMP traffic.
 Only send messages if “state changed”
 Create Temporary “marker files” to indicate
whether message has been sent.
 After checking value..
– If a marker file exists - no action
– If marker file does not exist
• create one and send a message
GOODFILE=/tmp/log_good_message_sent
BADFILE=/tmp/log_bad_message_sent
if [ "$PERCENT" -ge 60 ]
then
    # Over threshold: send the bad message once only
    rm -f "$GOODFILE"
    if [ ! -f "$BADFILE" ]
    then
        touch "$BADFILE"
        echo "Send Bad Message to TNG"
    fi
else
    # Under threshold: send the good message once only
    rm -f "$BADFILE"
    if [ ! -f "$GOODFILE" ]
    then
        touch "$GOODFILE"
        echo "Send Good Message to TNG"
    fi
fi
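The same marker-file logic can be wrapped in a function so that it can be exercised against a scratch directory before it is pointed at /tmp. This is a sketch under assumptions: report_state is a hypothetical name, and it prints "send" or "skip" rather than raising a trap.

```shell
# report_state STATE DIR
#   STATE is "bad" or "good"; DIR holds the marker files.
#   Prints "send" when the state has changed, "skip" when it has not.
report_state() {
    state=$1
    dir=$2
    case "$state" in
        bad)  mine=$dir/log_bad_message_sent;  other=$dir/log_good_message_sent ;;
        good) mine=$dir/log_good_message_sent; other=$dir/log_bad_message_sent ;;
        *)    return 2 ;;
    esac
    # Entering this state cancels the marker for the opposite state
    rm -f "$other"
    if [ -f "$mine" ]
    then
        echo skip
    else
        touch "$mine"
        echo send
    fi
}
```

Calling it twice with the same state gives "send" then "skip"; a change of state gives "send" again.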
Sending Messages
 To send a message to Unicenter, raise an
SNMP trap using “awtrap”
# ${PERCENT} needs braces, otherwise the shell looks for a variable called PERCENT_full
/opt/tng/factory/bin/awtrap \
    ingsrv1 tngcons 162 public \
    1 6 3 1 \
    "E Ingres ingsrv1 log_file Ingres_log_file_is_${PERCENT}_full"
UNICENTER CONSOLE DISPLAYS..
%CATD_I_60_SNMPTRAP -c public
unknown * * 6 3 00:00:00 1 OID: 1.0
iso 0 VALUE: E Ingres ingsrv1 Ingres
Awtrap
 TNG-supplied binary file installed on “Agent
Machines”
 Used to forward SNMP messages to any
SNMP-compliant console.
 Syntax:
– awtrap {from_addr | local} dest_addr port community enterprise type [subtype] [oid] ["value"]
Awtrap Parameters
 from_addr
– “local” or use the IP name of the “originator”
 dest_addr
– IP name of TNG DSM machine (more later..)
 port
– UDP/IP port for SNMP traps = 162
 community
– “public”
 enterprise
– 1 indicates that it is an enterprise MIB entry
Awtrap Parameters (2)
 type
– 6 (major ID?)
 subtype
– 3 (minor ID - it works !!)
 oid
– 1 Should correspond to MIB definition.
 value
– Optional free-format text message. This is the
feature we exploit in our solution.
 But first, we need to set up Unicenter...
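Because every parameter except the text is fixed in this approach, the awtrap call can be wrapped once and reused. A minimal sketch, assuming the parameter values used elsewhere in this talk; send_trap and the AWTRAP override variable are hypothetical names introduced here so the call can be dry-run with echo.

```shell
# Path to awtrap as used in this talk; override AWTRAP for a dry run
AWTRAP=${AWTRAP:-/opt/tng/factory/bin/awtrap}

# send_trap FROM DEST TEXT
#   Raises a trap on port 162, community "public",
#   enterprise 1, type 6, subtype 3, oid 1, with TEXT as the value.
send_trap() {
    "$AWTRAP" "$1" "$2" 162 public 1 6 3 1 "$3"
}
```

With `AWTRAP=echo`, `send_trap ingsrv1 tngcons "N Ingres ingsrv1 log_file ok"` only prints the argument list, which is a convenient check on a machine with no TNG agent installed.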
TNG Configuration
TNG Architecture
 Agents
– Data collection software on monitored server
 DSM: Distributed State Machine
– Polls Agents
– Receives Incoming traps & forwards to
CORE/EM (some basic filtering)
 EM: Event Manager
– Scrolling Event Display and “Message Actions”.
 CORE: Common Object Repository
– Database of 2D / 3D map info.
TNG Architecture (diagram: three TNG consoles, EM, CORE DB, two DSMs, Servers 1 to 5)
TNG Setup - Discovery
 Command-Line routine run on Console.
 Needed for setting up icons on 3-D map
 Creates entries in Core Database for
machine discovered.
 Syntax:
– C:\> dscvrone -r {core lons000021} -h {node name} -c {Snmp community Name} -u {Core Username} -p {Password}
C:\> dscvrone -r "tngcons1" -h "ingsrv1" -c "public" -u "admin" -p "letmein"
TNG Setup - Filters
 Also need to change “Filters”
– ensure new machine is being polled, and
incoming messages will be received.
 Edit Filter File on DSM to include this
machine.
– %AGENTWORKSDIR%/services/config/aws_wvgate/gwipflt.dat
– similar layout to “hosts” file
SNMP Message Structure
 MIB = Management Information Base
 Hierarchical message structure
(Tree diagram. Node labels, top to bottom: computer; network, system, applications; disks, memory, controllers, databases; ingres; server, logfile, locking)
MIBs and OIDs
 MIB “Tree” structure names and numbers
are registered with Standards Body.
 OIDs
– unique object at the “leaf” of the MIB
– unique hierarchical name
• computer.applications.databases.ingres.log_file.percent
– has unique Numeric representation
• 1.2.6.5.7.9.1.8.5.4.2.6.3.2.1
– unique “meaning”
• e.g. “network router x is up”
How TNG Processes Messages
 Incoming Messages Received by the DSM
– If OID number is in the MIB
• automatic action based on MIB structure
– If OID is non-standard
• forward to the Event Manager as a non-standard trap
• this is where we come in
 Event Manager Actions
– we program EM to ignore the OID
– parse the incoming message text.
Formatting Messages
 EM processes space-delimited words.
 Pad all spaces with “_”
 Send the OID of “1” so it gets passed to EM
 Only the text is important !
# ${PERCENT} needs braces, otherwise the shell looks for a variable called PERCENT_full
/opt/tng/factory/bin/awtrap \
    ingsrv1 tngcons 162 public \
    1 6 3 1 \
    "E Ingres ingsrv1 log_file Ingres_log_file_is_${PERCENT}_full"
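The padding rule can be applied mechanically, so that free text never introduces extra words into the message. A minimal sketch; underscore and build_message are hypothetical helper names, and the five-field layout follows the example message in this talk.

```shell
# underscore TEXT: replace each run of whitespace with a single "_"
underscore() {
    printf '%s\n' "$1" | sed 's/[[:space:]][[:space:]]*/_/g'
}

# build_message SEVERITY APPLICATION HOST OBJECT FREE_TEXT
#   Produces five space-delimited words; only FREE_TEXT is padded.
build_message() {
    printf '%s %s %s %s %s\n' "$1" "$2" "$3" "$4" "$(underscore "$5")"
}
```

For example, `build_message E Ingres ingsrv1 log_file "Ingres log file is 75 full"` yields `E Ingres ingsrv1 log_file Ingres_log_file_is_75_full`.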
Parsing Messages
 “E Ingres ingsrv1 log_file Ingres_log_file_is_${PERCENT}_full”
– E/N = Error / Normal
– Update CORE: Create Folder called “Ingres” in
Unispace for this machine
– Create Icon called “ingsrv1_log_file” in this
Folder
– Change Icon to RED
– “rest_of_text_as_one_word” = supplementary
message
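The field breakdown above can be mimicked in shell to check a message before it is sent. This is only an illustration of the splitting rule, not TNG Message Action syntax: parse_message is a hypothetical name, and GREEN for a Normal message is an assumption (the talk only states RED for an Error).

```shell
# parse_message TEXT: split the trap text into the fields described above
parse_message() {
    set -f          # no filename expansion while the text is word-split
    set -- $1       # deliberate unquoted expansion: one word per field
    set +f
    severity=$1
    folder=$2
    host=$3
    object=$4
    detail=$5
    icon="${host}_${object}"
    case "$severity" in
        E) colour=RED ;;
        N) colour=GREEN ;;     # assumed colour for Normal
        *) colour=UNKNOWN ;;
    esac
    echo "folder=$folder icon=$icon colour=$colour detail=$detail"
}
```

Passing the example message with a value of 75 prints `folder=Ingres icon=ingsrv1_log_file colour=RED detail=Ingres_log_file_is_75_full`.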
Creating Event Manager
Message Actions
 Console - DSM - Events - Messages
– (Not Message Actions !)
Creating Event Manager
Message Actions
 Enter Filter Pattern for the Message
– matched against incoming message, using “*”
wildcards
– Save.
 Choose Message Action
– Not exactly free-format text language !
– Card-based action line-by-line
%CATD_I_60_SNMPTRAP -c public unknown
* * 6 3 00:00:00 1 OID: 1.0 iso 0
VALUE: E Ingres * *
Processing Incoming Messages
 Update the Application Icon (“machine-category”) to RED
 Send a Replacement Console Message
Seq  Action     Text
10   DISCARD
20   UPDATEMAP  -R &{$CAI_TNGREPOSITORY} -C Application -N &18-&19 -I sabatch Critical > NUL
30   GOTO       100
40   SENDKEEP   Critical error received from &18 - brief message: &19 - extended status: &20
50   EXIT
### IF RETURN CODE = 0
Creating Unispace Folder
 Create new Folder (Ingres-”machine”) if
unable to update it.
Seq  Action     Text
100  UPDATEMAP  -R &{$CAI_TNGREPOSITORY} -C BusinessView -N &17-&18 -I sabatch Critical > NUL
110  GOTO       200    ### IF RETURN CODE != 0
120  COMMAND    creaobj -c"BusinessView" -n&17-&18 -l&17-&18 -r&{$CAI_TNGREPOSITORY} -u "sa" -p "pwd"
130  COMMAND    creaincl -c"BusinessView" -n&17-&18 -l"Unispace" -a&18-Unispace -m -r&{$CAI_TNGREPOSITORY} -u "sa" -p "pwd"
Creating Icon
 Create Icon (“machine-category”), move it
then update it.
Seq  Action     Text
200  COMMAND    creaobj -c"Application" -n&18-&19 -l&19 -r&{$CAI_TNGREPOSITORY} -u "sa" -p "pwd"
210  COMMAND    creaincl -c"Application" -n&18-&19 -l"BusinessView" -a&17-&18 -m -r&{$CAI_TNGREPOSITORY} -u "sa" -p "pwd"
220  UPDATEMAP  -R &{$CAI_TNGREPOSITORY} -C Application -N &18-&19 -I sabatch Critical > NUL
230  GOTO       40
Putting it all together
 Schedule tasks using Crontab
 logstat | grep to extract figures
 check for “badfile”/ “goodfile” to determine
when to send SNMP Traps
 Send OID “1” Trap with formatted text
string.
 Script TNG DSM to parse incoming text.
 Create icons in distinct folder in Unispace
as required
 Change icon colour as required.
Disadvantages
 Alerting Only
– no trend analysis, graphing or supporting data.
 Very limited functionality
– Error trapping for Ingres down ?
 Asynchronous Traps only
– No communication from DSM
 If data is collected just from shell scripts
– no “keep-alive” etc.
– collection scheduling is a nightmare.
• Crontab !