Geekcorp Software


[ SNMP Monitor Ex ]

SNMP critical values monitoring.
Notification by cellular phone, email or syslog.

version 1.0.1
Sveinar Rasmussen, 14th of August 1997
Copyright (c) 1997 - University of Tromsoe, Norway.

      Abstract. This document describes a Tcl extension for monitoring static variables in routers and other agents, using SNMP to access their MIBs. SNMP (Simple Network Management Protocol) uses polling or traps for communication between the agent and the monitoring equipment. MIBs (management information bases) contain a wide range of variables that are worth monitoring.

1.0 Introduction

    Certain states inside a router are not expected to change. These states are reflected as variables in the MIB. For example, voltage, current and temperature readings should remain static. If they change for some reason, trouble may be on its way, such as a router crashing and rendering the local network unusable for a period of time.

    To shorten, and possibly avoid, network outages caused by a router crash, we need monitoring. "SNMP Monitor Ex" is a Scotty Tcl extension to the Tkined module. It adds strict monitoring of static variables. Whenever a change occurs, notification messages are sent to the network administrator, who can judge and possibly solve the problem.

2.0 Usage

    The SNMP Monitor Ex script (named snmp_monitor_ex.tcl) is extended from the original script snmp_monitor.tcl shipped with the Scotty v2.1.5 distribution. You can either add this script to your manager.tcl file for the internal menu system in Tkined, or simply load the script directly using "Start Script" in the Tools menu of Tkined. As yet another alternative, you can overwrite your snmp_monitor.tcl file with the snmp_monitor_ex.tcl file (not recommended, since later upgrades of the Scotty environment may replace the file).

    Once started, you will notice the similarity to the original script. In fact, there are only two new entries for the monitor at this point. Click on one or more of the discovered routers and activate "Monitor Strict".

    Step 1)

      Enter all the variables you want to monitor, separated by spaces. Click on the "Start monitoring!" button when done. Use the "clear" button to clear the current variable string.
 
Step 2)
    If the variable exists and this is the first time you launch the Strict Monitor, you will be presented with a settings window. This is where you specify the settings for the monitor job about to be started. These values are saved as default values for later use. You can change them later through the "Monitor Strict Jobs"->"Modify" menu selection. Description of the fields in the settings window:

      Send warnings to syslog. If this is true, the monitor will put the warning messages in the system's log file.

      Use SMS to send warnings. If this is true, the monitor will report changes to your cellular phone. The SMS message contains the name of the variable, the old value, the new value and the IP address of the machine where the change occurred.

      SMS Cellular phone number. Specify the phone number to which warning messages are sent.

      Use EMAIL to send warnings. If this is true, the monitor will report changes by email.

      Email address. Specify the email address to which messages are sent.

      Delay between each SMS/Email (minutes). The monitor does not send an SMS or email for every single change to a monitored variable. Here you specify the minimum number of minutes between warnings sent through the external messaging systems (cellular phone or email). The default value is one hour (60 minutes).

    The settings you have specified in this window are stored in memory to save you from retyping them.

    If you would like to change the delay between each variable readout, please use the standard "Set Monitor Parameter" in the menu. The strict monitoring facility uses this default value as polling interval.
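
    The SMS warning described above carries the variable name, the old value, the new value and the IP address, and according to the ChangeLog it has to fit in the 160 bytes of one SMS message. The following is a minimal sketch of such a formatter, written in Python for illustration only; it is not taken from snmp_monitor_ex.tcl, and the function name, the field order and the date layout are assumptions.

```python
from datetime import datetime

SMS_LIMIT = 160  # bytes available in one SMS message, per the ChangeLog


def format_warning(name, old, new, ip, when):
    """Build one short warning line (hypothetical layout, not the script's)."""
    stamp = when.strftime("%d.%m %H:%M")  # short date format, assumed
    text = f"{stamp} {ip} {name}: {old} -> {new}"
    # Cut on the byte level so the text fits into a single SMS message.
    return text.encode("ascii", "replace")[:SMS_LIMIT].decode("ascii")


msg = format_warning("ifMtu.1", "1500", "576", "192.0.2.1",
                     datetime(1997, 8, 14, 12, 30))
print(msg)
```

    Overlong variable names or values are simply cut off at the limit in this sketch; the real script may shorten its messages differently.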
     

Step 3)
    Hit "Ok!" to fire up the monitor job with the specified properties. If changes occur, the script will print these events in an "SNMP - Monitor Report" window. If you are annoyed with it and just want stuff to be sent to your cellular phone or mail box, please feel free to close this window.
     
Step 4)
    During the monitoring of your variable(s), you can modify the properties of each monitor job. Clicking your way in the main menu "SNMP Monitor Ex", you'll find the "Monitor Strict jobs" menu item. As you are given a list of current strict jobs, select the job you want to bring to attention and gently press "Modify".

    Now, you're presented with a window similar to the one initially seen during step 2). One difference is that this new window also includes a "Kill job" button. The reader might believe that one can kill the monitor jobs by using the standard "Modify Monitor Job"->"kill job" provided. Technically, that is true: the strict job would disappear, but the internal variables related to each strict monitoring job would remain allocated in memory. Thus, to avoid wasting memory, it is preferred that you use "Monitor Strict jobs" for deleting jobs instead of the standard job modification service.

    As you are finished monitoring the values you have selected, the monitor job can be deleted from the system. Click on "Modify monitor job" in the SNMP Monitor Ex menu. As you are given a list of current jobs, the jobs marked with StrictEvent are the ones you have created using the service provided by the Monitor Strict functionality. Select "Modify" and "Kill job" to end the monitoring of that specific variable on that specific machine.

2.0 Features.

    This section explains some of the features included in "SNMP Monitor Ex" as opposed to the original Tkined extension.
     
      • supports all types of variables found in the MIB. E.g. octet strings aren't likely to change as often as counters - if they do, you certainly want to know about it.
      • uses a configurable default value set instead of popping up a requester every time you decide to initiate a monitoring job.
      • supports multiselected nodes and multiple variables in requesters. A few variables contain multiple values (e.g. interfaces.ifTable.ifEntry.ifMtu). Support for this is added.
      • since this monitoring tool is meant for critical static supervision, there are no annoying chart diagrams for each monitoring job as found in the original. The system will do the monitoring and warn the user if anything happens.
      • warnings are sent off using the GSM SMS cellular phone service, email, syslog and of course the screen.
      • if you save a tkined map, the monitor jobs you've selected are saved. Jobs are restarted the next time you load the map.

3.0 Implementation issues.

    Every new procedure I've introduced in the snmp_monitor_ex.tcl script to provide this strict monitoring service is described in this section.

    Monitor Strict

      As the user releases the left mouse button over the "Monitor Strict" menu item, this procedure is the first one to be called. It asks the user to enter all the variables to be monitored in the upcoming jobs. For every variable entered in the requester prompted to the user, the procedure calls "MonitorStrict" to handle each monitoring job.
       
    MonitorStrict
      In order to fire up a monitor job, we have to ensure that the variable exists and has an appropriate syntax. This procedure opens an SNMP connection to the specified node, checks the syntax, and creates a unique identification value that is used later as an index into the global array jArray, where the internal variables are kept.
      Once finished, the SNMP connection is closed and "startStrict" is run.
       
    startStrict
      This procedure opens an SNMP connection to the specified node for each sub variable dug up in MonitorStrict. Upon a successful connection, the value, time, name and description are read from the node. These values are placed in the previously mentioned, global array jArray. It's a two dimensional array.
      StrictPrefs is then called in order to get the default values for the SMS and email notification service. The arguments to StrictPrefs are the ID and a TRUE boolean flag. The latter tells the StrictPrefs not to open a window to prompt the user unless it really has to.
      Finally, the job is started. At the end of each default interval, StrictEvent is run. The ID for the job is stored in the array as well. However, since a job can handle many variables in multivariable situations, only the first array entry gets the job ID.
      The job properties are also stored in the tkined system. Issuing a "save" function call saves everything so that it can be restored when a saved tkined map is loaded. Old jobs are then fired up again.
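
      The two-dimensional layout of jArray described above can be pictured as a table indexed by a job ID and a field name. The sketch below uses a Python dictionary keyed by (ID, field) pairs purely as an illustration; the field names "value", "time", "name" and "descr" and both helper functions are assumptions, not the identifiers used in snmp_monitor_ex.tcl.

```python
import itertools
import time

j_array = {}                    # stands in for the global jArray
_next_id = itertools.count(1)   # source of unique IDs (assumed scheme)


def new_job_id():
    """Return a unique ID used as the first index into j_array."""
    return next(_next_id)


def store_initial(job_id, value, name, descr, now=None):
    """Record the first readout of a variable under (ID, field) keys."""
    j_array[(job_id, "value")] = value
    j_array[(job_id, "time")] = time.time() if now is None else now
    j_array[(job_id, "name")] = name
    j_array[(job_id, "descr")] = descr


jid = new_job_id()
store_initial(jid, "1500", "ifMtu.1", "MTU of interface 1", now=0.0)
print(j_array[(jid, "value")])
```

      Keeping every field under the same ID makes it easy to find, and later delete, everything that belongs to one monitored variable.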
       
    StrictEvent
      This is the interrupt routine for each checking interval. It launches StrictShow to handle the action for every variable. A procedure like this also has to reconfigure the current job in order to receive further job interrupts. The job is reconfigured with the configure command found in the job function.
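
      The pattern described here, a handler that runs its check and then re-arms itself for the next interval, can be sketched with Python's standard sched module. This is an illustration of the idea only and does not use the Scotty job command; the simulated clock and the "remaining" counter exist just so the example terminates immediately, and all names are hypothetical.

```python
import sched


def strict_event(scheduler, interval, check, remaining):
    """Run one check, then re-arm so the handler fires again after `interval`."""
    check()
    if remaining > 1:
        scheduler.enter(interval, 1, strict_event,
                        (scheduler, interval, check, remaining - 1))


# Simulated clock: "sleeping" just advances the time, so nothing blocks.
clock = [0.0]


def now():
    return clock[0]


def sleep(seconds):
    clock[0] += seconds


fired = []  # times at which the check ran
s = sched.scheduler(now, sleep)
s.enter(60, 1, strict_event, (s, 60, lambda: fired.append(now()), 3))
s.run()
print(fired)
```

      If the handler did not re-arm itself, the check would run only once, which is why the procedure has to reconfigure its own job.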
       
    StrictShow
      All the action is found in this function. It reads a new value from the SNMP connection previously opened, and compares the newest value with the old one stored in the global array jArray. Network nodes might be temporarily down, and thus reading from that particular node would be impossible. A warning message is written to the screen if this should be the case.
      If the value can be read and has changed, a descriptive message is printed to the screen and / or written as an entry in the syslog. If more than the specified number of minutes has passed since the last message was sent by SMS or email, the warning is sent to the respective receivers.
      The newly read value from the node is stored in jArray, and if an email or SMS message has been sent off, the time stamp for this happening is recorded for later use, as well.
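
      The decision logic described above (skip unreachable nodes, compare with the stored value, and hold back SMS and email until the delay has passed) can be summarised in a short Python sketch. It is an illustration under stated assumptions, not the code of StrictShow: the function name, the dictionary fields and the channel names are hypothetical, and all four channels are assumed to be enabled.

```python
def strict_show(state, new_value, now, delay_minutes=60):
    """Compare a fresh readout with the stored one.

    Returns the list of channels a warning goes to. `new_value` is None
    when the node could not be reached; `now` is a time in seconds.
    """
    if new_value is None:
        return ["screen"]             # node down: warn on screen only
    if new_value == state["value"]:
        return []                     # unchanged: nothing to report
    channels = ["screen", "syslog"]
    last = state.get("last_sent")
    if last is None or now - last >= delay_minutes * 60:
        channels += ["sms", "email"]
        state["last_sent"] = now      # remember when external messages went out
    state["value"] = new_value        # the new value becomes the reference
    return channels


state = {"value": "1500"}
print(strict_show(state, "576", now=0))      # first change: all channels
print(strict_show(state, "1500", now=600))   # within the delay: no SMS/email
```

      Storing the time stamp only when an SMS or email actually goes out means the delay is measured from the last external message, not from the last change.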
       
    StrictPrefs
      The preferences procedure has two parameters: an ID and a default flag. The ID points to the place in the global jArray where updates are to occur. If the default flag is "true", the function will not open a window asking the user for properties unless it has to. It has to ask the user for properties for the very first job, but that's it - later jobs use the default values specified at that point.
      However, if the default flag is "false", the user is forced to enter new default values in a window.
      Everything is stored in the global array jArray, as usual.
      If a job is killed, the entries for that particular job are deleted from jArray and the job is removed from the scheduling system.
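
      The behaviour of the default flag can be illustrated with a small Python sketch. It is not the StrictPrefs procedure itself: the names are hypothetical, and the settings window is replaced by a callable so that the example can run without a user interface.

```python
defaults = {}   # saved settings; empty until the window has been filled in once


def strict_prefs(job_settings, use_defaults, ask_user):
    """Fill `job_settings`, prompting only when required.

    `ask_user` stands in for the settings window: it receives the current
    defaults and returns the chosen settings. Returns True if it was called.
    """
    if use_defaults and defaults:
        job_settings.update(defaults)   # reuse the saved values silently
        return False
    chosen = ask_user(dict(defaults))
    defaults.clear()
    defaults.update(chosen)             # new values become the defaults
    job_settings.update(chosen)
    return True


def fake_window(current):
    """Pretend the user confirmed these values (illustrative data only)."""
    return {"sms": True, "delay": 60}


first, second = {}, {}
print(strict_prefs(first, True, fake_window))    # first job: has to ask
print(strict_prefs(second, True, fake_window))   # later job: defaults reused
```

      Calling the function with the flag set to false always opens the "window", which matches the way the modify menu forces new values.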
       
    Monitor Strict jobs
      The user is able to change properties and delete jobs started by the strict monitor. This procedure creates a list of the strict jobs running. When the user chooses to modify any of these jobs, the StrictPrefs procedure is launched with the default flag parameter set to "false". The preferences window is thereby forced to open, and the new values are stored and act as the new default values for any jobs started later.

4.0 Conclusion.

    Using the Scotty environment with Tcl extensions, I have extended the SNMP monitor to handle monitoring of any variable type in the MIBs. Notifications are sent to the screen, cellular phone, email or syslog. The Tcl script is operational. Although the project is in its early phases of development, the program can be put into operation and serve as a decent quality of service network monitoring tool.
    The notification service will give the network administrator an early warning system. Hopefully, this will shorten the duration of unreachable network situations.

5.0 ChangeLog

v0.0.0 - v0.0.2:
  • Initial internal versions.
v0.0.3:
  • Improved monitoring engine (supports DisplayString etc.).
  • Added clear button.
  • Forced popup of warning window if it was closed by user.
v0.5.1:
  • Added support for syslog entry.
  • It's now possible to remove the SMS / email delaying message system.
  • Added support for saving and restoring running jobs.
v0.5.2:
  • Added support to handle unconfigured sendmail, i.e. the program catches sendmail warnings.
  • Now handles nodes that are temporarily down. Prints out a warning to the screen - doesn't report this as an error on the SMS, syslog or email. This actually happens quite often (transient failures). Avoids unnecessary warnings.
  • Added date and time in the warning strings. Exception is the syslog entry, which already has a time stamp. The format is a short string to get below the 160 bytes available for the SMS message.
  • Removed the persistent popup warning window method imposed by v0.0.3. Now uses a standard window. Remember to close the window using the menus. If you close it by the "close window" button, it will forever disappear.
  • Bugfix. The configuration window had problems with updating the timestamp for emails, syslog and SMS.
v1.0.1:
  • Official release.
  • General stability increased. Program crashed if one set two identical jobs on the same machine and tried to delete the jobs.

6.0 Source code.

    This software is the property of the University of Tromsoe. Luckily, the software has been made available for download under the GNU license. And best of all, it's free!

    Go to the download section to start downloading.



Sveinar Rasmussen (web)