SNMP-Based Monitoring

Early versions of the Solaris fault manager reported faults to the system log and console(s), and provided a wealth of status information through fmadm(1M). But these reporting mechanisms leave much to be desired: syslog messages must be parsed, and a busy central log host can easily lose important messages in the noise. Worse still, a privileged user must log into the affected system and run administrative commands to get any needed information that the message does not contain.

SNMP is a natural choice for extending the reach of the fault manager's voice; it is widely used to facilitate centralized monitoring of events throughout and even across administrative domains. The basic model is simple and extensible: information can be pushed from any device to one or more network management stations (NMSs), or pulled by an administrator or automated utility from a particular device of interest. Managed devices (in this case, Solaris systems) signal events using traps (called notifications in SNMPv2), which carry a limited amount of information to designated NMSs. Managed devices also provide access to a management information base (MIB) on demand. Generally, the MIB offers a much greater breadth and depth of information than is transmitted with a trap or notification. If desired, an NMS can be configured to retrieve additional data from the MIB upon receipt of a trap.

snmp-trapgen: an SNMP Plugin for fmd

The trap or notification generator component is snmp-trapgen. This is a very simple fault manager plugin, similar to the one that logs fault information to the system log and console. Instead of writing formatted text to a log device, however, this plugin generates SNMPv1 traps and/or SNMPv2 notifications, one for each destination configured in the system-wide snmpd.conf(4).

No additional configuration is required; if you have already configured a system to send traps to one or more NMSs, you do not need to do anything else to be notified upon fault diagnosis. If not, you will want to add v1 or v2 trap destinations to /etc/sma/snmp/snmpd.conf. The hosts at the names or addresses you use must be configured to receive and act upon SNMP traps or notifications. If you do not have an NMS on your network, you can use the snmptrapd(1M) server included with Solaris.
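As a minimal sketch of what such trap destinations might look like, the fragment below uses the standard Net-SNMP trapsink and trap2sink directives. The host names and the community string are placeholders assumed for illustration; substitute the values used at your site.

```
# /etc/sma/snmp/snmpd.conf (fragment)
# Syntax: trapsink|trap2sink HOST [COMMUNITY [PORT]]
# nms1.example.com, nms2.example.com and "public" are hypothetical values.
trapsink   nms1.example.com public    # SNMPv1 traps
trap2sink  nms2.example.com public    # SNMPv2 notifications
```

The master agent reads snmpd.conf at startup, so it will likely need to be restarted before new destinations take effect.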

A fault diagnosis trap (sunFmProblemTrap) includes a limited subset of the information contained in the syslog message associated with the fault. Specifically, it includes the diagnosis's universally unique identifier (UUID), diagnostic code, and reference URL. The object identifiers (OIDs) for these data are defined by the fault management MIB, SUN-FM-MIB, installed in /etc/sma/snmp/mibs/. The same information is delivered to both SNMPv1 and SNMPv2 trap sinks. At present, this is the only trap defined by the fault management MIB, but others may be added in the future. Here is an example of an SNMPv2 notification as decoded by snmptrapd(1M):

DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (2266748911) 262 days, 8:31:29.11

SNMPv2-MIB::snmpTrapOID.0 = OID: SUN-FM-MIB::sunFmProblemTrap

SUN-FM-MIB::sunFmProblemUUID."a58aa105-4fab-6e16-8557-ab7687113de7" = STRING: "a58aa105-4fab-6e16-8557-ab7687113de7"

SUN-FM-MIB::sunFmProblemCode."a58aa105-4fab-6e16-8557-ab7687113de7" = STRING: SUN4U-8000-KA

SUN-FM-MIB::sunFmProblemURL."a58aa105-4fab-6e16-8557-ab7687113de7" = STRING: http://sun.com/msg/SUN4U-8000-KA

The diagnostic code and URL can be used to find knowledge base articles describing the fault and suggested corrective action. The diagnosis UUID can be used to get further detail from fmdump(1M), or from the MIB, as seen in the next section.
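The following is a minimal Python sketch, not part of the fault manager or of snmptrapd, showing how a script might pull the three fields out of decoded trap text like the example above. The function names are hypothetical, and the line format is assumed to match the snmptrapd output shown; other formatting options would need a different pattern.

```python
import re

# Matches lines such as:
#   SUN-FM-MIB::sunFmProblemCode."<uuid>" = STRING: SUN4U-8000-KA
VARBIND = re.compile(
    r'^SUN-FM-MIB::sunFmProblem(?P<field>UUID|Code|URL)\."(?P<index>[^"]+)"'
    r'\s*=\s*STRING:\s*(?P<value>.*)$'
)


def parse_problem_trap(text):
    """Return a dict with the uuid, code, and url found in decoded trap text."""
    problem = {}
    for line in text.splitlines():
        match = VARBIND.match(line.strip())
        if match is None:
            continue  # skip sysUpTime, snmpTrapOID, and blank lines
        # Some string values are printed in double quotes; strip them if present.
        value = match.group("value").strip().strip('"')
        problem[match.group("field").lower()] = value
    return problem


def fmdump_command(uuid):
    """Build an fmdump(1M) argument list for one diagnosis (-v verbose, -u UUID)."""
    return ["fmdump", "-v", "-u", uuid]
```

A caller could pass the result of fmdump_command() to subprocess.run() on the affected system, or simply log the parsed URL for the operator.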

libfmd_snmp: a MIB Plugin for the System Management Agent

Knowing that a fault has been diagnosed is important, but the information delivered with the trap or notification may not be enough to give an administrator a complete understanding of the problem. The fault management MIB defines a wealth of detail, which libfmd_snmp makes available via the System Management Agent (SMA). In addition to fault diagnosis detail, this MIB also offers information about faulty components and the configuration of the fault manager itself, similar to that offered by fmadm(1M).

Enabling the plugin requires configuring the master SNMP agent on each server you wish to query. Adding the architecture-dependent line

dlmod sunFM /usr/lib/fm/sparcv9/libfmd_snmp.so.1

to /etc/sma/snmp/snmpd.conf will cause the MIB plugin to be automatically loaded and initialized the next time the master agent is started.

No further configuration is necessary, although the usual snmpd.conf(4) directives will allow you to restrict access to the MIB, which may be important to you since some of the information it provides is ordinarily restricted to privileged users.

Details about devices the fault manager believes to be in degraded or faulted states are available using the sunFmResourceTable; "walking" this table provides a remote answer to the common question, "What is broken on that machine?" For this, you use the snmpwalk(1M) utility.
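As a rough sketch of how such a walk might be scripted, the Python below builds and runs an snmpwalk command line using standard Net-SNMP options (-v, -c, -m). The function names, the default community string "public", and the choice of SNMPv2c are assumptions made for illustration; the output is returned as-is rather than interpreted.

```python
import subprocess


def resource_table_argv(host, community="public"):
    """Build the snmpwalk argument list for SUN-FM-MIB::sunFmResourceTable."""
    return [
        "snmpwalk",
        "-v", "2c",           # SNMP version (assumed; use what your agent accepts)
        "-c", community,      # community string (assumed placeholder)
        "-m", "SUN-FM-MIB",   # load the fault management MIB for name lookup
        host,
        "SUN-FM-MIB::sunFmResourceTable",
    ]


def walk_resource_table(host, community="public"):
    """Run snmpwalk and return its output lines; requires the Net-SNMP tools."""
    result = subprocess.run(
        resource_table_argv(host, community),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if snmpwalk fails
    )
    return result.stdout.splitlines()
```

The same argument list can be typed directly at a shell prompt; wrapping it in a function only makes it easier to loop over many hosts.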
