Introduction To SNMP and Network Management

(Excerpted from Essential SNMP, by Douglas Mauro and Kevin Schmidt)
 

Chapter 1. What Is SNMP?

 
In today's complex network of routers, switches, and servers, it can seem like a daunting task to manage all the devices on your network and make sure they're not only up and running but performing optimally. This is where the Simple Network Management Protocol (SNMP) can help. SNMP was introduced in 1988 to meet the growing need for a standard for managing Internet Protocol (IP) devices. SNMP provides its users with a "simple" set of operations that allows these devices to be managed remotely.

1.1. Network Management and Monitoring

The core of SNMP is a simple set of operations (and the information these operations gather) that gives administrators the ability to change the state of some SNMP-based device. For example, you can use SNMP to shut down an interface on your router or check the speed at which your Ethernet interface is operating. SNMP can even monitor the temperature on your switch and warn you when it is too high.

SNMP usually is associated with managing routers, but it's important to understand that it can be used to manage many types of devices. While SNMP's predecessor, the Simple Gateway Management Protocol (SGMP), was developed to manage Internet routers, SNMP can be used to manage Unix systems, Windows systems, printers, modem racks, power supplies, and more. Any device running software that allows the retrieval of SNMP information can be managed. This includes not only physical devices but also software, such as web servers and databases.

Another aspect of network management is network monitoring; that is, monitoring an entire network as opposed to individual routers, hosts, and other devices. Remote Network Monitoring (RMON) was developed to help us understand how the network itself is functioning, as well as how individual devices on the network are affecting the network as a whole. It can be used to monitor not only LAN traffic, but WAN interfaces as well. We discuss RMON in more detail later in this chapter and in Chapter 2, "A Closer Look at SNMP".

Before going any further, let's look at a before-and-after scenario that shows how SNMP can make a difference in an organization.

1.1.1. Before and After SNMP

Let's say that you have a network of 100 machines running various operating systems. Several machines are file servers, a few others are print servers, another is running software that verifies credit card transactions (presumably from a web-based ordering system), and the rest are personal workstations. In addition, there are various switches and routers that help keep the actual network going. A T1 circuit connects the company to the global Internet, and there is a private connection to the credit card verification system.

What happens when one of the file servers crashes? If it happens in the middle of the workweek, it is likely that the people using it will notice and the appropriate administrator will be called to fix it. But what if it happens after everyone has gone home, including the administrators, or over the weekend?

What if the private connection to the credit card verification system goes down at 10 p.m. on Friday and isn't restored until Monday morning? If the problem was faulty hardware and could have been fixed by swapping out a card or replacing a router, thousands of dollars in web site sales could have been lost for no reason. Likewise, if the T1 circuit to the Internet goes down, it could adversely affect the amount of sales generated by individuals accessing your web site and placing orders.

These are obviously serious problems -- problems that can conceivably affect the survival of your business. This is where SNMP comes in. Instead of waiting for someone to notice that something is wrong and locate the person responsible for fixing the problem (which may not happen until Monday morning, if the problem occurs over the weekend), SNMP allows you to monitor your network constantly, even when you're not there. For example, it will notice if the number of bad packets coming through one of your router's interfaces is gradually increasing, suggesting that the router is about to fail. You can arrange to be notified automatically when failure seems imminent, so you can fix the router before it actually breaks. You can also arrange to be notified if the credit card processor appears to get hung -- you may even be able to fix it from home. And if nothing goes wrong, you can return to the office on Monday morning knowing there won't be any surprises.

There might not be quite as much glory in fixing problems before they occur, but you and your management will rest more easily. We can't tell you how to translate that into a higher salary -- sometimes it's better to be the guy who rushes in and fixes things in the middle of a crisis, rather than the guy who makes sure the crisis never occurs. But SNMP does enable you to keep logs that prove your network is running reliably and show when you took action to avert an impending crisis.

1.1.2. Human Considerations

Implementing a network-management system can mean adding more staff to handle the increased load of maintaining and operating such an environment. At the same time, adding this type of monitoring should, in most cases, reduce the workload of your system-administration staff. You will need:
  • Staff to maintain the management station. This includes ensuring the management station is configured to properly handle events from SNMP-capable devices.
  • Staff to maintain the SNMP-capable devices. This includes making sure that workstations and servers can communicate with the management station.
  • Staff to watch and fix the network. This group is usually called a Network Operations Center (NOC) and is staffed 24/7. An alternative to 24/7 staffing is to implement rotating pager duty, where one person is on call at all times, but not necessarily present in the office. Pager duty works only in smaller networked environments, in which a network outage can wait for someone to drive into the office and fix the problem.
There is no way to predetermine how many staff members you will need to maintain a management system. The size of the staff will vary depending on the size and complexity of the network you're managing. Some of the larger Internet backbone providers have 70 or more people in their NOCs, while others have only one.

1.2. RFCs and SNMP Versions

The Internet Engineering Task Force (IETF) is responsible for defining the standard protocols that govern Internet traffic, including SNMP. The IETF publishes Requests for Comments (RFCs), which are specifications for many protocols that exist in the IP realm. Documents enter the standards track first as proposed standards, then move to draft status. When a final draft is eventually approved, the RFC is given standard status -- although there are fewer completely approved standards than you might think. Two other standards-track designations, historical and experimental, define (respectively) a document that has been replaced by a newer RFC and a document that is not yet ready to become a standard. The following list includes all the current SNMP versions and the IETF status of each (see Appendix D, "SNMP RFCs" for a full list of the SNMP RFCs):

 

The official site for RFCs is http://www.ietf.org/rfc.html. One of the biggest problems with RFCs, however, is finding the one you want. It is a little easier to navigate the RFC index at Ohio State University (http://www.cis.ohio-state.edu/services/rfc/index.html).

1.3. Managers and Agents

In the previous sections we've vaguely referred to SNMP-capable devices and network-management stations. Now it's time to describe what these two things really are. In the world of SNMP there are two kinds of entities: managers and agents. A manager is a server running some kind of software system that can handle management tasks for a network. Managers are often referred to as Network Management Stations (NMSs).[1] An NMS is responsible for polling and receiving traps from agents in the network. A poll, in the context of network management, is the act of querying an agent (router, switch, Unix server, etc.) for some piece of information. This information can later be used to determine if some sort of catastrophic event has occurred. A trap is a way for the agent to tell the NMS that something has happened. Traps are sent asynchronously, not in response to queries from the NMS. The NMS is further responsible for performing an action[2] based upon the information it receives from the agent. For example, when your T1 circuit to the Internet goes down, your router can send a trap to your NMS. In turn, the NMS can take some action, perhaps paging you to let you know that something has happened.

 

The second entity, the agent, is a piece of software that runs on the network devices you are managing. It can be a separate program (a daemon, in Unix language), or it can be incorporated into the operating system (for example, Cisco's IOS on a router, or the low-level operating system that controls a UPS). Today, most IP devices come with some kind of SNMP agent built in. The fact that vendors are willing to implement agents in many of their products makes the system administrator's or network manager's job easier. The agent provides management information to the NMS by keeping track of various operational aspects of the device. For example, the agent on a router is able to keep track of the state of each of its interfaces: which ones are up, which ones are down, etc. The NMS can query the status of each interface on a router, and take appropriate action if any of them are down. When the agent notices that something bad has happened, it can send a trap to the NMS. This trap originates from the agent and is sent to the NMS, where it is handled appropriately. Some devices will send a corresponding "all clear" trap when there is a transition from a bad state to a good state. This can be useful in determining when a problem situation has been resolved. Figure 1-1 shows the relationship between the NMS and an agent.

 

Figure 1-1. Relationship between an NMS and an agent

It's important to keep in mind that polls and traps can happen at the same time. There are no restrictions on when the NMS can query the agent or when the agent can send a trap.
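
The difference between a poll and a trap can be illustrated with a toy model. The sketch below does not speak SNMP at all; the Agent and Manager classes, their method names, and the interface name "serial0" are all invented for illustration. It only shows who initiates each exchange: the manager initiates a poll, while the agent initiates a trap.

```python
class Agent:
    """Toy stand-in for an SNMP agent that tracks interface states."""

    def __init__(self, trap_sink):
        self.if_status = {}
        # Callable the agent uses to notify the NMS (hypothetical hook).
        self._trap_sink = trap_sink

    def get(self, if_name):
        """Answer a poll: return the current status, or None if unknown."""
        return self.if_status.get(if_name)

    def set_status(self, if_name, status):
        """Record a state change; send an unsolicited trap if it changed."""
        old = self.if_status.get(if_name)
        self.if_status[if_name] = status
        if old is not None and old != status:
            self._trap_sink((if_name, old, status))


class Manager:
    """Toy stand-in for an NMS that can poll agents and collect traps."""

    def __init__(self):
        self.traps = []

    def receive_trap(self, trap):
        # Called by the agent at any time, not in response to a poll.
        self.traps.append(trap)

    def poll(self, agent, if_name):
        # Initiated by the manager whenever it chooses.
        return agent.get(if_name)


nms = Manager()
router = Agent(trap_sink=nms.receive_trap)
router.set_status("serial0", "up")      # first observation, no trap
print(nms.poll(router, "serial0"))      # manager-initiated poll
router.set_status("serial0", "down")    # agent-initiated trap
router.set_status("serial0", "up")      # corresponding "all clear" trap
print(nms.traps)
```

A real NMS would receive traps over the network and act on them (paging someone, for example); here they are simply appended to a list.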

1.4. The Structure of Management Information and MIBs

The Structure of Management Information (SMI) provides a way to define managed objects and their behavior. An agent has in its possession a list of the objects that it tracks. One such object is the operational status of a router interface (for example, up, down, or testing). This list collectively defines the information the NMS can use to determine the overall health of the device on which the agent resides.

 

The Management Information Base (MIB) can be thought of as a database of managed objects that the agent tracks. Any sort of status or statistical information that can be accessed by the NMS is defined in a MIB. The SMI provides a way to define managed objects, while the MIB is the definition (using the SMI syntax) of the objects themselves. Like a dictionary, which shows how to spell a word and then gives its meaning or definition, a MIB defines a textual name for a managed object and explains its meaning. Chapter 2, "A Closer Look at SNMP" goes into more technical detail about MIBs and the SMI.

 

An agent may implement many MIBs, but all agents implement a particular MIB called MIB-II [3] (RFC 1213). This standard defines variables for things such as interface statistics (interface speeds, MTU, octets[4] sent, octets received, etc.) as well as various other things pertaining to the system itself (system location, system contact, etc.). The main goal of MIB-II is to provide general TCP/IP management information. It doesn't cover every possible item a vendor may want to manage within its particular device.

 

[3]MIB-I is the original version of this MIB, but it is no longer referred to since MIB-II enhances it.

 

[4]An octet is an 8-bit quantity, which is the fundamental unit of transfer in TCP/IP networks.
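
The dictionary analogy can be made concrete with a small lookup table. The sketch below is not a MIB parser. It hard-codes a handful of MIB-II objects together with their numeric identifiers (the numbering scheme is explained in Chapter 2), and the table name MIB_II_SAMPLE and both helper functions are ours, chosen only for illustration.

```python
# A few MIB-II (RFC 1213) objects: textual name -> numeric object identifier.
MIB_II_SAMPLE = {
    "sysContact": "1.3.6.1.2.1.1.4",
    "sysLocation": "1.3.6.1.2.1.1.6",
    "ifMtu": "1.3.6.1.2.1.2.2.1.4",
    "ifSpeed": "1.3.6.1.2.1.2.2.1.5",
    "ifInOctets": "1.3.6.1.2.1.2.2.1.10",
    "ifOutOctets": "1.3.6.1.2.1.2.2.1.16",
}


def name_to_oid(name):
    """Look up the numeric identifier for a textual object name."""
    return MIB_II_SAMPLE[name]


def oid_to_name(oid):
    """Reverse lookup; returns None for identifiers not in the sample table."""
    for name, known_oid in MIB_II_SAMPLE.items():
        if known_oid == oid:
            return name
    return None


print(name_to_oid("ifSpeed"))
print(oid_to_name("1.3.6.1.2.1.1.6"))
```

An NMS that has loaded a MIB does essentially this translation on your behalf, so you can ask for ifSpeed instead of typing a string of integers.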

 

What other kinds of information might be useful to collect? First, there are many draft and proposed standards developed to help manage things such as frame relay, ATM, FDDI, and services (mail, DNS, etc.). A sampling of these MIBs and their RFC numbers includes:

 

But that's far from the entire story, which is why vendors, and individuals, are allowed to define MIB variables for their own use.[5] For example, consider a vendor that is bringing a new router to market. The agent built into the router will respond to NMS requests (or send traps to the NMS) for the variables defined by the MIB-II standard; it probably also implements MIBs for the interface types it provides (e.g., RFC 2515 for ATM and RFC 2115 for Frame Relay). In addition, the router may have some significant new features that are worth monitoring but are not covered by any standard MIB. So, the vendor defines its own MIB (sometimes referred to as a proprietary MIB) that implements managed objects for the status and statistical information of their new router.

 

[5]This topic is discussed further in the next chapter.

 

TIP: Simply loading a new MIB into your NMS does not necessarily allow you to retrieve the data, values, or objects defined within that MIB. You need to load only those MIBs supported by the agents you are querying (e.g., with snmpget or snmpwalk). Feel free to load additional MIBs for future device support, but don't panic when your device doesn't answer (and possibly returns errors for) objects from these unsupported MIBs.

1.5. Host Management

Managing host resources (disk space, memory usage, etc.) is an important part of network management. The distinction between traditional system administration and network management has been disappearing over the last decade, and is now all but gone. As Sun Microsystems puts it, "The network is the computer." If your web server or mail server is down, it doesn't matter whether your routers are running correctly -- you're still going to get calls. The Host Resources MIB (RFC 2790) defines a set of objects to help manage critical aspects of Unix and Windows systems.[6]

 

[6]Any operating system running an SNMP agent can implement Host Resources; it's not confined to agents running on Unix and Windows systems.

 

Some of the objects supported by the Host Resources MIB include disk capacity, number of system users, number of running processes, and software currently installed. In today's e-commerce world, more and more people are relying on service-oriented web sites. Making sure your backend servers are functioning properly is as important as monitoring your routers and other communications devices.

 

Unfortunately, some agent implementations for these platforms do not implement this MIB, since it's not required.

1.6. A Brief Introduction to Remote Monitoring (RMON)

Remote Monitoring Version 1 (RMONv1, or RMON) is defined in RFC 2819; an enhanced version of the standard, called RMON Version 2 (RMONv2), is defined in RFC 2021. RMONv1 provides the NMS with packet-level statistics about an entire LAN or WAN. RMONv2 builds on RMONv1 by providing network- and application-level statistics. These statistics can be gathered in several ways. One way is to place an RMON probe on every network segment you want to monitor. Some Cisco routers have limited RMON capabilities built in, so you can use their functionality to perform minor RMON duties. Likewise, some 3Com switches implement the full RMON specification and can be used as full-blown RMON probes.

 

The RMON MIB was designed to allow an actual RMON probe to run in an offline mode that allows the probe to gather statistics about the network it's watching without requiring an NMS to query it constantly. At some later time, the NMS can query the probe for the statistics it has been gathering. Another feature that most probes implement is the ability to set thresholds for various error conditions and, when a threshold is crossed, alert the NMS with an SNMP trap. You can find a little more technical detail about RMON in the next chapter.
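
The threshold idea can be sketched in a few lines. This is a simplification under our own assumptions, not the alarm mechanism defined by the RMON MIB: the function name, the sample values, and the rule "report only when a sample reaches the threshold after being below it" are all illustrative.

```python
def rising_crossings(samples, threshold):
    """Return the indexes at which the series crosses up to the threshold.

    A crossing is reported when a sample is at or above the threshold and
    the previous sample was below it (or there was no previous sample).
    """
    events = []
    previous = None
    for index, value in enumerate(samples):
        if value >= threshold and (previous is None or previous < threshold):
            events.append(index)
        previous = value
    return events


# Hypothetical error counts per sampling interval, gathered offline.
error_counts = [1, 5, 12, 15, 3, 11]
print(rising_crossings(error_counts, threshold=10))  # [2, 5]
```

Reporting only the crossing, rather than every sample above the threshold, keeps a probe from sending a trap on every interval while an error condition persists.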

1.7. Getting More Information

Getting a handle on SNMP may seem like a daunting task. The RFCs provide the official definition of the protocol, but they were written for software developers, not network administrators, so it can be difficult to extract the information you need from them. Fortunately, many online resources are available. The most notable web site is the Network Management Server at the University at Buffalo (http://netman.cit.buffalo.edu). It contains useful links to other sites that provide similar information, as well as a network-management product list (http://netman.cit.buffalo.edu/Products.html) that includes both software and hardware vendors; it even has product reviews. This site is a great starting point in the search for network-management information and can be an extremely useful tool for determining what kinds of hardware and software are currently out there. Two more great web sites are the SimpleWeb (http://www.snmp.cs.utwente.nl) and SNMP Link (http://www.SNMPLink.org). The Simple Times, an online publication devoted to SNMP and network management, is also useful. You can find the current edition, and all the previous ones, at http://www.simple-times.org.

 

Another great resource is Usenet news. The newsgroup most people frequent is comp.dcom.net-management. Another good newsgroup is comp.protocols.snmp. Groups such as these promote a community of information sharing, allowing seasoned professionals to interact with individuals who are not as knowledgeable about SNMP or network management.

 

If you would like to know if a particular vendor has SNMP-compatible equipment, the Internet Assigned Numbers Authority (IANA) has compiled a list of the proprietary MIB files various vendors supply. The list can be found at ftp://ftp.iana.org/mib/. There is also an SNMP FAQ, available in two parts at http://www.faqs.org/faqs/snmp-faq/part1/ and http://www.faqs.org/faqs/snmp-faq/part2/.

Chapter 2. A Closer Look at SNMP

In this chapter, we start to look at SNMP in detail. By the time you finish this chapter, you should understand how SNMP sends and receives information, what exactly SNMP communities are, and how to read MIB files. We'll also look in more detail at the three MIBs that were introduced in Chapter 1, "What Is SNMP?", namely MIB-II, Host Resources, and RMON.

 

 

2.1. SNMP and UDP

SNMP uses the User Datagram Protocol (UDP) as the transport protocol for passing data between managers and agents. UDP, defined in RFC 768, was chosen over the Transmission Control Protocol (TCP) because it is connectionless; that is, no end-to-end connection is made between the agent and the NMS when datagrams (packets) are sent back and forth. This aspect of UDP makes it unreliable, since there is no acknowledgment of lost datagrams at the protocol level. It's up to the SNMP application to determine if datagrams are lost and retransmit them if it so desires. This is typically accomplished with a simple timeout. The NMS sends a UDP request to an agent and waits for a response. The length of time the NMS waits depends on how it's configured. If the timeout is reached and the NMS has not heard back from the agent, it assumes the packet was lost and retransmits the request. The number of times the NMS retransmits packets is also configurable.
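
The timeout-and-retransmit behavior described above can be sketched as follows. This is a minimal illustration, not how any particular NMS is implemented: the function name and its default timeout and retry values are assumptions, and the payload is treated as opaque bytes (building a real SNMP message is outside the scope of this sketch).

```python
import socket


def request_with_retries(sock, payload, address, timeout=2.0, retries=3):
    """Send a datagram and wait for a reply, retransmitting on timeout.

    Returns (reply, attempts). reply is None if every attempt timed out.
    retries is the number of retransmissions after the first attempt.
    """
    sock.settimeout(timeout)
    attempts = 0
    for _ in range(retries + 1):
        attempts += 1
        sock.sendto(payload, address)
        try:
            reply, _peer = sock.recvfrom(65535)
            return reply, attempts
        except socket.timeout:
            # No answer within the timeout: assume the datagram was lost.
            continue
    return None, attempts
```

With a real UDP socket, a call might look like request_with_retries(socket.socket(socket.AF_INET, socket.SOCK_DGRAM), message, ("192.0.2.1", 161)), where the address is a placeholder and message is an already encoded request.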

 

At least as far as regular information requests are concerned, the unreliable nature of UDP isn't a real problem. At worst, the management station issues a request and never receives a response. For traps, the situation is somewhat different. If an agent sends a trap and the trap never arrives, the NMS has no way of knowing that it was ever sent. The agent doesn't even know that it needs to resend the trap, because the NMS is not required to send a response back to the agent acknowledging receipt of the trap.

The upside to the unreliable nature of UDP is that it requires low overhead, so the impact on your network's performance is reduced. SNMP has been implemented over TCP, but this is more for special-case situations in which someone is developing an agent for a proprietary piece of equipment. In a heavily congested and managed network, SNMP over TCP is a bad idea. It's also worth realizing that TCP isn't magic, and that SNMP is designed for working with networks that are in trouble -- if your network never failed, you wouldn't need to monitor it. When a network is failing, a protocol that tries to get the data through but gives up if it can't is almost certainly a better design choice than a protocol that will flood the network with retransmissions in its attempt to achieve reliability.

SNMP uses the UDP port 161 for sending and receiving requests, and port 162 for receiving traps from managed devices. Every device that implements SNMP must use these port numbers as the defaults, but some vendors allow you to change the default ports in the agent's configuration. If these defaults are changed, the NMS must be made aware of the changes so it can query the device on the correct ports.

Figure 2-1 shows the TCP/IP protocol suite, which is the basis for all TCP/IP communication. Today, any device that wishes to communicate on the Internet (e.g., Windows NT systems, Unix servers, Cisco routers, etc.) must use this protocol suite. This model is often referred to as a protocol stack, since each layer uses the information from the layer directly below it and provides a service to the layer directly above it.

 

Figure 2-1. TCP/IP communication model and SNMP

When either an NMS or an agent wishes to perform an SNMP function (e.g., a request or trap), the following events occur in the protocol stack:
Application
First, the actual SNMP application (NMS or agent) decides what it's going to do. For example, it can send an SNMP request to an agent, send a response to an SNMP request (this would be sent from the agent), or send a trap to an NMS. The application layer provides services to an end user, such as an operator requesting status information for a port on an Ethernet switch.
UDP
The next layer, UDP, allows two hosts to communicate with one another. The UDP header contains, among other things, the destination port on the device to which the request or trap is being sent. The destination port will be either 161 (query) or 162 (trap).
IP
The IP layer tries to deliver the SNMP packet to its intended destination, as specified by its IP address.
Medium Access Control (MAC)
The final event that must occur for an SNMP packet to reach its destination is for it to be handed off to the physical network, where it can be routed to its final destination. The MAC layer consists of the actual hardware and device drivers that put your data onto a physical piece of wire, such as an Ethernet card. The MAC layer also is responsible for receiving packets from the physical network and sending them back up the protocol stack so they can be processed by the application layer (SNMP, in this case).
This interaction between SNMP applications and the network is not unlike that between two pen pals. Both have messages that need to be sent back and forth to one another. Let's say you decide to write your pen pal a letter asking if she would like to visit you over the summer. By deciding to send the invitation, you've acted as the SNMP application. Filling out the envelope with your pen pal's address is equivalent to the function of the UDP layer, which records the packet's destination port in the UDP header; in this case it's your pen pal's address. Placing a stamp on the envelope and putting it in the mailbox for the mailman to pick up is equivalent to the IP layer's function. The final act occurs when the mailman comes to your house and picks up the letter. From here the letter will be routed to its final destination, your pen pal's mailbox. The MAC layer of a computer network is equivalent to the mail trucks and airplanes that carry your letter on its way. When your pen pal receives the letter, she will go through the same process to send you a reply.
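
To make the UDP step concrete, the sketch below packs the 8-byte UDP header defined in RFC 768 for a request addressed to port 161. In practice the operating system builds this header for you; the function name, the source port, and the two-byte placeholder payload are illustrative assumptions, and the checksum is left at zero (which UDP over IPv4 permits, meaning "no checksum computed").

```python
import struct

SNMP_PORT = 161       # requests
SNMP_TRAP_PORT = 162  # traps


def udp_header(src_port, dst_port, payload, checksum=0):
    """Pack a UDP header: source port, destination port, length, checksum.

    All four fields are 16-bit unsigned integers in network byte order.
    The length field covers the header (8 bytes) plus the payload.
    """
    length = 8 + len(payload)
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


placeholder_payload = b"\x30\x00"  # stands in for an encoded SNMP message
header = udp_header(49152, SNMP_PORT, placeholder_payload)
print(header.hex())  # c00000a1000a0000
```

The second 16-bit field (00a1 in hexadecimal) is the destination port, 161; a trap would carry 00a2 (162) there instead.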

2.2. SNMP Communities

SNMPv1 and SNMPv2 use the notion of communities to establish trust between managers and agents. An agent is configured with three community names: read-only, read-write, and trap. The community names are essentially passwords; there's no real difference between a community string and the password you use to access your account on the computer. The three community strings control different kinds of activities. As its name implies, the read-only community string lets you read data values, but doesn't let you modify the data. For example, it allows you to read the number of packets that have been transferred through the ports on your router, but doesn't let you reset the counters. The read-write community is allowed to read and modify data values; with the read-write community string, you can read the counters, reset their values, and even reset the interfaces or do other things that change the router's configuration. Finally, the trap community string allows you to receive traps (asynchronous notifications) from the agent.
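
The read-only and read-write distinction amounts to a simple access check. The sketch below shows the idea only; it is not taken from any agent implementation, and the configuration keys, operation names, and community strings are placeholders we made up.

```python
READ_OPERATIONS = {"get", "getnext"}
WRITE_OPERATIONS = {"set"}


def is_allowed(community, operation, config):
    """Decide whether a request carrying this community string may proceed.

    config is a dict with "read_only" and "read_write" community strings.
    The read-write community may also read; the read-only one may not write.
    """
    if operation in READ_OPERATIONS:
        return community in (config["read_only"], config["read_write"])
    if operation in WRITE_OPERATIONS:
        return community == config["read_write"]
    return False


# Placeholder strings; real ones should be chosen like passwords.
agent_config = {"read_only": "r3adOnly-Xy7", "read_write": "wr1te-Qp42z"}

print(is_allowed("r3adOnly-Xy7", "get", agent_config))  # True
print(is_allowed("r3adOnly-Xy7", "set", agent_config))  # False
print(is_allowed("wr1te-Qp42z", "set", agent_config))   # True
```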

 

Most vendors ship their equipment with default community strings, typically public for the read-only community and private for the read-write community. It's important to change these defaults before your device goes live on the network. (You may get tired of hearing this because we say it many times, but it's absolutely essential.) When setting up an SNMP agent, you will want to configure its trap destination, which is the address to which it will send any traps it generates. In addition, since SNMP community strings are sent in clear text, you can configure an agent to send an SNMP authentication-failure trap when someone attempts to query your device with an incorrect community string. Among other things, authentication-failure traps can be very useful in determining when an intruder might be trying to gain access to your network.

 

Because community strings are essentially passwords, you should use the same rules for selecting them as you use for Unix or NT user passwords: no dictionary words, spouse names, etc. An alphanumeric string with mixed upper- and lowercase letters is generally a good idea. As mentioned earlier, the problem with SNMP's authentication is that community strings are sent in plain text, which makes it easy for people to intercept them and use them against you. SNMPv3 addresses this by allowing, among other things, secure authentication and communication between SNMP devices.

 

There are ways to reduce your risk of attack. IP firewalls or filters minimize the chance that someone can harm any managed device on your network by attacking it through SNMP. You can configure your firewall to allow UDP traffic from only a list of known hosts. For example, you can allow UDP traffic on port 161 (SNMP requests) into your network only if it comes from one of your network-management stations. The same goes for traps; you can configure your router so it allows UDP traffic on port 162 to your NMS only if it originates from one of the hosts you are monitoring. Firewalls aren't 100% effective, but simple precautions such as these do a lot to reduce your risk.
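
The filtering policy just described can be written down as a small decision function. This is a sketch of the logic only, not a firewall configuration: the addresses come from the ranges reserved for documentation, and the names MANAGEMENT_STATIONS, MONITORED_HOSTS, and permit are our own.

```python
import ipaddress

# Hypothetical addresses, drawn from documentation-only ranges.
MANAGEMENT_STATIONS = {ipaddress.ip_address("192.0.2.10")}
MONITORED_HOSTS = ipaddress.ip_network("198.51.100.0/24")


def permit(source, destination_port):
    """Apply the SNMP filtering policy to one incoming UDP datagram.

    Port 161 (requests) is accepted only from known management stations.
    Port 162 (traps) is accepted only from hosts that are being monitored.
    This sketch ignores all other ports and simply rejects them.
    """
    source = ipaddress.ip_address(source)
    if destination_port == 161:
        return source in MANAGEMENT_STATIONS
    if destination_port == 162:
        return source in MONITORED_HOSTS
    return False


print(permit("192.0.2.10", 161))    # True: request from an NMS
print(permit("203.0.113.5", 161))   # False: request from an unknown host
print(permit("198.51.100.7", 162))  # True: trap from a monitored host
```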

 

WARNING: It is important to realize that if someone has read-write access to any of your SNMP devices, he can gain control of those devices by using SNMP (for example, he can shut down router interfaces and switch ports, or even modify your routing tables). One way to protect your community strings is to use a Virtual Private Network (VPN) to make sure your network traffic is encrypted. Another way is to change your community strings often. Changing community strings isn't difficult for a small network, but for a network that spans city blocks or more and has dozens (or hundreds or thousands) of managed hosts, changing community strings can be a problem. An easy solution is to write a simple Perl script that uses SNMP to change the community strings on your devices.

2.3. The Structure of Management Information

So far, we have used the term "management information" to refer to the operational parameters of SNMP-capable devices. However, we've said very little about what management information actually contains or how it is represented. The first step toward understanding what kind of information a device can provide is to understand how this data itself is represented within the context of SNMP. The Structure of Management Information Version 1 (SMIv1, RFC 1155) does exactly that: it defines precisely how managed objects[7] are named and specifies their associated datatypes. The Structure of Management Information Version 2 (SMIv2, RFC 2578) provides enhancements for SNMPv2. We'll start by discussing SMIv1 and will discuss SMIv2 in the next section.

 

[7]For the remainder of this book "management information" will be referred to as "managed objects." Similarly, a single piece of management information (such as the operational status of a router interface) will be known as a "managed object."

 

The definition of managed objects can be broken down into three attributes:

 

Name
The name, or object identifier (OID), uniquely defines a managed object. Names commonly appear in two forms: numeric and "human readable." In either case, the names are long and inconvenient. In SNMP applications, a lot of work goes into helping you navigate through the namespace conveniently.

 

Type and syntax
A managed object's datatype is defined using a subset of Abstract Syntax Notation One (ASN.1). ASN.1 is a way of specifying how data is represented and transmitted between managers and agents, within the context of SNMP. The nice thing about ASN.1 is that the notation is machine-independent. This means that a PC running Windows NT can communicate with a Sun SPARC machine and not have to worry about things such as byte ordering.

 

Encoding
A single instance of a managed object is encoded into a string of octets using the Basic Encoding Rules (BER). BER defines how the objects are encoded and decoded so they can be transmitted over a transport medium such as Ethernet.
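
As a small taste of BER, the sketch below encodes an integer as a tag-length-value triple: the tag octet 0x02 identifies an INTEGER, the length octet gives the number of content octets, and the content is the value in big-endian two's complement using as few octets as possible. The function name is ours, and the sketch handles only the short length form (fewer than 128 content octets).

```python
def ber_encode_integer(value):
    """Encode a Python int as a BER INTEGER (tag 0x02, short-form length)."""
    # Bits needed for the magnitude, plus one sign bit, rounded up to octets.
    magnitude = value if value >= 0 else ~value
    length = magnitude.bit_length() // 8 + 1
    if length >= 128:
        raise ValueError("value too large for short-form length")
    content = value.to_bytes(length, byteorder="big", signed=True)
    return bytes([0x02, length]) + content


print(ber_encode_integer(5).hex())    # 020105
print(ber_encode_integer(128).hex())  # 02020080
print(ber_encode_integer(-1).hex())   # 0201ff
```

Note that 128 needs two content octets: a single octet 0x80 would be read back as -128, so a leading zero octet keeps the sign bit clear. Because the octet order is fixed by the encoding, the sender's native byte ordering never matters.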

 

2.3.1. Naming OIDs

Managed objects are organized into a tree-like hierarchy. This structure is the basis for SNMP's naming scheme. An object ID is made up of a series of integers based on the nodes in the tree, separated by dots (.). Although there's a human-readable form that's more friendly than a string of numbers, this form is nothing more than a series of names separated by dots, each of which represents a node of the tree. So you can use the numbers themselves, or you can use a sequence of names that represent the numbers. Figure 2-2 shows the top few levels of this tree. (We have intentionally left out some branches of the tree that don't concern us here.)

 

Figure 2-2. SMI object tree

In the object tree, the node at the top of the tree is called the root, anything with children is called a subtree, and anything without children is called a leaf node. For example, Figure 2-2's root, the starting point for the tree, is called "Root-Node." Its subtree is made up of ccitt(0), iso(1), and joint(2). In this illustration, iso(1) is the only node that contains a subtree; the other two nodes are both leaf nodes. ccitt(0) and joint(2) do not pertain to SNMP, so they will not be discussed in this book.[8]

 

[8]The ccitt subtree is administered by the International Telegraph and Telephone Consultative Committee (CCITT); the joint subtree is administered jointly by the International Organization for Standardization (ISO) and CCITT. As we said, neither branch has anything to do with SNMP.

 

For the remainder of this book we will focus on the iso(1).org(3).dod(6).internet(1) subtree,[9] which is represented in OID form as 1.3.6.1 or as iso.org.dod.internet. Each managed object has a numerical OID and an associated textual name. The dotted-decimal notation is how a managed object is represented internally within an agent; the textual name, like an IP domain name, saves humans from having to remember long, tedious strings of integers.

 

[9]Note that the term "branch" is sometimes used interchangeably with "subtree."

 

The directory branch currently is not used. The management branch, or mgmt, defines a standard set of Internet management objects. The experimental branch is reserved for testing and research purposes. Objects under the private branch are defined unilaterally, which means that individuals and organizations are responsible for defining the objects under this branch. Here is the definition of the internet subtree, as well as all four of its subtrees:

 

internet      OBJECT IDENTIFIER ::= { iso org(3) dod(6) 1 }
directory     OBJECT IDENTIFIER ::= { internet 1 }
mgmt          OBJECT IDENTIFIER ::= { internet 2 }
experimental  OBJECT IDENTIFIER ::= { internet 3 }
private       OBJECT IDENTIFIER ::= { internet 4 }
The first line declares internet as the OID 1.3.6.1, which is defined as a subtree of iso.org.dod, or 1.3.6 (the ::= is a definition operator). The last four declarations are similar, but they define the other branches that belong to internet. For the directory branch, the notation { internet 1 } tells us that it is part of the internet subtree, and that its OID is 1.3.6.1.1. The OID for mgmt is 1.3.6.1.2, and so on.
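The way each definition builds on its parent can be sketched in a few lines of Python. This is only an illustration, not a real MIB compiler: the function name build_oid_table is our own, the parser handles nothing but simple OBJECT IDENTIFIER assignments, and it assumes every parent is defined before it is used.

```python
import re

DEFINITIONS = """
internet      OBJECT IDENTIFIER ::= { iso org(3) dod(6) 1 }
directory     OBJECT IDENTIFIER ::= { internet 1 }
mgmt          OBJECT IDENTIFIER ::= { internet 2 }
experimental  OBJECT IDENTIFIER ::= { internet 3 }
private       OBJECT IDENTIFIER ::= { internet 4 }
enterprises   OBJECT IDENTIFIER ::= { private 1 }
"""

ASSIGNMENT = re.compile(
    r"^\s*([\w-]+)\s+OBJECT\s+IDENTIFIER\s*::=\s*\{([^}]*)\}", re.MULTILINE
)


def build_oid_table(text):
    """Map each defined name to its full numeric OID (a tuple of integers)."""
    table = {"iso": (1,)}                 # the only name we assume up front
    for name, body in ASSIGNMENT.findall(text):
        tokens = body.split()
        oid = table[tokens[0]]            # start from the parent's OID
        for token in tokens[1:]:
            # A token is either a bare number ("1") or name(number) ("org(3)").
            if "(" in token:
                number = token[token.index("(") + 1:-1]
            else:
                number = token
            oid += (int(number),)
        table[name] = oid
    return table


def dotted(oid):
    """Render an OID tuple in dotted-decimal form."""
    return ".".join(str(part) for part in oid)


if __name__ == "__main__":
    for name, oid in build_oid_table(DEFINITIONS).items():
        print(f"{name:13s} {dotted(oid)}")
```

Running the sketch prints each name next to its numeric form, which is the same translation an NMS performs after it has loaded a MIB.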

 

There is currently one branch under the private subtree. It's used to give hardware and software vendors the ability to define their own private objects for any type of hardware or software they want managed by SNMP. Its SMI definition is:

 

enterprises   OBJECT IDENTIFIER ::= { private 1 }
The Internet Assigned Numbers Authority (IANA) currently manages all the private enterprise number assignments for individuals, institutions, organizations, companies, etc.[10] A list of all the current private enterprise numbers can be obtained from ftp://ftp.isi.edu/in-notes/iana/assignments/enterprise-numbers. As an example, Cisco Systems's private enterprise number is 9, so the base OID for its private object space is defined as iso.org.dod.internet.private.enterprises.cisco, or 1.3.6.1.4.1.9. Cisco is free to do as it wishes with this private branch. It's typical for companies such as Cisco that manufacture networking equipment to define their own private enterprise objects. This allows for a richer set of management information than can be gathered from the standard set of managed objects defined under the mgmt branch.

 

[10]The term "private enterprise" will be used throughout this book to refer to the enterprises branch.
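Since every private enterprise OID begins with 1.3.6.1.4.1, it is easy to tell which vendor an OID belongs to. The short Python sketch below shows one way to do it; the helper names are hypothetical and exist only for this illustration.

```python
ENTERPRISES = (1, 3, 6, 1, 4, 1)   # iso.org.dod.internet.private.enterprises


def parse_oid(dotted):
    """Turn '1.3.6.1.4.1.9' (with or without a leading dot) into a tuple of ints."""
    return tuple(int(part) for part in dotted.strip(".").split("."))


def enterprise_number(dotted):
    """Return the private enterprise number of an OID, or None if it has none."""
    oid = parse_oid(dotted)
    prefix_length = len(ENTERPRISES)
    if len(oid) > prefix_length and oid[:prefix_length] == ENTERPRISES:
        return oid[prefix_length]
    return None


if __name__ == "__main__":
    print(enterprise_number("1.3.6.1.4.1.9"))      # Cisco's base OID
    print(enterprise_number("1.3.6.1.2.1.1.6.0"))  # a standard mgmt object
```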

 

Companies aren't the only ones who can register their own private enterprise numbers. Anyone can do so, and it's free. The web-based form for registering private enterprise numbers can be found at http://www.isi.edu/cgi-bin/iana/enterprise.pl. After you fill in the form, which asks for information such as your organization's name and contact information, your request should be approved in about a week. Why would you want to register your own number? When you become more conversant in SNMP, you'll find things you want to monitor that aren't covered by any MIB, public or private. With your own enterprise number, you can create your own private MIB that allows you to monitor exactly what you want. You'll need to be somewhat clever in extending your agents so that they can look up the information you want, but it's very doable.

 

2.3.2. Defining OIDs

The SYNTAX attribute provides for definitions of managed objects through a subset of ASN.1. SMIv1 defines several datatypes that are paramount to the management of networks and network devices. It's important to keep in mind that these datatypes are simply a way to define what kind of information a managed object can hold. The types we'll be discussing are similar to those that you'd find in a computer programming language like C. Table 2-1 lists the supported datatypes for SMIv1.

 

Table 2-1. SMIv1 Datatypes

Datatype

 

Description

 

INTEGER
A 32-bit number often used to specify enumerated types within the context of a single managed object. For example, the operational status of a router interface can be up, down, or testing. With enumerated types, 1 would represent up, 2 down, and 3 testing. The value zero (0) must not be used as an enumerated type, according to RFC 1155.

 

OCTET STRING
A string of zero or more octets (more commonly known as bytes) generally used to represent text strings, but also sometimes used to represent physical addresses.

 

Counter
A 32-bit number with minimum value 0 and maximum value 2^32 - 1 (4,294,967,295). When the maximum value is reached, it wraps back to zero and starts over. It's primarily used to track information such as the number of octets sent and received on an interface or the number of errors and discards seen on an interface. A Counter is monotonically increasing, in that its values should never decrease during normal operation. When an agent is rebooted, all Counter values should be set to zero. Deltas are used to determine if anything useful can be said for successive queries of Counter values. A delta is computed by querying a Counter at least twice in a row, and taking the difference between the query results over some time interval.

 

OBJECT IDENTIFIER
A dotted-decimal string that represents a managed object within the object tree. For example, 1.3.6.1.4.1.9 represents Cisco Systems's private enterprise OID.

 

NULL
Not currently used in SNMP.

 

SEQUENCE
Defines lists that contain zero or more other ASN.1 datatypes.

 

SEQUENCE OF
Defines a managed object that is made up of a SEQUENCE of ASN.1 types.

 

IpAddress
Represents a 32-bit IPv4 address. Neither SMIv1 nor SMIv2 discusses 128-bit IPv6 addresses; this problem will be addressed by the IETF's SMI Next Generation (SMING) working group (see http://www.ietf.org/html.charters/sming-charter.html ).

 

NetworkAddress
Same as the IpAddress type, but can represent different network address types.

 

Gauge
A 32-bit number with minimum value 0 and maximum value 2^32 - 1 (4,294,967,295). Unlike a Counter, a Gauge can increase and decrease at will, but it can never exceed its maximum value. The interface speed on a router is measured with a Gauge.

 

TimeTicks
A 32-bit number with minimum value 0 and maximum value 2^32 - 1 (4,294,967,295). TimeTicks measures time in hundredths of a second. Uptime on a device is measured using this datatype.

 

Opaque
Allows any other ASN.1 encoding to be stuffed into an OCTET STRING.
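The Counter entry in Table 2-1 describes taking deltas between successive queries. A minimal Python sketch of that calculation follows. The function names are our own, and the sketch assumes that the counter wrapped at most once between the two polls and that the agent did not reboot in between (a reboot resets the counter to zero and cannot be told apart from a wrap using these two values alone).

```python
def counter_delta(previous, current, bits=32):
    """Difference between two Counter samples, allowing for a single wrap.

    Working modulo 2**bits means that a sample taken just after the
    counter wrapped back to zero still yields the right increase.
    """
    modulus = 1 << bits
    return (current - previous) % modulus


def rate_per_second(previous, current, interval_seconds, bits=32):
    """Average rate of change between two polls taken interval_seconds apart."""
    if interval_seconds <= 0:
        raise ValueError("the polling interval must be positive")
    return counter_delta(previous, current, bits) / interval_seconds


if __name__ == "__main__":
    # Two hypothetical samples of an octet counter, polled 60 seconds apart.
    print(rate_per_second(1000, 4000, 60))
    # The counter wrapped between these two samples.
    print(counter_delta(4294967290, 5))
```

A single Counter value says little on its own; it is the rate derived from two samples that tells you how busy an interface is.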

 

The goal of all these object types is to define managed objects. In Chapter 1, "What Is SNMP?", we said that a MIB is a logical grouping of managed objects as they pertain to a specific management task, vendor, etc. The MIB can be thought of as a specification that defines the managed objects a vendor or device supports. Cisco, for instance, has literally hundreds of MIBs defined for its vast product line. For example, its Catalyst device has a separate MIB from its 7000 series router. Both devices have different characteristics that require different management capabilities. Vendor-specific MIBs typically are distributed as human-readable text files that can be inspected (or even modified) with a standard text editor such as vi.

TIP: Most modern NMS products maintain a compact form of all the MIBs that define the set of managed objects for all the different types of devices they're responsible for managing. NMS administrators will typically compile a vendor's MIB into a format the NMS can use. Once a MIB has been loaded or compiled, administrators can refer to managed objects using either the numeric or human-readable object ID.

 

It's important to know how to read and understand MIB files. The following example is a stripped-down version of MIB-II (anything preceded by -- is a comment):

 

RFC1213-MIB DEFINITIONS ::= BEGIN

          IMPORTS
                  mgmt, NetworkAddress, IpAddress, Counter, Gauge,
                  TimeTicks
                          FROM RFC1155-SMI
                  OBJECT-TYPE
                          FROM RFC-1212;

          mib-2      OBJECT IDENTIFIER ::= { mgmt 1 }

-- groups in MIB-II

          system       OBJECT IDENTIFIER ::= { mib-2 1 }
          interfaces   OBJECT IDENTIFIER ::= { mib-2 2 }
          at           OBJECT IDENTIFIER ::= { mib-2 3 }
          ip           OBJECT IDENTIFIER ::= { mib-2 4 }
          icmp         OBJECT IDENTIFIER ::= { mib-2 5 }
          tcp          OBJECT IDENTIFIER ::= { mib-2 6 }
          udp          OBJECT IDENTIFIER ::= { mib-2 7 }
          egp          OBJECT IDENTIFIER ::= { mib-2 8 }
          transmission OBJECT IDENTIFIER ::= { mib-2 10 }
          snmp         OBJECT IDENTIFIER ::= { mib-2 11 }
        
          -- the Interfaces table

          -- The Interfaces table contains information on the entity's
          -- interfaces. Each interface is thought of as being
          -- attached to a 'subnetwork.' Note that this term should
          -- not be confused with 'subnet,' which refers to an
          -- addressing-partitioning scheme used in the Internet
          -- suite of protocols.

          ifTable OBJECT-TYPE
              SYNTAX  SEQUENCE OF IfEntry
              ACCESS  not-accessible
              STATUS  mandatory
              DESCRIPTION
                  "A list of interface entries. The number of entries is
                   given by the value of ifNumber."
              ::= { interfaces 2 }

          ifEntry OBJECT-TYPE
              SYNTAX  IfEntry
              ACCESS  not-accessible
              STATUS  mandatory
              DESCRIPTION
                  "An interface entry containing objects at the subnetwork
                   layer and below for a particular interface."
              INDEX   { ifIndex }
              ::= { ifTable 1 }

          IfEntry ::=
              SEQUENCE {
                  ifIndex
                      INTEGER,
                  ifDescr
                      DisplayString,
                  ifType
                      INTEGER,
                  ifMtu
                      INTEGER,
                  ifSpeed
                      Gauge,
                  ifPhysAddress
                      PhysAddress,
                  ifAdminStatus
                      INTEGER,
                  ifOperStatus
                      INTEGER,
                  ifLastChange
                      TimeTicks,
                  ifInOctets
                      Counter,
                  ifInUcastPkts
                      Counter,
                  ifInNUcastPkts
                      Counter,
                  ifInDiscards
                      Counter,
                  ifInErrors
                      Counter,
                  ifInUnknownProtos
                      Counter,
                  ifOutOctets
                      Counter,
                  ifOutUcastPkts
                      Counter,
                  ifOutNUcastPkts
                      Counter,
                  ifOutDiscards
                      Counter,
                  ifOutErrors
                      Counter,
                  ifOutQLen
                      Gauge,
                  ifSpecific
                      OBJECT IDENTIFIER
              }

          ifIndex OBJECT-TYPE
              SYNTAX  INTEGER
              ACCESS  read-only
              STATUS  mandatory
              DESCRIPTION
                  "A unique value for each interface. Its value ranges
                   between 1 and the value of ifNumber. The value for
                   each interface must remain constant at least from one
                   reinitialization of the entity's network-management
                   system to the next reinitialization." 

              ::= { ifEntry 1 }

          ifDescr OBJECT-TYPE
              SYNTAX  DisplayString (SIZE (0..255))
              ACCESS  read-only
              STATUS  mandatory
              DESCRIPTION
                  "A textual string containing information about the
                   interface. This string should include the name of
                   the manufacturer, the product name, and the version
                   of the hardware interface."
              ::= { ifEntry 2 }

END
The first line of this file defines the name of the MIB, in this case RFC1213-MIB. (RFC 1213 is the RFC that defines MIB-II; many of the MIBs we refer to are defined by RFCs). The format of this definition is always the same. The IMPORTS section of the MIB is sometimes referred to as the linkage section. It allows you to import datatypes and OIDs from other MIB files using the IMPORTS clause. This MIB imports the following items from RFC1155-SMI (RFC 1155 defines SMIv1, which we discussed earlier in this chapter):

 

  • mgmt

     

  • NetworkAddress

     

  • IpAddress

     

  • Counter

     

  • Gauge

     

  • TimeTicks

     

It also imports OBJECT-TYPE from RFC 1212, the Concise MIB Definition, which defines how MIB files are written. Each group of items imported using the IMPORTS clause uses a FROM clause to define the MIB file from which the objects are taken.

 

The OIDs that will be used throughout the remainder of the MIB follow the linkage section. This group of lines sets up the top level of the mib-2 subtree. mib-2 is defined as mgmt followed by .1. We saw earlier that mgmt was equivalent to 1.3.6.1.2. Therefore, mib-2 is equivalent to 1.3.6.1.2.1. Likewise, the interfaces group under mib-2 is defined as { mib-2 2 }, or 1.3.6.1.2.1.2.

 

After the OIDs are defined, we get to the actual object definitions. Every object definition has the following format:

 

<name> OBJECT-TYPE
    SYNTAX <datatype>
    ACCESS <either read-only, read-write, write-only, or not-accessible>
    STATUS <either mandatory, optional, or obsolete>
    DESCRIPTION 
        "Textual description describing this particular managed object."
    ::= { <Unique OID that defines this object> }     
The first managed object in our subset of the MIB-II definition is ifTable, which represents a table of network interfaces on a managed device (note that object names are defined using mixed case, with the first letter in lowercase). Here is its definition using ASN.1 notation:

 

ifTable OBJECT-TYPE
    SYNTAX  SEQUENCE OF IfEntry
    ACCESS  not-accessible
    STATUS  mandatory
    DESCRIPTION
        "A list of interface entries. The number of entries is given by
         the value of ifNumber."
    ::= { interfaces 2 }
The SYNTAX of ifTable is SEQUENCE OF IfEntry. This means that ifTable is a table containing the columns defined in IfEntry. The object is not-accessible, which means that there is no way to query an agent for this object's value. Its status is mandatory, which means an agent must implement this object in order to comply with the MIB-II specification. The DESCRIPTION describes what exactly this object is. The unique OID is 1.3.6.1.2.1.2.2, or iso.org.dod.internet.mgmt.mib-2.interfaces.2.

 

Let's now look at the SEQUENCE definition from the MIB file earlier in this section, which is used with the SEQUENCE OF type in the ifTable definition:

 

IfEntry ::=
    SEQUENCE {
        ifIndex
            INTEGER,
        ifDescr
            DisplayString,
        ifType
            INTEGER,
        ifMtu
            INTEGER,
        .
        .
        .
        ifSpecific
            OBJECT IDENTIFIER
    }
Note that the name of the sequence (IfEntry) is mixed-case, but the first letter is capitalized, unlike the object definition for ifTable. This is how a sequence name is defined. A sequence is simply a list of columnar objects and their SMI datatypes, which defines a conceptual table. In this case, we expect to find variables defined by ifIndex, ifDescr, ifType, etc. This table can contain any number of rows; it's up to the agent to manage the rows that reside in the table. It is possible for an NMS to add rows to a table. This operation is covered later, in Section 2.6.4, "The set Operation".

 

Now that we have IfEntry to specify what we'll find in any row of the table, we can look back to the definition of ifEntry (the actual rows of the table) itself:

 

ifEntry OBJECT-TYPE
    SYNTAX  IfEntry
    ACCESS  not-accessible
    STATUS  mandatory
    DESCRIPTION
        "An interface entry containing objects at the subnetwork layer
         and below for a particular interface."
    INDEX   { ifIndex }
    ::= { ifTable 1 }
ifEntry defines a particular row in the ifTable. Its definition is almost identical to that of ifTable, except we have introduced a new clause, INDEX. The index is a unique key used to define a single row in the ifTable. It's up to the agent to make sure the index is unique within the context of the table. If a router has six interfaces, ifTable will have six rows in it. ifEntry's OID is 1.3.6.1.2.1.2.2.1, or iso.org.dod.internet.mgmt.mib-2.interfaces.ifTable.ifEntry. The index for ifEntry is ifIndex, which is defined as:

 

ifIndex OBJECT-TYPE
    SYNTAX  INTEGER
    ACCESS  read-only
    STATUS  mandatory
    DESCRIPTION
        "A unique value for each interface. Its value ranges between
         1 and the value of ifNumber. The value for each interface
         must remain constant at least from one reinitialization of the
         entity's network-management system to the next reinitialization." 
       ::= { ifEntry 1 }
The ifIndex object is read-only, which means we can see its value, but we cannot change it. The final object our MIB defines is ifDescr, which is a textual description for the interface represented by that particular row in the ifTable. Our MIB example ends with the END clause, which marks the end of the MIB. In the actual MIB-II files, each object listed in the IfEntry sequence has its own object definition. In this version of the MIB we list only two of them, in the interest of conserving space.
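To see how the pieces fit together, it can help to picture what a manager does with the values it retrieves from ifTable. Each value arrives under an OID of the form ifEntry.column.index, and the manager regroups them into rows. The Python sketch below illustrates this under some assumptions: the function name and the sample interface descriptions are hypothetical, and only the two columns defined in our stripped-down MIB are given names.

```python
from collections import defaultdict

IF_ENTRY = (1, 3, 6, 1, 2, 1, 2, 2, 1)     # OID of ifEntry
COLUMNS = {1: "ifIndex", 2: "ifDescr"}     # column numbers under ifEntry


def rows_from_varbinds(varbinds):
    """Group ifTable values by row, keyed by the ifIndex found at the end of each OID."""
    rows = defaultdict(dict)
    for oid, value in varbinds.items():
        # Keep only OIDs shaped like ifEntry.<column>.<index>.
        if oid[:len(IF_ENTRY)] != IF_ENTRY or len(oid) != len(IF_ENTRY) + 2:
            continue
        column, index = oid[-2], oid[-1]
        rows[index][COLUMNS.get(column, f"column{column}")] = value
    return dict(rows)


# Hypothetical values for a device with two interfaces.
SAMPLE = {
    IF_ENTRY + (1, 1): 1,
    IF_ENTRY + (1, 2): 2,
    IF_ENTRY + (2, 1): "Ethernet0",
    IF_ENTRY + (2, 2): "Serial0",
}

if __name__ == "__main__":
    for index, row in sorted(rows_from_varbinds(SAMPLE).items()):
        print(index, row)
```

The table itself is never transferred as a single object, which is why ifTable and ifEntry are marked not-accessible; only the individual column instances carry values.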

2.4. Extensions to the SMI in Version 2

SMIv2 extends the SMI object tree by adding the snmpV2 branch to the internet subtree, adding several new datatypes, and making a number of other changes. Figure 2-3 shows how the snmpV2 objects fit into the bigger picture; the OID for this new branch is 1.3.6.1.6.3.1.1, or iso.org.dod.internet.snmpV2.snmpModules.snmpMIB.snmpMIBObjects. SMIv2 also defines some new datatypes, which are summarized in Table 2-2.

 


Figure 2-3. SMIv2 registration tree for SNMPv2

Table 2-2. New Datatypes for SMIv2

Datatype

 

Description

 

Integer32
Same as an INTEGER.

 

Counter32
Same as a Counter.

 

Gauge32
Same as a Gauge.

 

Unsigned32
Represents decimal values in the range of 0 to 2^32 - 1 inclusive.

 

Counter64
Similar to Counter32, but its maximum value is 18,446,744,073,709,551,615. Counter64 is ideal for situations in which a Counter32 may wrap back to 0 in a short amount of time.

 

BITS
An enumeration of nonnegative named bits.
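A little arithmetic shows why Counter64 was needed. The Python sketch below estimates how long an octet counter takes to wrap on a link that is kept completely busy. The function name is our own, and the calculation assumes the worst case of traffic flowing continuously at the full line rate.

```python
def seconds_until_wrap(counter_bits, bits_per_second):
    """Time for an octet counter to count from zero to its wrap point.

    The counter holds 2**counter_bits octets, and each octet is 8 bits
    on the wire.
    """
    octets = 2 ** counter_bits
    return octets * 8 / bits_per_second


if __name__ == "__main__":
    for label, rate in (("10 Mbps", 10e6), ("100 Mbps", 100e6), ("1 Gbps", 1e9)):
        print(f"Counter32 at {label}: {seconds_until_wrap(32, rate):.1f} seconds")
    years = seconds_until_wrap(64, 1e9) / (365 * 24 * 3600)
    print(f"Counter64 at 1 Gbps: about {years:.0f} years")
```

At gigabit speeds a 32-bit octet counter can wrap in well under a minute, so a manager polling every few minutes could miss several wraps; a 64-bit counter removes the problem for all practical purposes.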

 

The definition of an object in SMIv2 has changed slightly from SMIv1. There are some new optional fields, giving you more control over how an object is accessed, allowing you to augment a table by adding more columns, and letting you give better descriptions. Here's the syntax of an object definition for SMIv2. The parts that have changed are the UnitsParts, MAX-ACCESS, STATUS, and AUGMENTS clauses:

<name> OBJECT-TYPE
    SYNTAX <datatype>
    UnitsParts <Optional, see below>
    MAX-ACCESS <See below>
    STATUS <See below>
    DESCRIPTION 
        "Textual description describing this particular managed object."
    AUGMENTS { <name of table> }
    ::= { <Unique OID that defines this object> }
Table 2-3 briefly describes the object definition enhancements made in SMIv2.

 

Table 2-3. SMIv2 Object Definition Enhancements

Object Definition Enhancement

 

Description

 

UnitsParts
A textual description of the units (e.g., seconds, milliseconds, etc.) used to represent the object. In a MIB file, this clause is written with the UNITS keyword.

 

MAX-ACCESS
In SMIv2, an OBJECT-TYPE's ACCESS clause is replaced by MAX-ACCESS. The valid options for MAX-ACCESS are read-only, read-write, read-create, not-accessible, and accessible-for-notify.

 

STATUS
This clause has been extended to allow the current, obsolete, and deprecated keywords. current in SNMPv2 is the same as mandatory in an SNMPv1 MIB.

 

AUGMENTS
In some cases it is useful to add a column to an existing table. The AUGMENTS clause allows you to extend a table by adding one or more columns, represented by some other object. This clause requires the name of the table the object will augment.

 

SMIv2 defines a new trap type called NOTIFICATION-TYPE, which we will discuss later in Section 2.6.7, "SNMP Notification". SMIv2 also introduces new textual conventions that allow managed objects to be created in more abstract ways. RFC 2579 defines the textual conventions used by SNMPv2, which are listed in Table 2-4.

Table 2-4. Textual Conventions for SMIv2

Textual Convention

 

Description

 

DisplayString
A string of NVT ASCII characters. A DisplayString can be no more than 255 characters in length.

 

PhysAddress
A media- or physical-level address, represented as an OCTET STRING.

 

MacAddress
Defines the media-access address for IEEE 802 (the standard for local area networks) in canonical[11] order. (In everyday language, this means the Ethernet address.) This address is represented as six octets.

 

TruthValue
Defines both true and false Boolean values.

 

TestAndIncr
Used to keep two management stations from modifying the same managed object at the same time.

 

AutonomousType
An OID used to define a subtree with additional MIB-related definitions.

 

VariablePointer
A pointer to a particular object instance, such as the ifDescr for interface 3. In this case, the VariablePointer would be the OID ifDescr.3.

 

RowPointer
A pointer to a row in a table. For example, ifIndex.3 points to the third row in the ifTable.

 

RowStatus
Used to manage the creation and deletion of rows in a table, since SNMP has no way of doing this via the protocol itself. RowStatus can keep track of the state of a row in a table, as well as receive commands for creation and deletion of rows. This textual convention is designed to promote table integrity when more than one manager is updating rows. The following enumerated types define the commands and state variables: active(1), notInService(2), notReady(3), createAndGo(4), createAndWait(5), and destroy(6).

 

TimeStamp
Records the value of the device's system uptime (sysUpTime) at the moment some event or occurrence took place.

 

TimeInterval
Measures a period of time in hundredths of a second. TimeInterval can take any integer value from 0-2147483647.

 

DateAndTime
An OCTET STRING used to represent date-and-time information.

 

StorageType
Defines the type of memory an agent uses. The possible values are other(1), volatile(2), nonVolatile(3), permanent(4), and readOnly(5).

 

TDomain
Denotes a kind of transport service.

 

TAddress
Denotes the transport service address. TAddress is defined to be from 1-255 octets in length.

 

[11]Canonical order means that the address should be represented with the least-significant bit first.
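Several of the textual conventions in Table 2-4 are simply integers with agreed-upon names. A management script might model them as enumerations, as in the Python sketch below. The class and function names mirror the convention names but are our own choice, not part of any SNMP library; the TruthValue numbering of true(1) and false(2) comes from RFC 2579 rather than from the table.

```python
from enum import IntEnum


class RowStatus(IntEnum):
    """Row states and commands listed for the RowStatus convention."""
    active = 1
    notInService = 2
    notReady = 3
    createAndGo = 4
    createAndWait = 5
    destroy = 6


class StorageType(IntEnum):
    """Kinds of memory listed for the StorageType convention."""
    other = 1
    volatile = 2
    nonVolatile = 3
    permanent = 4
    readOnly = 5


class TruthValue(IntEnum):
    """Boolean values as numbered in RFC 2579; note that false is 2, not 0."""
    true = 1
    false = 2


def describe(convention, value):
    """Render a raw integer in the name(number) style used in MIB files."""
    member = convention(value)
    return f"{member.name}({member.value})"


if __name__ == "__main__":
    print(describe(RowStatus, 4))
    print(describe(StorageType, 3))
```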

2.5. A Closer Look at MIB-II

MIB-II is a very important management group, because every device that supports SNMP must also support MIB-II. Therefore, we will use objects from MIB-II in our examples throughout this book. We won't go into detail about every object in the MIB; we'll simply define the subtrees. The section of RFC1213-MIB that defines the base OIDs for the mib-2 subtree looks like this:

 

mib-2        OBJECT IDENTIFIER ::= { mgmt 1 }
system       OBJECT IDENTIFIER ::= { mib-2 1 }
interfaces   OBJECT IDENTIFIER ::= { mib-2 2 }
at           OBJECT IDENTIFIER ::= { mib-2 3 }
ip           OBJECT IDENTIFIER ::= { mib-2 4 }
icmp         OBJECT IDENTIFIER ::= { mib-2 5 }
tcp          OBJECT IDENTIFIER ::= { mib-2 6 }
udp          OBJECT IDENTIFIER ::= { mib-2 7 }
egp          OBJECT IDENTIFIER ::= { mib-2 8 }
transmission OBJECT IDENTIFIER ::= { mib-2 10 }
snmp         OBJECT IDENTIFIER ::= { mib-2 11 }
mib-2 is defined as iso.org.dod.internet.mgmt.1, or 1.3.6.1.2.1. From here, we can see that the system group is mib-2 1, or 1.3.6.1.2.1.1, and so on.
Figure 2-4 shows the MIB-II subtree of the mgmt branch.

 


Figure 2-4. MIB-II subtree

Table 2-5 briefly describes each of the management groups defined in MIB-II. We don't go into great detail about each group, since you can pull down RFC 1213 and read the MIB yourself.

 

Table 2-5. Brief Description of the MIB-II Groups

Subtree Name (OID)
Description

system (1.3.6.1.2.1.1)
Defines a list of objects that pertain to system operation, such as the system uptime, system contact, and system name.

interfaces (1.3.6.1.2.1.2)
Keeps track of the status of each interface on a managed entity. The interfaces group monitors which interfaces are up or down and tracks such things as octets sent and received, errors and discards, etc.

at (1.3.6.1.2.1.3)
The address translation (at) group is deprecated and is provided only for backward compatibility. It will probably be dropped from MIB-III.

ip (1.3.6.1.2.1.4)
Keeps track of many aspects of IP, including IP routing.

icmp (1.3.6.1.2.1.5)
Tracks things such as ICMP errors, discards, etc.

tcp (1.3.6.1.2.1.6)
Tracks, among other things, the state of the TCP connection (e.g., closed, listen, synSent, etc.).

udp (1.3.6.1.2.1.7)
Tracks UDP statistics, datagrams in and out, etc.

egp (1.3.6.1.2.1.8)
Tracks various statistics about EGP and keeps an EGP neighbor table.

transmission (1.3.6.1.2.1.10)
There are currently no objects defined for this group, but other media-specific MIBs are defined using this subtree.

snmp (1.3.6.1.2.1.11)
Measures the performance of the underlying SNMP implementation on the managed entity and tracks things such as the number of SNMP packets sent and received.

 

2.6. SNMP Operations

We've discussed how SNMP organizes information, but we've left out how we actually go about gathering management information. Now, we're going to take a look under the hood to see how SNMP does its thing.

 

The Protocol Data Unit (PDU) is the message format that managers and agents use to send and receive information. There is a standard PDU format for each of the following SNMP operations:

 

  • get

     

  • get-next

     

  • get-bulk (SNMPv2 and SNMPv3)

     

  • set

     

  • get-response

     

  • trap

     

  • notification (SNMPv2 and SNMPv3)

     

  • inform (SNMPv2 and SNMPv3)

     

  • report (SNMPv2 and SNMPv3)

     

Let's take a look at each of these operations.

 

2.6.1. The get Operation

The get request is initiated by the NMS, which sends the request to the agent. The agent receives the request and processes it to the best of its ability. Some devices that are under heavy load, such as routers, may not be able to respond to the request and will have to drop it. If the agent is successful in gathering the requested information, it sends a get-response back to the NMS, where it is processed. This process is illustrated in Figure 2-5.

 


Figure 2-5. get request sequence

How did the agent know what the NMS was looking for? One of the items in the get request is a variable binding. A variable binding, or varbind, is a list of MIB objects that allows a request's recipient to see what the originator wants to know. Variable bindings can be thought of as OID=value pairs that make it easy for the originator (the NMS, in this case) to pick out the information it needs when the recipient fills the request and sends back a response. Let's look at this operation in action:

 

$ snmpget cisco.ora.com public .1.3.6.1.2.1.1.6.0

system.sysLocation.0 = ""
TIP: All the Unix commands presented in this chapter come from the Net-SNMP agent package (formerly the UCD-SNMP project), a freely available Unix and Windows NT agent. Chapter 5, "Network-Management Software" provides a URL from which you can download the package. The commands in this package are summarized in Appendix C, "Net-SNMP Tools".

 

Several things are going on in this example. First, we're running a command on a Unix host. The command is called snmpget. Its main job is to facilitate the gathering of management data using a get request. We've given it three arguments on the command line: the name of the device we would like to query (cisco.ora.com), the read-only community string (public), and the OID we would like gathered (.1.3.6.1.2.1.1.6.0). If we look back at Table 2-5 we see that 1.3.6.1.2.1.1 is the system group, but there are two more integers at the end of the OID: .6 and .0. The .6 is actually the MIB variable that we wish to query; its human-readable name is sysLocation. In this case, we would like to see what the system location is set to on the Cisco router. As you can see by the response (system.sysLocation.0 = ""), the system location on this router currently is not set to anything. Also note that the response from snmpget is in variable binding format, OID=value.

 

There is one more thing to look at. Why does the MIB variable have a .0 tacked on the end? In SNMP, MIB objects are defined by the convention x.y, where x is the actual OID of the managed object (in our example, 1.3.6.1.2.1.1.6) and y is the instance identifier. For scalar objects (that is, objects that aren't defined as a row in a table) y is always 0. In the case of a table, the instance identifier lets you select a specific row of the table; 1 is the first row, 2 is the second row, etc. For example, consider the ifTable object we looked at earlier in this chapter. When looking up values in the ifTable, we would use a nonzero instance identifier to select a particular row in the table (in this case, a particular network interface).
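The x.y convention is simple enough to express in a few lines of Python. In this sketch the helper names are hypothetical; the OIDs are the ones for sysLocation and for the ifDescr column of ifTable, as defined earlier in this chapter.

```python
SYS_LOCATION = "1.3.6.1.2.1.1.6"      # a scalar object in the system group
IF_DESCR = "1.3.6.1.2.1.2.2.1.2"      # the ifDescr column: ifEntry.2


def scalar_instance(object_oid):
    """A scalar object has exactly one instance, identified by .0."""
    return f"{object_oid}.0"


def table_instance(column_oid, row_index):
    """A columnar object is selected by appending the row's index value."""
    return f"{column_oid}.{row_index}"


def split_instance(instance_oid):
    """Split x.y back into the object OID (x) and the instance identifier (y)."""
    object_oid, _, instance = instance_oid.rpartition(".")
    return object_oid, int(instance)


if __name__ == "__main__":
    print(scalar_instance(SYS_LOCATION))   # the OID queried with snmpget
    print(table_instance(IF_DESCR, 3))     # ifDescr for the interface with index 3
```

Note that split_instance only works for single-integer instance identifiers such as these; tables indexed by more than one value have longer instance identifiers.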

 

TIP: Graphical NMS applications, which include most commercial packages, do not use command-line programs to retrieve management information. We use these commands to give you a feel for how the retrieval commands work and what they typically return. The information a graphical NMS retrieves and its retrieval process are identical to these command-line programs; the NMS just lets you formulate queries and displays the results using a more convenient GUI.

 

The get command is useful for retrieving a single MIB object at a time. Trying to manage anything in this manner can be a waste of time, though. This is where the get-next command comes in. It allows you to retrieve more than one object from a device, over a period of time.

 

2.6.2. The get-next Operation

The get-next operation lets you issue a sequence of commands to retrieve a group of values from a MIB. In other words, for each MIB object we want to retrieve, a separate get-next request and get-response are generated. The get-next command traverses a subtree in lexicographic order. Since an OID is a sequence of integers, it's easy for an agent to start at the root of its SMI object tree and work its way down until it finds the OID it is looking for. When the NMS receives a response from the agent for the get-next command it just issued, it issues another get-next command. It keeps doing this until the agent returns an error, signifying that the end of the MIB has been reached and there are no more objects left to get.

 

If we look at another example, we can see this behavior in action. This time we'll use a command called snmpwalk. This command simply facilitates the get-next procedure for us. It's invoked just like the snmpget command, except this time we specify which branch to start at (in this case, the system group):

 

$ snmpwalk cisco.ora.com public system
system.sysDescr.0 = "Cisco Internetwork Operating System Software 
..IOS (tm) 2500 Software (C2500-I-L), Version 11.2(5), RELEASE 
SOFTWARE (fc1)..Copyright (c) 1986-1997 by cisco Systems, Inc...
Compiled Mon 31-Mar-97 19:53 by ckralik"
system.sysObjectID.0 = OID: enterprises.9.1.19
system.sysUpTime.0 = Timeticks: (27210723) 3 days, 3:35:07.23
system.sysContact.0 = ""
system.sysName.0 = "cisco.ora.com"
system.sysLocation.0 = ""
system.sysServices.0 = 6
The get-next sequence returns seven MIB variables. Each of these objects is part of the system group as it's defined in RFC 1213. We see a system object ID, the amount of time the system has been up, the contact person, etc.

 

Given that you've just looked up some object, how does get-next figure out which object to look up next? get-next is based on the concept of the lexicographic ordering of the MIB's object tree. This order is made much simpler because every node in the tree is assigned a number. To understand what this means, let's start at the root of the tree and walk down to the system node.

 

To get to the system group (OID 1.3.6.1.2.1.1), we start at the root of the object tree and work our way down. Figure 2-6 shows the logical progression from the root of the tree all the way to the system group. At each node in the tree, we visit the lowest-numbered branch. Thus, when we're at the root node, we start by visiting ccitt. This node has no nodes underneath it, so we move to the iso node. Since iso does have a child, we move to that node, org. The process continues until we reach the system node. Since each branch is made up of ascending integers (ccitt(0) iso(1) joint(2), for example), the agent has no problem traversing this tree structure all the way down to the system(1) group. If we were to continue this walk, we'd proceed to system.1 (system.sysDescr), system.2, and the other objects in the system group. Next, we'd go to interfaces(2), and so on.

 

Figure 2-6. Walking the MIB tree
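The walk described above can be sketched in a few lines of Python. This is a toy model, not how any particular agent is implemented: the MIB is a small dictionary of sample values, OIDs are stored as integer tuples so that Python's tuple comparison provides the lexicographic order, and the names get_next and walk are hypothetical.

```python
from bisect import bisect_right

# A toy agent MIB: OIDs as integer tuples mapped to sample values.
MIB = {
    (1, 3, 6, 1, 2, 1, 1, 4, 0): "",               # sysContact.0
    (1, 3, 6, 1, 2, 1, 1, 5, 0): "cisco.ora.com",  # sysName.0
    (1, 3, 6, 1, 2, 1, 1, 6, 0): "",               # sysLocation.0
    (1, 3, 6, 1, 2, 1, 2, 1, 0): 2,                # ifNumber.0
}
ORDERED = sorted(MIB)  # tuple comparison yields lexicographic OID order


def get_next(oid):
    """Return the first (oid, value) strictly after oid, or None at the end of the MIB."""
    i = bisect_right(ORDERED, oid)
    if i == len(ORDERED):
        return None
    nxt = ORDERED[i]
    return nxt, MIB[nxt]


def walk(root):
    """Repeat get-next from root until the subtree or the MIB is exhausted."""
    results, current = [], root
    while (hit := get_next(current)) is not None:
        current, value = hit
        if current[:len(root)] != root:
            break  # the next object lies outside the requested branch
        results.append((current, value))
    return results


for oid, value in walk((1, 3, 6, 1, 2, 1, 1)):
    print(".".join(map(str, oid)), "=", repr(value))
```

Comparing OIDs as integer tuples rather than as strings matters: a string comparison would place 1.10 before 1.9, while the numeric comparison keeps the branches in ascending order.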

2.6.3. The get-bulk Operation

SNMPv2 defines the get-bulk operation, which allows a management application to retrieve a large section of a table at once. The standard get operation can attempt to retrieve more than one MIB object at once, but message sizes are limited by the agent's capabilities. If the agent can't return all the requested responses, it returns an error message with no data. The get-bulk operation, on the other hand, tells the agent to send as much of the response back as it can. This means that incomplete responses are possible. Two fields must be set when issuing a get-bulk command: nonrepeaters and max-repetitions. Nonrepeaters tells the get-bulk command that the first N objects can be retrieved with a simple get-next operation. Max-repetitions tells the get-bulk command to attempt up to M get-next operations to retrieve the remaining objects. Figure 2-7 shows the get-bulk command sequence.

 

Figure 2-7. get-bulk request sequence

In Figure 2-7, we're requesting three bindings: sysDescr, ifInOctets, and ifOutOctets. The total number of variable bindings that we've requested is given by the formula N + (M * R), where N is the number of nonrepeaters (i.e., scalar objects in the request -- in this case 1, because sysDescr is the only scalar object), M is max-repetitions (in this case, we've set it arbitrarily to 3), and R is the number of nonscalar objects in the request (in this case 2, because ifInOctets and ifOutOctets are both nonscalar). Plugging in the numbers from this example, we get 1 + (3 * 2) = 7, which is the total number of variable bindings that can be returned by this get-bulk request.
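To make the N + (M * R) arithmetic concrete, here is a rough Python simulation of the request above. It is only a sketch under stated assumptions: the agent is a dictionary with made-up counter values, the constant and function names are hypothetical, and real agents may truncate the response when it does not fit in one message.

```python
from bisect import bisect_right

# OIDs as integer tuples (hypothetical constant names).
SYS_DESCR = (1, 3, 6, 1, 2, 1, 1, 1)
IF_IN_OCTETS = (1, 3, 6, 1, 2, 1, 2, 2, 1, 10)
IF_OUT_OCTETS = (1, 3, 6, 1, 2, 1, 2, 2, 1, 16)

# Toy agent data with made-up values: one scalar plus two columns of three rows.
MIB = {SYS_DESCR + (0,): "sample description"}
for row in (1, 2, 3):
    MIB[IF_IN_OCTETS + (row,)] = 100 * row
    MIB[IF_OUT_OCTETS + (row,)] = 200 * row
ORDERED = sorted(MIB)


def get_next(oid):
    """Return the first OID strictly after oid in lexicographic order, or None."""
    i = bisect_right(ORDERED, oid)
    return ORDERED[i] if i < len(ORDERED) else None


def get_bulk(oids, nonrepeaters, max_repetitions):
    """Simulate get-bulk: one get-next for the first N OIDs, up to M for each of the rest."""
    bindings = []
    for oid in oids[:nonrepeaters]:
        nxt = get_next(oid)
        if nxt is not None:
            bindings.append((nxt, MIB[nxt]))
    cursors = list(oids[nonrepeaters:])
    for _ in range(max_repetitions):
        for k, oid in enumerate(cursors):
            nxt = get_next(oid) if oid is not None else None
            cursors[k] = nxt
            if nxt is not None:
                bindings.append((nxt, MIB[nxt]))
    return bindings


request = [SYS_DESCR, IF_IN_OCTETS, IF_OUT_OCTETS]
result = get_bulk(request, nonrepeaters=1, max_repetitions=3)
print(len(result))  # N + (M * R) = 1 + (3 * 2)
for oid, value in result:
    print(".".join(map(str, oid)), "=", value)
```

The repeating objects are advanced together, one repetition at a time, which is why the bindings come back interleaved by row rather than grouped by column.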

 

The Net-SNMP package comes with a command for issuing get-bulk queries. If we execute this command using all the parameters previously discussed, it will look like the following:

 

$ snmpbulkget -v2c -B 1 3 linux.ora.com public sysDescr ifInOctets ifOutOctets
system.sysDescr.0 = "Linux linux 2.2.5-15 #3 Thu May 27 19:33:18 EDT 1999 i686"
interfaces.ifTable.ifEntry.ifInOctets.1 = 70840
interfaces.ifTable.ifEntry.ifOutOctets.1 = 70840
interfaces.ifTable.ifEntry.ifInOctets.2 = 143548020
interfaces.ifTable.ifEntry.ifOutOctets.2 = 111725152
interfaces.ifTable.ifEntry.ifInOctets.3 = 0
interfaces.ifTable.ifEntry.ifOutOctets.3 = 0
Since get-bulk is an SNMPv2 command, you have to tell snmpbulkget to use an SNMPv2 PDU with the -v2c option. The nonrepeaters and max-repetitions are set with the -B 1 3 option. This sets nonrepeaters to 1 and max-repetitions to 3. Notice that the command returned seven variable bindings: one for sysDescr and three each for ifInOctets and ifOutOctets.

 

2.6.4. The set Operation

The set command is used to change the value of a managed object or to create a new row in a table. Objects that are defined in the MIB as read-write or write-only can be altered or created using this command. It is possible for an NMS to set more than one object at a time.

 

Figure 2-8. set request sequence

Figure 2-8 shows the set request sequence. It's similar to the other commands we've seen so far, but it actually changes something in the device's configuration, as opposed to just retrieving a response to a query. An example of an actual set shows the command in action. The following example queries the sysLocation variable, then sets it to a value:

 

$ snmpget cisco.ora.com public system.sysLocation.0
system.sysLocation.0 = ""
$ snmpset cisco.ora.com private system.sysLocation.0 s "Atlanta, GA"
system.sysLocation.0 = "Atlanta, GA"
$ snmpget cisco.ora.com public system.sysLocation.0
system.sysLocation.0 = "Atlanta, GA"
The first command is the familiar get command, which displays the current value of sysLocation. In one of the previous examples we saw that it was undefined; this is still the case. The second command is snmpset. For this command, we supply the hostname, the read-write community string (private), and the variable we want to set (system.sysLocation.0), together with its new value (s "Atlanta, GA"). The s tells snmpset that we want to set the value of sysLocation to a string; and "Atlanta, GA" is the new value itself. How do we know that sysLocation requires a string value? The definition of sysLocation in RFC 1213 looks like this:

 

sysLocation OBJECT-TYPE
    SYNTAX  DisplayString (SIZE (0..255))
    ACCESS  read-write
    STATUS  mandatory
    DESCRIPTION
        "The physical location of this node (e.g., 'telephone closet,
         3rd floor')."
    ::= { system 6 }
The SYNTAX for sysLocation is DisplayString (SIZE (0..255)), which means that it's a string with a maximum length of 255 characters. The snmpset command succeeds and reports the new value of sysLocation. But just to confirm, we run a final snmpget, which tells us that the set actually took effect. It is possible to set more than one object at a time, but if any of the sets fail, they all fail (i.e., no values are changed). This behavior is intended.
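The all-or-nothing behavior can be pictured with a small Python sketch. It is a simplification rather than a real agent: the agent state is a plain dictionary, every object is assumed to be a DisplayString limited to 255 characters, and the function names validate and snmp_set are hypothetical. The returned strings reuse the SNMPv1 error names.

```python
# Toy agent state; the size limit mirrors DisplayString (SIZE (0..255)).
MAX_DISPLAY_STRING = 255
agent = {"sysContact.0": "", "sysLocation.0": ""}


def validate(name, value):
    """Return an SNMPv1-style error name, or None if the assignment is acceptable."""
    if name not in agent:
        return "noSuchName"
    if not isinstance(value, str) or len(value) > MAX_DISPLAY_STRING:
        return "badValue"
    return None


def snmp_set(assignments):
    """Apply every assignment, or none of them if any one fails validation."""
    for name, value in assignments.items():
        error = validate(name, value)
        if error is not None:
            return error       # nothing has been changed yet
    agent.update(assignments)  # all checks passed, so commit them together
    return "noError"


print(snmp_set({"sysLocation.0": "Atlanta, GA"}))
# The second request mixes a valid value with one that is too long,
# so neither object is changed.
print(snmp_set({"sysContact.0": "ops", "sysLocation.0": "x" * 256}))
print(agent)
```

Validating every assignment before changing anything is the simplest way to get this behavior; an agent that applies changes as it goes has to be able to undo them instead.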

 

2.6.5. get, get-next, get-bulk, and set Error Responses

Error responses help you determine whether your get or set request was processed correctly by the agent. The get, get-next, and set operations can return the error responses shown in Table 2-6. The error status for each error is shown in parentheses.

 

Table 2-6. SNMPv1 Error Messages

SNMPv1 Error Message  Description
noError(0)            There was no problem performing the request.
tooBig(1)             The response to your request was too big to fit into one response.
noSuchName(2)         An agent was asked to get or set an OID that it can't find; i.e., the OID doesn't exist.
badValue(3)           A read-write or write-only object was set to an inconsistent value.
readOnly(4)           This error is generally not used. The noSuchName error is equivalent to this one.
genErr(5)             This is a catch-all error. If an error occurs for which none of the previous messages is appropriate, a genErr is issued.

 

The SNMPv1 error messages are not very robust. In an attempt to fix this problem, SNMPv2 defines additional error responses that are valid for get, set, get-next, and get-bulk operations, provided that both the agent and NMS support SNMPv2. These responses are listed in Table 2-7.

Table 2-7. SNMPv2 Error Messages

SNMPv2 Error Message     Description
noAccess(6)              A set to an inaccessible variable was attempted. This typically occurs when the variable has an ACCESS type of not-accessible.
wrongType(7)             An object was set to a type that is different from its definition. This error will occur if you try to set an object that is of type INTEGER to a string, for example.
wrongLength(8)           An object's value was set to something other than what it calls for. For instance, a string can be defined to have a maximum character size. This error occurs if you try to set a string object to a value that exceeds its maximum length.
wrongEncoding(9)         A set operation was attempted using the wrong encoding for the object being set.
wrongValue(10)           A variable was set to a value it doesn't understand. This can occur when a read-write object is defined as an enumeration, and you try to set it to a value that is not one of the enumerated types.
noCreation(11)           You tried to set a nonexistent variable or create a variable that doesn't exist in the MIB.
inconsistentValue(12)    A MIB variable is in an inconsistent state, and is not accepting any set requests.
resourceUnavailable(13)  No system resources are available to perform a set.
commitFailed(14)         This is a catch-all error for set failures.
undoFailed(15)           A set failed and the agent was unable to roll back all the previous sets up until the point of failure.
authorizationError(16)   An SNMP command could not be authenticated; in other words, someone has supplied an incorrect community string.
notWritable(17)          A variable will not accept a set, even though it is supposed to.
inconsistentName(18)     You attempted to set a variable, but that attempt failed because the variable was in some kind of inconsistent state.
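For scripts that process responses, the error-status numbers from Tables 2-6 and 2-7 can be kept in a single lookup. The Python sketch below is one way to do that; the class name ErrorStatus and the helper describe are hypothetical, not part of any SNMP package.

```python
from enum import IntEnum


class ErrorStatus(IntEnum):
    """Error-status values: 0-5 are the SNMPv1 errors, 6-18 were added by SNMPv2."""
    noError = 0
    tooBig = 1
    noSuchName = 2
    badValue = 3
    readOnly = 4
    genErr = 5
    noAccess = 6
    wrongType = 7
    wrongLength = 8
    wrongEncoding = 9
    wrongValue = 10
    noCreation = 11
    inconsistentValue = 12
    resourceUnavailable = 13
    commitFailed = 14
    undoFailed = 15
    authorizationError = 16
    notWritable = 17
    inconsistentName = 18


def describe(status: int) -> str:
    """Format a numeric error status the way the tables above print it."""
    try:
        return f"{ErrorStatus(status).name}({status})"
    except ValueError:
        return f"unknown({status})"


print(describe(2))
print(describe(12))
```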

 

 

2.6.6. SNMP Traps

A trap is a way for an agent to tell the NMS that something bad has happened. In Section 1.3, "Managers and Agents," of Chapter 1, "What Is SNMP?", we explored the notion of traps at a general level; now we'll look at them in a bit more detail. Figure 2-9 shows the trap-generation sequence.

 

Figure 2-9. Trap generation

The trap originates from the agent and is sent to the trap destination, as configured within the agent itself. The trap destination is typically the IP address of the NMS. No acknowledgment is sent from the NMS to the agent, so the agent has no way of knowing if the trap makes it to the NMS. Since SNMP uses UDP, and since traps are designed to report problems with your network, traps are especially prone to getting lost and not making it to their destinations. However, the fact that traps can get lost doesn't make them any less useful; in a well-planned environment, they are an integral part of network management. It's better for your equipment to try to tell you that something is wrong, even if the message may never reach you, than simply to give up and let you guess what happened. Here are a few situations that a trap might report:

 

  • A network interface on the device (where the agent is running) has gone down.

     

  • A network interface on the device (where the agent is running) has come back up.

     

  • An incoming call to a modem rack was unable to establish a connection to a modem.

     

  • The fan on a switch or router has failed.

     

When an NMS receives a trap, it needs to know how to interpret it; that is, it needs to know what the trap means and how to interpret the information it carries. A trap is first identified by its generic trap number. There are seven generic trap numbers (0-6), shown in Table 2-8. Generic trap 6 is a special catch-all category for "enterprise-specific" traps, which are traps defined by vendors or users that fall outside of the six generic trap categories. Enterprise-specific traps are further identified by an enterprise ID (i.e., an object ID somewhere in the enterprises branch of the MIB tree, iso.org.dod.internet.private.enterprises) and a specific trap number chosen by the enterprise that defined the trap. Thus, the object ID of an enterprise-specific trap is enterprise-id.specific-trap-number. For example, when Cisco defines special traps for its private MIBs, it places them all in its enterprise-specific MIB tree (iso.org.dod.internet.private.enterprises.cisco). As we'll see in Chapter 10, "Traps", you are free to define your own enterprise-specific traps; the only requirement is that you register your own enterprise number with IANA.

 

A trap is usually packed with information. As you'd expect, this information is in the form of MIB objects and their values; as mentioned earlier, these object-value pairs are known as variable bindings. For the generic traps 0 through 5, knowledge of what the trap contains is generally built into the NMS software or trap receiver. The variable bindings contained by an enterprise-specific trap are determined by whoever defined the trap. For example, if a modem in a modem rack fails, the rack's agent may send a trap to the NMS informing it of the failure. The trap will most likely be an enterprise-specific trap defined by the rack's manufacturer; the trap's contents are up to the manufacturer, but it will probably contain enough information to let you determine exactly what failed (for example, the position of the modem card in the rack and the channel on the modem card).

 

Table 2-8. Generic Traps

Generic Trap Name and Number  Definition
coldStart (0)                 Indicates that the agent has rebooted. All management variables will be reset; specifically, Counters and Gauges will be reset to zero (0). One nice thing about the coldStart trap is that it can be used to determine when new hardware is added to the network. When a device is powered on, it sends this trap to its trap destination. If the trap destination is set correctly (i.e., to the IP address of your NMS) the NMS can receive the trap and determine whether it needs to manage the device.
warmStart (1)                 Indicates that the agent has reinitialized itself. None of the management variables will be reset.
linkDown (2)                  Sent when an interface on a device goes down. The first variable binding identifies which interface went down.
linkUp (3)                    Sent when an interface on a device comes back up. The first variable binding identifies which interface came back up.
authenticationFailure (4)     Indicates that someone has tried to query your agent with an incorrect community string; useful in determining if someone is trying to gain unauthorized access to one of your devices.
egpNeighborLoss (5)           Indicates that an Exterior Gateway Protocol (EGP) neighbor has gone down.
enterpriseSpecific (6)        Indicates that the trap is enterprise-specific. SNMP vendors and users define their own traps under the private-enterprise branch of the SMI object tree. To process this trap properly, the NMS has to decode the specific trap number that is part of the SNMP message.
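The way a trap receiver might name an incoming trap can be sketched as follows. This Python fragment is illustrative only: the names GenericTrap and identify_trap are hypothetical, the enterprise-specific case simply follows the enterprise-id.specific-trap-number form described earlier, and the specific trap number 1 in the example is arbitrary. Cisco's enterprise ID is written numerically as 1.3.6.1.4.1.9.

```python
from enum import IntEnum


class GenericTrap(IntEnum):
    """The seven generic trap numbers from Table 2-8."""
    coldStart = 0
    warmStart = 1
    linkDown = 2
    linkUp = 3
    authenticationFailure = 4
    egpNeighborLoss = 5
    enterpriseSpecific = 6


def identify_trap(generic: int, enterprise_oid: str = "", specific: int = 0) -> str:
    """Name a trap from its generic number, using the enterprise details for generic trap 6."""
    trap = GenericTrap(generic)
    if trap is GenericTrap.enterpriseSpecific:
        # Per the text: enterprise-id.specific-trap-number
        return f"{enterprise_oid}.{specific}"
    return trap.name


print(identify_trap(2))
print(identify_trap(6, "1.3.6.1.4.1.9", 1))
```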

 

In Chapter 1, "What Is SNMP?", we mentioned that RFC 1697 is the RDBMS MIB. One of the traps defined by this MIB is rdbmsOutOfSpace:

rdbmsOutOfSpace TRAP-TYPE
    ENTERPRISE  rdbmsTraps
    VARIABLES   { rdbmsSrvInfoDiskOutOfSpaces }
    DESCRIPTION
        "An rdbmsOutOfSpace trap signifies that one of the database
         servers managed by this agent has been unable to allocate
         space for one of the databases managed by this agent. Care
         should be taken to avoid flooding the network with these traps."
    ::= 2
The enterprise is rdbmsTraps and the specific trap number is 2. This trap has one variable binding, rdbmsSrvInfoDiskOutOfSpaces. If we look elsewhere in the MIB, we will find that this variable is defined under rdbmsSrvInfoEntry, which makes it a column in a table rather than a scalar object. Its definition is:

 

rdbmsSrvInfoDiskOutOfSpaces OBJECT-TYPE
    SYNTAX  Counter
    ACCESS  read-only
    STATUS  mandatory
    DESCRIPTION
        "The total number of times the server has been unable to obtain
         disk space that it wanted, since server startup. This would be
         inspected by an agent on receipt of an rdbmsOutOfSpace trap."
    ::= { rdbmsSrvInfoEntry  9 }
The DESCRIPTION for this object indicates why the note about taking care to avoid flooding the network (in the DESCRIPTION text for the TRAP-TYPE) is so important. Every time the RDBMS is unable to allocate space for the database, the agent will send a trap. A busy (and full) database could end up sending this trap thousands of times a day.
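The RDBMS MIB does not say how an agent should avoid flooding the network. One possible approach, shown here purely as an assumption, is to suppress repeats of the same trap for a minimum interval. The class name TrapThrottle, the 60-second interval, and the injectable clock are all hypothetical choices made for this sketch.

```python
import time


class TrapThrottle:
    """Suppress repeats of the same trap within min_interval seconds (hypothetical policy)."""

    def __init__(self, min_interval: float, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self.last_sent = {}   # trap name -> time the trap was last actually sent
        self.suppressed = {}  # trap name -> number of traps held back

    def should_send(self, trap_name: str) -> bool:
        now = self.clock()
        last = self.last_sent.get(trap_name)
        if last is not None and now - last < self.min_interval:
            self.suppressed[trap_name] = self.suppressed.get(trap_name, 0) + 1
            return False
        self.last_sent[trap_name] = now
        return True


# Drive the throttle with a fake clock so the example is deterministic.
fake_time = [0.0]
throttle = TrapThrottle(60.0, clock=lambda: fake_time[0])
decisions = []
for t in (0, 10, 59, 60, 61):
    fake_time[0] = t
    decisions.append(throttle.should_send("rdbmsOutOfSpace"))
print(decisions)
print(throttle.suppressed)
```

Because rdbmsSrvInfoDiskOutOfSpaces is a counter, an NMS that receives even one trap can still query the object to learn how many allocation failures occurred in the meantime.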

 

Some commercial RDBMS vendors, such as Oracle, provide an SNMP agent with their database engines. Agents such as these typically have functionality above and beyond that found in the RDBMS MIB.