by Greg Todd
One key aspect of effectively managing an Exchange server is keeping a watchful eye on the server's behavior. This chapter provides some suggestions on how to monitor Exchange Server so that you can prevent problems by catching them before they happen.
The idea is simple: if you can keep the server healthy, you can keep your users up and running. That makes everyone happy.
The main focus of this chapter is introducing you to useful tools for preventive maintenance. The following are the main tools that help you with this task:
First, you examine one of the most useful tools available to anyone monitoring an Exchange Server: Windows NT Performance Monitor (PerfMon). This tool is quite powerful, and used in the correct way, it can provide a wealth of information about the health of the server.
After you look at Performance Monitor, checking out the auto-installed PerfMon workspaces included with Exchange makes sense. These "canned" views of data are great for monitoring an Exchange server.
Then you examine some useful counters that will help you monitor the most critical aspects of an Exchange server. I include both Exchange-specific counters and generic Windows NT counters.
Next, you wrap up the chapter by looking at a few areas of monitoring and maintenance provided by the Exchange Administrator program: Server Monitor, Link Monitor, and automatic IS Maintenance. They provide additional capability for ensuring that connections and services are up and running, for keeping system clocks among Exchange servers in sync, and for keeping the IS databases clean and defragmented.
Furthermore, if you're interested in monitoring the following items in the Exchange Server system, this chapter is for you:
So get comfortable, and get ready to jump into the stimulating world of the finer points of monitoring an Exchange Server.
You can monitor many other items in an Exchange system, such as full mailboxes, messages that are too large, public folder sizes, diagnostics logging of the public and private stores, Directory Service, MTA, and so on. These topics are not covered in this chapter, but they are covered in Chapter 20, "Maintaining Exchange," and in Chapter 22, "Diagnosing the Cause of a Problem." Instead, in this chapter I want to focus on items that you can monitor by using Performance Monitor, Link Monitor, and Server Monitor. In addition, you might want to refer to the volume in the Microsoft Windows NT Resource Kit entitled, "Optimization Guide." This volume is an excellent reference on how to monitor a Windows NT system using PerfMon.
First, I want to spend some time going over the Windows NT Performance Monitor (PerfMon) application because it is an extremely valuable tool. By no means will this be a comprehensive look at PerfMon, but I do want to cover some main points. Generally, you should get some sense for how to use PerfMon and what you can do with it. For example, PerfMon gives you the ability to monitor how many messages your Exchange server is handling at any given time. You can also see how many users are logged on to the Exchange Information Store. Or what the database buffer cache hit rate is. Or how much paging activity there is in the system. All these, and more, are important data points to know when managing an Exchange Server computer.
Think of PerfMon as a wide-angle lens looking at a very large picture, the very large picture being all the performance data available in Windows NT. You can bring many things into focus with the lens, and you can look at a large number of things. Some things you can even get a fairly close look at. If, however, you really need to zoom in on one specific detail, you probably need to switch to a macro zoom lens. In that case, PerfMon might not be the right tool to use.
So although PerfMon isn't everything to everyone, chances are it will be enough to at least point you in the right direction for what to do next. And for most monitoring tasks, it will be all you need. I know it works that way for me.
The first thing to understand about PerfMon is that it relays data to you in an orderly fashion. Because you can monitor so much information in a Windows NT system, it must have some structure; otherwise, you could never get through it all.
The highest-level item on which Windows NT reports performance is called an object, and each object is divided into counters. The counters actually convey the performance of a particular part of an object.
In fact, in its architecture, Windows NT is composed of many objects that represent system resources such as processes, physical devices, and memory. So, when you use PerfMon to monitor performance in a Windows NT system, you are monitoring the behavior of its objects. Figure 21.1 shows a PerfMon screen displaying counters from the Processor, System, Thread, and Memory objects.
Figure 21.1. Monitoring multiple counters and multiple objects is simple using PerfMon.
Table 21.1 shows some of the objects you find in Windows NT.
Object | Contains Counters For
Browser | NT network browser information
Cache | NT system cache information
Logical Disk | All logical disks in the system
Memory | System memory information
Objects | The quantity of high-level system objects such as processes, threads, semaphores, and so on
Paging File | NT page file(s)
Physical Disk | All physical disks in the system
Process | Active processes in the system; similar to Thread object
Processor | System processor(s)
Server | Miscellaneous server-related information about the system
System | Miscellaneous system-wide counters
Thread | All active threads in the system; similar to Process object
Your system probably has the objects listed in Table 21.1, plus some others, depending on what software components have been installed. If you install TCP/IP, for example, TCP, IP, and UDP objects are available to monitor.
Also, when Exchange Server is installed, additional Exchange-specific objects are available as well. You learn more details about these objects in the section "Exchange-Specific Performance Monitor Counters," later in this chapter.
You might think that the features I've already covered are enough for anyone to wield for monitoring an Exchange server. Well, maybe so, but there's more to PerfMon. In addition to being able to view the data graphically in real time with the Chart view, you can employ the three other views to help you sort out your data. The following figures show examples of each of the four views: Chart view, Alert view, Log view, and Report view.
The Chart view, shown in Figure 21.2, is the view on which I have focused the majority of discussion so far. It is also probably the one you will spend the most time using when you are monitoring a server. Also, it's the one I tend to most often associate with PerfMon because of the graphical screen it provides.
Figure 21.2. In the Chart view, Performance Monitor graphically conveys performance counter data.
Chart view gives you a graphical real-time look at various performance counters in the system. Or you can use it for a graphical look at logged data saved previously. In Figure 21.2 you can easily see there were two spikes in the number of bytes/second on drive E: on the server \\EXCHANGE.
You set the options, such as periodic update, grid lines, legend, and so on with the Chart Options dialog box. From the PerfMon menu, choose Options | Chart to open this dialog box. Most of the options are self-explanatory.
You add counters to the view by clicking the + (plus) button on the toolbar or by choosing Edit | Add to Chart. The Add to Chart dialog box then appears, as shown in Figure 21.3. Pressing the Tab key does the same thing, and I usually use it because it's faster.
Figure 21.3. Any available objects and counters for \\EXCHANGE are selected in the Add to Chart dialog box.
In the Counter list box, you can see all the available counters of the Processor object. By clicking the Object drop-down list box, you can see all the available objects in the system.
If you don't know what a counter represents, you can click the Explain button, which is a handy aid, in the Add to Chart dialog box. You get a brief description of the counter you've highlighted. Using this feature can be helpful on some of the more obscure counters. You might not be exactly sure, for example, what the Cache:Async MDL Reads/sec counter is. The Explain button might cast some light on the subject.
After you add a counter in the Add to Chart dialog box and click Done (the Cancel button changes to Done after you add a counter), the counter appears in the legend at the bottom of the Chart view in a chart line. The chart line gives a summary of the system resource being monitored.
If you press the Backspace key (or Ctrl+H), the line on the graph associated with the highlighted chart line changes from colored to bold white so that you can see it better. This capability is especially useful when you're working with crowded Chart views while scrolling through counters on the chart line.
As you might have noticed in Figure 21.2, several columns of characteristics about each chart line are displayed in the legend. They are specific to the Chart view. Take a closer look at them now.
Just above the chart line in the Chart view is the value bar. The value bar contains the relevant values for the highlighted chart line. It shows fields for the Last, Average, Min, Max, and Graph Time values of the highlighted chart line. If you don't see the value bar, make sure that the Legend and Value Bar check boxes are checked in the Chart Options dialog box. The numbers shown in these fields are the precise values for the counters selected.
Sometimes, reading counter data from the value bar is easier than reading it from the graph itself. The graph is great for spotting trends. The value bar is best for seeing instantaneous values.
You remove counters from the view by clicking the X button on the toolbar, or by choosing Edit | Del from Chart. I usually just highlight the chart line to be deleted and press the Delete key, which is much faster.
At the bottom of the screen is the status bar. Like most status bars, this one tells what is going on in the program. In Figure 21.2, the status bar indicates that data is being graphed from the current activity in the system; that is, the data is real time. If you were viewing captured data, for example, the status bar would show the log file being used as the source (you learn more about capturing data later in this chapter in the "Logging Data" section). If the status bar is not present, you can toggle it on by choosing Options | Status Bar or pressing Ctrl+S.
Figure 21.4 shows an example of an Alert view screen. Note that it looks generally different from the Chart view, but a few things are similar. The toolbar and the status bar are the same. Alert view, however, displays data in a very different way.
Figure 21.4. In the Alert view, Performance Monitor monitors thresholds of various counters.
The Alert Legend contains the same columns as the Chart View Legend, except that Scale is replaced by Value and the color is represented as a dot rather than a line. Value shows the threshold being monitored and at what point an alert will trigger.
The Alert view is useful if you want to know if a counter exceeds or falls below a certain threshold. This way, you don't see all the individual data points, but you are notified if a counter gets out of line.
In Figure 21.4, for example, three alerts are configured. The first of the configured alerts will occur if the Processor:% Processor Time counter exceeds 80 percent. You can see that PerfMon has posted several alerts; the date, time, counter value, threshold, and counter information are all displayed on one line for each alert.
The second of the configured alerts will occur if the MSExchangeIS:Active User Count exceeds 300. This has not happened, but this alert is good to have around so that you can see when the Exchange Server is getting loaded down with users.
The third of the configured alerts will occur if the Memory:Available Bytes counter falls below 15 million (~15MB). Several of these alerts have occurred, and in a couple of cases, the value fell close to 10MB. If the system runs short on memory, performance will degrade, so this is another example of a good use of Alert view.
You configure these thresholds in the Add to Alert dialog box. As in the Chart view, you can choose Edit | Add to Alert, press Tab, or click the + (plus) button in the toolbar to display it, as shown in Figure 21.5.
Figure 21.5. Using the Add to Alert dialog box, you can add alerts with thresholds to monitor.
This dialog box looks similar to the Add to Chart dialog box explained previously, except at the bottom of the dialog box, you can set the threshold type (Over or Under), and you can enter a program to run when an alert occurs. When you are finished, press Esc or click the Done button (the Cancel button changes to Done after a counter is added).
One other noteworthy item is the Alert Interval just below the toolbar in the Alert view. The Alert Interval is how often PerfMon samples the counters to see if there is a threshold violation. It defaults to 5 seconds, but you can set it to other values. You can also configure other alert options in the Alert Options dialog box, as shown in Figure 21.6. To get there, choose Options | Alert from the PerfMon menu, or just press Ctrl+O.
Figure 21.6. You configure Alert View options in the Alert Options dialog box.
The Alert view is useful, for example, if you want to have the system notify you when a threshold violation occurs. If you enable the Network Alert section in the Alert Options dialog box, PerfMon sends a network message when a threshold is violated. You can also have PerfMon switch to the Alert view or log an event in the NT Event Log when a threshold violation occurs.
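To make the Over and Under thresholds concrete, here is a minimal Python sketch of the kind of check Alert view performs on each sample. It is only an illustration, not how PerfMon itself is implemented: the names `AlertRule`, `violated`, and `check_sample` are hypothetical, the three rules mirror the ones described for Figure 21.4, and the sample values are invented.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    counter: str      # Object:Counter name, as written in this chapter
    condition: str    # "over" or "under", mirroring the Over/Under choice
    threshold: float


def violated(rule, value):
    """Return True when value crosses the rule's threshold."""
    if rule.condition == "over":
        return value > rule.threshold
    if rule.condition == "under":
        return value < rule.threshold
    raise ValueError(f"unknown condition: {rule.condition}")


def check_sample(rules, sample):
    """Return the rules violated by one sample (a dict of counter -> value)."""
    return [r for r in rules
            if r.counter in sample and violated(r, sample[r.counter])]


# The three alerts described for Figure 21.4.
RULES = [
    AlertRule("Processor:% Processor Time", "over", 80.0),
    AlertRule("MSExchangeIS:Active User Count", "over", 300.0),
    AlertRule("Memory:Available Bytes", "under", 15_000_000.0),
]

# Hypothetical sample values, invented for illustration only.
sample = {
    "Processor:% Processor Time": 91.5,
    "MSExchangeIS:Active User Count": 120.0,
    "Memory:Available Bytes": 10_400_000.0,
}

for rule in check_sample(RULES, sample):
    print(f"ALERT {rule.counter}: {sample[rule.counter]} "
          f"is {rule.condition} {rule.threshold}")
```

With these invented values, the processor and memory rules fire and the user-count rule does not. In a real setup the same check would simply repeat once per Alert Interval.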
You use the Log view strictly for logging data to a disk file, not for displaying data in any way. It is quite different from the other views; the only similarities are the menus, the toolbar, and the status bar. Figure 21.7 illustrates an example of the Log view. (You learn more details relevant to capturing data later in this chapter in the section "Logging Data.")
Figure 21.7. In the Log view, Performance Monitor logs data to disk that records the activity of system objects and their counters.
This view is pretty unexciting to look at, but in Figure 21.7 data is being collected once every 60 seconds for each of the objects that belong to the computer \\EXCHANGE, the Exchange Server system.
Note that all counters for these objects are logged; you do not select specific counters to be logged, only objects. At first, this logging might seem to be overkill, but it actually proves quite handy. This way, you don't have to worry about not having data for that one counter you didn't log.
You add objects to the list using the Tab or + (plus) button, as in the other views. After you add them, however, the only thing that shows is the object name and the computer that owns the object. If you're wondering whether you can log data from multiple computers here in one centralized place, you can indeed.
The Log File text box at the top of the Log view shows which file is holding the logged data. You cannot modify it directly. Instead, you must use the Log Options dialog box. Access it by choosing Options | Log. Here, you see a common file dialog box for choosing your destination log file along with a way to specify the update interval in seconds. After you select a log file, the Start Log button becomes available. Click it, and PerfMon starts logging data. In Figure 21.7, I am logging to EXCHLOG.LOG. Notice the size of the log file is also displayed in the File Size text box.
The Status text box shows Collecting, which means data is being collected into the log file. If you look at the status bar, you also see a little cylinder and a number next to it. This serves as a visual cue (which shows in all views) indicating that PerfMon is logging data and the current size of the log file.
If you stop data collection (go back into the Log Options dialog box and click Stop Log), the Status text box changes to Closed, and the cylinder disappears from the status bar.
The Log Interval text box reflects the data collection time interval, in seconds, that you set in the Log Options dialog box.
Any time you need to log data on a server, you can visit this view. It is great for logging a typical day in the life of an Exchange server. After the data is captured, you can easily view and analyze it to see whether any server problems are lurking. You should play around with the Log view; learning how to log data and manipulate logged data will make PerfMon that much more of a valuable tool.
You use the Report view, which is complementary to the Chart view, to view tabular data in real time. You can use it if you want a quick summary of counter values in tabular text format. You can view data as it is happening in real time, or you can use logged data.
When you first open a Report view, you see only a blank screen: no dialog boxes, no legend, nothing. Only the menu, the toolbar, and the status bar are the same as the other views. As you add counters, they appear on the screen. Figure 21.8 shows an example.
Figure 21.8. In the Report view, Performance Monitor conveys a tabular summary of performance counter data.
The counters shown in Figure 21.8 are the same ones shown in the Chart view in Figure 21.2. These figures should give you a feel for the difference in the way the data is presented. Basically, the numbers here correspond to the numbers in the Last field of the Chart view's value bar.
Sometimes having a quick look at some data is nice, especially if a spreadsheet is not handy. Report view serves that purpose. Usually, for more formal data analysis, I use PerfMon's export feature to export my data to a spreadsheet so that I can get more detailed results.
Fortunately, as mentioned previously, PerfMon can save performance data to a disk file as it is reading it. This feature is incredibly useful, and it is quite valuable for analyzing the health of an Exchange server.
To capture data, you go to the Log view and add objects to the list for which you want to gather performance data. Then you open the Log Options dialog box and tell PerfMon which file to log the data in and what collection interval to use. Click Start Log, and you're off to the races. The process is that simple. As I mentioned before, all counters for the selected object(s) are captured.
You can start capturing data on an Exchange server during a typical day, walk away, and come back the next day. The data is captured for you, and you can export it into a CSV (comma-separated values) or a TSV (tab-separated values) file for easy importing into your favorite spreadsheet program. From there, you can analyze the data at your convenience to see how the server is holding up under the load.
You've logged some data. So now what? Well, a couple things come to mind.
Say you want to view a graph of the paging activity during the day when users are on the server. Presuming that you have logged the Memory object, viewing this activity will be easy. Yes, I did say the Memory object; the paging counters are not contained in the Paging File object.
First, switch PerfMon from showing current real-time data to showing logged data. To do so, choose Options | Data From to get the Data From dialog box. Click the Log File radio button, and either type in the log filename or click the ... (ellipsis) button to find it.
After you select the log file, the path and filename appear in the status bar at the bottom of the screen. Make sure that you are in Chart view; then add the Memory:Pages/sec counter to the view.
At this point, all views are now using this same log file. In other words, none of your views are showing you real-time data anymore. The fact that all views are tied to the same log file is a little non-intuitive at first, but after you use PerfMon awhile, it makes sense.
A couple things are different now. First, if you capture data only for the Memory object, Memory will be the only object available in the Add to Chart dialog box. From there, you can add any counter from the Memory object to the display, and it will appear like it does when you're viewing real-time data.
Second, the graph might look funny, maybe compressed, if you captured a lot of data. The default for a screen full of data in the Chart view is 100 samples. Therefore, if you sample your data once per second, you can get 100 seconds' worth on the graph. But if your test is more than 100 seconds, you will still see all the data; it will just be crammed onto the screen in the graph.
Of course, you can get around viewing this compressed data. Just specify the time interval you want to view. Choose Edit | Time Window, and the Input Log File Timeframe dialog box appears. In this dialog box, you can zoom in to specific regions of the data.
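The arithmetic behind that 100-sample screen is worth a quick sketch. The Python below assumes only the 100-sample default stated above; the function names are hypothetical, and the "compression factor" is just a rough estimate of how many logged samples end up sharing one position on the chart, not a description of how PerfMon actually draws a long log.

```python
import math

SAMPLES_PER_SCREEN = 100  # default Chart view width stated in the text


def screen_span_seconds(interval_seconds, samples=SAMPLES_PER_SCREEN):
    """Seconds of data that fit on one uncompressed chart screen."""
    return interval_seconds * samples


def compression_factor(total_samples, samples=SAMPLES_PER_SCREEN):
    """Roughly how many logged samples share one chart position."""
    return max(1, math.ceil(total_samples / samples))


print(screen_span_seconds(1))        # 100 seconds at a 1-second interval
print(screen_span_seconds(60) / 60)  # 100.0 minutes at a 60-second interval
print(compression_factor(24 * 60))   # 15 for a full day logged once a minute
```

So a full day logged once a minute squeezes about 15 samples into each position, which is a good sign that you should narrow the time window before reading anything off the graph.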
You now need to export this data so that you can import it into your spreadsheet program and get some real work done. Choose File | Export Chart, and in the File Name box type the name of a file to export into. The format can be either CSV or TSV; you choose.
After you click OK, you have a file ready for importing into your spreadsheet program, complete with headings and everything. Any counter you have on your Chart view screen is exported into the file. So if you have only the Memory:Pages/sec counter in the view, that's all you get in the export file. This feature is one of the keys to using PerfMon.
You also can export data from the other views. When you export data from the Alert view, you get the alert data shown on the screen. When you export data from the Log view, you get a summary of the objects being logged. When you export data from the Report view, you get a file containing the summary data shown in the view.
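If a spreadsheet is not handy, a short script can summarize an exported chart just as well. The Python sketch below makes a loud assumption about the file layout: one heading row with Date, Time, and one column per counter. A real export may carry extra heading lines, which this sketch assumes you have already trimmed. The function name `summarize` and the sample rows are invented for illustration.

```python
import csv
import io
import statistics


def summarize(csv_text, column):
    """Return (min, mean, max) for one numeric column of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in reader if row[column].strip()]
    return min(values), statistics.mean(values), max(values)


# Invented sample data in the assumed layout (not real PerfMon output).
SAMPLE = """Date,Time,Pages/sec
1/15/97,9:00:00,2.0
1/15/97,9:01:00,14.0
1/15/97,9:02:00,5.0
"""

low, avg, high = summarize(SAMPLE, "Pages/sec")
print(f"Pages/sec: min={low} mean={avg} max={high}")
```

To use it on a real file, you would read the file's text and pass it in place of `SAMPLE`, using whatever column heading the export actually contains.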
Up to this point, I have assumed that PerfMon is capturing data on the same computer on which it's running. This doesn't have to be the case. In fact, it is usually better not to run PerfMon on the computer whose data you are capturing.
Sometimes you might want to view or capture data from one or more Exchange servers on the network. Perhaps you are running an Exchange server that supports many users, and you don't want to incur the overhead of running PerfMon on that machine. Or perhaps you are monitoring a machine that is not nearby. Or maybe you just want to see how busy your neighbor's machine is because you think he's goofing off again. Accomplishing any of these tasks is easy with the remote monitoring feature of Performance Monitor.
To use the remote feature, there is (fortunately) no change to the user interface at all. Neither do you have to install or use another piece of software. You simply select the computer name when you are adding objects to be monitored.
Say, for example, you're running PerfMon on your desktop computer, and you want to use the Chart view to monitor objects on an Exchange server. In the Add to Chart dialog box, you can either type the name directly in the Computer text box, or you can click the ... (ellipsis) button at the end of the text box. A list of the computers available on the network appears. Figure 21.9 shows a Chart view of data being collected from two different computers, \\EXCHANGE and \\CALADAN.
Figure 21.9. With Performance Monitor, you can easily view counters on multiple computers at once in real time.
The interface is quite seamless. It looks exactly the same as monitoring information on a local computer. As expected, the computer name appears in the Computer column, and the other information appears in the other columns.
In a stroke of good luck (or maybe good design), the same principle applies to the remaining three views as well. In the Alert view, you can view alerts from other computers. In the Log view, you can log data from other computers. And in the Report view, you can get tabular summary reports on data from other computers.
One other important feature of PerfMon is the capability to save your Performance Monitor views.
Say you've been working on getting that perfect Chart view configured so that you can monitor performance on your Exchange servers. You probably don't want to reconfigure the view each time you start PerfMon, right? Right. You can save your Chart view settings into a .PMC file for later use. Choose File | Save Chart Settings, and type a name in the File Name box.
You can choose from five different file formats, one for each of the four view types and one for the global PerfMon workspace settings.
Chart view | .PMC
Alert view | .PMA
Log view | .PML
Report view | .PMR
Workspace settings | .PMW
I use .PMC the most, but of course the others are handy as well. I recommend creating a few different .PMC files that contain related information so that you can have them at your disposal.
You might, for example, create one called CPU.PMC, which contains some processor-related counters. Or you might create EXCHSRVR.PMC, which contains processor, disk, and network counters specifically selected for monitoring your Exchange server. Then you could simply load this view, and you would instantly be looking at the performance of your Exchange server.
Some .PMW files get installed when you complete the setup of Exchange Server; they appear in your Exchange Program Group automatically. They use the same principle explained here. You learn more about these files later in the chapter, in the section "Auto-Installed Performance Monitor Workspaces."
Whew! You've learned a lot about PerfMon, but there's plenty more where that came from. I encourage you to play around with it on your own; that's the best way to learn it. I want to leave you two final bits of information, however, before you move on. I saved this information until the end of this topic because I didn't want it to get lost in the shuffle.
Performance Monitor has overhead. Yes, that's right, all this great performance information isn't for free. The impact is relatively low for such a powerful tool, but the idea is to disturb the server being monitored as little as possible. Here are a few points to think about when monitoring an Exchange server with PerfMon. Keep in mind that all these points have trade-offs, but hopefully they will get you started.
Most counters in the system begin working automatically from the time they are installed. The one major exception is any disk-related object, namely Logical Disk and Physical Disk. These counters all show zeros unless you activate disk performance measurement. You do so by entering the following line at an NT command prompt:
diskperf -y
This command activates the disk counters on the local NT computer. You can also activate disk counters remotely by entering the following:
diskperf -y \\anycomputer
This command activates the disk counters on the computer named anycomputer, provided it can be reached on the network. Full syntax of the command is available if you enter the following:
diskperf -?
Some overhead (albeit pretty minimal) is associated with these counters. I suggest turning them off after you finish gathering performance data on your server to restore peak disk performance. You deactivate them by entering the following line at an NT command prompt:
diskperf -n
This overhead is especially apparent in less powerful systems, such as 80486-based models with slower disk subsystems. With most newer, more powerful machines, however, the overhead has very little discernible effect. Of course, you could always use PerfMon to measure it.
You must reboot the system for changes made with the diskperf command to take effect.
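If you look after several servers, you might generate the diskperf command lines rather than type each one. The Python sketch below only builds the argument lists and prints them; it does not run anything. It uses just the -y and -n switches and the \\computer form shown above, and the helper name `diskperf_command` is hypothetical.

```python
def diskperf_command(enable, computer=None):
    """Build the diskperf argument list; nothing is executed here."""
    args = ["diskperf", "-y" if enable else "-n"]
    if computer:
        # Accept the name with or without leading backslashes.
        args.append("\\\\" + computer.lstrip("\\"))
    return args


# Server names taken from the figures in this chapter.
for server in ["EXCHANGE", "CALADAN"]:
    print(" ".join(diskperf_command(True, server)))
```

Each printed line can be pasted at an NT command prompt. Remember that the reboot requirement still applies to every machine you change.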
Following your enlightening run through PerfMon, you can put your newfound knowledge to work.
After you install Microsoft Exchange Server, you should be able to find a common group named Microsoft Exchange that contains several icons, as shown in Figure 21.10. You first learned about these icons in Chapter 9, "Installing Microsoft Exchange Server," but I want to revisit them in more detail.
Figure 21.10. Microsoft Exchange Server Setup automatically installs several Performance Monitor icons in a common Program Manager group.
In Figure 21.10, you can see the five icons that represent PerfMon workspaces.
Earlier in the chapter, in the "Saving Your Work" section, you learned about the five different types of PerfMon files. Now you will see one of them at work already: the Workspace, or .PMW, file. The thoughtful engineers at Microsoft decided to provide something to help you start monitoring an Exchange server. (I'm glad they did, because it opens up a whole realm of ideas about how to monitor your server.)
The workspaces are ready-made views of various Exchange Server performance counters to help you start monitoring performance of the server. The .PMW files are located in the BIN subdirectory in which you installed Exchange Server, for example, C:\EXCHSRVR\BIN.
These workspaces are not designed to work remotely; that is, they must be run using PerfMon on the same machine as your Exchange server. You might not want to do this based on what you learned previously in this chapter in the section "Performance Monitor Has Overhead." If you want to monitor the counters remotely, you have to create your own workspaces with the counters you want pointing to the Exchange server you want.
The Health chart, shown in Figure 21.11, represents a summary of your Exchange Server's state of being. It is mainly a CPU utilization chart that shows counters for overall system CPU utilization and for how the main processes within Exchange Server are using the CPU. Paging is also included.
Figure 21.11. The Microsoft Exchange Server Health chart monitors various CPU utilization counters.
The refresh time is one second, so you use this chart to get a detailed look at your server's status. It would be a great workspace to add other relevant counters to, such as Memory:Available Bytes or System:Processor Queue Length.
If you use the Processor Queue Length counter in the System object, don't forget that you also have to monitor a counter in the Thread object to activate it. Any thread counter will do; just pick one.
The following is a summary of the counters:
The Load chart, shown in Figure 21.12, is slightly different from Health. Although the load on your server ultimately can affect its health, this workspace gives insight into how much traffic your server is managing.
Figure 21.12. The Microsoft Exchange Server Load chart monitors traffic on your server.
This chart only scratches the surface of showing how much load is on your server; you also can look at many more counters, such as Logical Disk, CPU, Network Interface, and so on. You can use this chart as a quick view into generic server load. The update interval is a bit longer (10 seconds), so you get a bigger picture of how the server's load is going over the last 5 to 10 minutes.
When you open these Performance Monitor workspaces, they have no menu bar. Double-click in the graph area (or just press Enter) to make it appear. You can resize the window, check chart options, or whatever to see what's being depicted on the chart.
The following is a summary of the counters:
The History chart, shown in Figure 21.13, is designed to give you a high-level look at what your server has been doing over the past 100 minutes. This chart mainly shows message activity in the public and private stores along with user count, MTA work queue, and amount of paging.
Figure 21.13. The Microsoft Exchange Server History chart monitors traffic on your server.
This graph is a good one to bring up and let sit so that you can glance at it periodically. Watching the Work Queue Length of the MTA is very useful if you have multiple Exchange servers sending mail to each other. Also, the counters are excellent ones to monitor on a more frequent basis if you need finer granularity than 60-second intervals. I regularly watch User Count, MTA Work Queue Length, and Pages/sec.
All the PerfMon workspaces covered in this section use the Chart view.
The following is a summary of the counters:
The Users chart, shown at the left in Figure 21.14, is a simple PerfMon chart that shows only the number of machines connected to Exchange Server. This workspace takes on a slightly different form than the others in that it appears as a bar graph.
Figure 21.14. The Microsoft Exchange Server Users and Queues charts monitor the number of users and the critical work queues on your server.
Having this graph open in the corner of the screen is useful for giving you a quick visual indicator of how many users are on the server at any given time. The refresh interval is 10 seconds; you might want it shorter.
The Users and Queues charts have the Histogram option enabled rather than the Graph option. The data therefore displays as bars rather than as graph lines. You set the Histogram option in the Chart Options dialog box.
The only counter in this chart is MSExchangeIS:User Count, which was explained in the earlier section, "Microsoft Exchange Server History."
The Queues chart, shown at the right in Figure 21.14, shows five major queues in Microsoft Exchange Server. As with the Users chart, the data is depicted as a bar graph.
These counters are critical to system performance, and if any of them start to become backlogged, you will notice a delay in delivery of messages accompanied by a drop in user response time. The slowdown also probably indicates a bottleneck somewhere in the system.
This graph is a good one to open periodically to get an instant check on the status of the system queues. The window doesn't show what queue each bar represents, but if you enlarge the window, you can see the legend at the bottom. If any of these queues starts growing steadily, or stays above the 10 to 50 range without coming back down, you should start investigating why.
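If you log queue lengths somewhere other than the chart, the "growing steadily or staying high" test can be sketched in a few lines of Python. This is only an illustration: the function name, the ceiling of 50, and the five-sample window are my assumptions, not Exchange settings, and the samples are whatever queue readings you have collected yourself.

```python
from typing import Sequence


def queue_needs_attention(samples: Sequence[int], ceiling: int = 50, window: int = 5) -> bool:
    """Flag a queue whose last `window` samples all sit above `ceiling`
    or rise strictly from one sample to the next (both values are assumptions)."""
    recent = list(samples)[-window:]
    if len(recent) < window:
        return False  # not enough history to call it a trend
    stuck_high = all(value > ceiling for value in recent)
    growing = all(later > earlier for earlier, later in zip(recent, recent[1:]))
    return stuck_high or growing
```

A queue that spikes and drains again is left alone; only a sustained backlog or an unbroken climb is flagged, which matches the kind of behavior worth investigating.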
The following is a summary of the counters, corresponding to the bars from left to right:
Three other workspace files are located in the \EXCHSRVR\BIN directory in case you are using the Internet Mail Connector (IMC). These three basically show the flow of messages through the IMC and associated counters. I'm not going to go through them here because they are specific to the IMC. The principles of their use, however, are identical to the ones you just examined.
If you have not installed the Exchange IMC, the MSExchangeIMC PerfMon object is not installed; therefore, these counters will not even function.
Some applications have their own performance objects. They can make your life much easier when you're tracking performance or system behavior. Table 21.2 lists the Exchange-specific objects that are automatically installed in Performance Monitor during setup. These are all useful in one way or another, and they provide a wealth of information on the behavior of your Exchange Server computer. I encourage you to explore them on your own and get familiar with them.
Object Name | Description
MSExchangeDB | Object that contains counters pertaining to the Exchange Server database engine (DB)
MSExchangeDS | Object that contains counters pertaining to the Exchange Directory Service (DS)
MSExchangeIS | Object that contains general counters for the Exchange Server Information Store (IS)
MSExchangeISPrivate | Object that contains counters pertaining to the Exchange Private Information Store
MSExchangeISPublic | Object that contains counters pertaining to the Exchange Public Information Store
MSExchangeMTA | Object that contains counters pertaining to the Exchange Mail Transfer Agent (MTA)
MSExchangeMTA Connections | Connections object that contains counters pertaining to connections to the MTA
If you installed the IMC, the MSExchangeIMC PerfMon object is installed. If you installed the MS Mail Connector, the MSExchangeMSMI and MSExchangePCMTA objects are installed.
Dozens of Exchange Server counters are included in these seven objects. Table 21.3 contains some useful counters to get you going.
Object Name | Counter | Description
MSExchangeDB | % Buffer Available | (Instance=Information Store) The percentage of the database buffer cache that is available for use. This counter and the following one help monitor how effective your database buffer cache is.
MSExchangeDB | % Buffer Cache Hit | (Instance=Information Store) The percentage of requests for store data that were satisfied from the database buffer cache.
MSExchangeIS | User Count | The number of users connected to the store.
MSExchangeISPrivate | Messages Submitted/min | The rate messages are being submitted by clients. If the rate is consistently higher than Messages Delivered/min, the server might not be able to keep up with the delivery load.
MSExchangeISPrivate | Messages Delivered/min | The rate messages are delivered to all recipients. If the rate is consistently lower than Messages Submitted/min, the server might not be able to keep up with delivery load.
MSExchangeISPrivate | Send Queue Size | Number of messages in the send queue. It is another counter that can indicate when the server is overloaded.
MSExchangeISPublic | (same counters as MSExchangeISPrivate) |
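The comparison between Messages Submitted/min and Messages Delivered/min can be sketched as follows, assuming you have exported paired samples of the two counters yourself. The function name and the 80 percent share that stands in for "consistently" are hypothetical choices of mine, not anything Exchange defines.

```python
from typing import Sequence


def delivery_falling_behind(submitted: Sequence[float], delivered: Sequence[float],
                            share: float = 0.8) -> bool:
    """True when Submitted/min exceeds Delivered/min in at least `share`
    of the paired samples (`share` is an assumed cutoff for "consistently")."""
    pairs = list(zip(submitted, delivered))
    if not pairs:
        return False  # no data, no verdict
    behind = sum(1 for s, d in pairs if s > d)
    return behind / len(pairs) >= share
```

Treat a True result as a prompt to look at Send Queue Size and the MTA queues, not as proof of a problem by itself.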
Following are some generic PerfMon objects and counters that are installed as a part of Windows NT. Table 21.4 contains a list of some of these generic objects that I have found useful for monitoring an Exchange Server.
Object Name | Description
Cache | Object that contains counters pertaining to the Windows NT System Cache
Logical Disk | Object that contains counters pertaining to the logical disk drives in the system
Memory | Object that contains counters pertaining to memory usage in the system
Paging File | Object that contains counters pertaining to the status of the page file
Processor | Object that contains counters pertaining to the system processor(s)
Server | Object that contains general counters pertaining to the server service of the system
System | Object that contains general counters pertaining to the operating system
Again, dozens of counters are included in these objects. Table 21.5 contains some useful counters to get you going.
Object Name | Counter | Description
Cache | Data Map Hits % | The percentage of successful references to the in-memory system data cache.
Logical Disk | % Disk Time | The percentage of time the disk is busy servicing I/O requests. This is basically the percentage of time during the sample period the Disk Queue Length is greater than zero.
Logical Disk | Avg. Disk sec/Transfer | The average number of seconds it takes the disk to satisfy a disk transfer (read or write).
Logical Disk | Disk Bytes/sec | The rate at which data is transferred to or from the disk during I/O operations.
Memory | Available Bytes | The amount of virtual memory in the system available for use.
Memory | Cache Bytes | Size of the NT System Cache. Note that the system cache is for both disk and LAN.
Memory | Pages/sec | The overall paging activity; that is, the rate at which pages are written to or read from the disk. If this number is high, you should increase memory or rerun the Exchange Optimizer.
Paging File | % Usage | Shows what percentage of the page file is in use. It can indicate whether you need to increase the size of your page file.
Processor | % Processor Time | Amount of time the processor is busy doing work. This is User and Privileged time combined.
Server | Bytes Total/sec | The rate at which the server is sending data to and receiving data from the network.
System | Context Switches/sec | The rate at which the system is switching from one thread to another. If this counter and System Calls/sec are excessive (greater than 10,000), it can indicate that the software is thrashing. This type of thrashing can happen in multiprocessor systems.
System | System Calls/sec | The rate at which calls to Windows NT system service routines are made. This rate is one indication of NT system overhead. If this counter and Context Switches/sec are excessive (greater than 10,000), it can indicate that the software is thrashing. This type of thrashing can happen in multiprocessor systems.
System | Processor Queue Length | The number of threads waiting in the processor queue. Values consistently above 2 can indicate processor congestion. (You must also monitor at least one thread from the Thread object for this counter to be non-zero.)
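The two rules of thumb quoted in Table 21.5 (both System rates above 10,000, and a processor queue consistently above 2) are easy to express as checks against values you have read from PerfMon. The following is a minimal sketch; the function names are mine, and reading "consistently" as "every sample in the list" is an assumption.

```python
from typing import Sequence

THRASH_LIMIT = 10_000  # per-second rate quoted for Context Switches/sec and System Calls/sec
QUEUE_LIMIT = 2        # Processor Queue Length value quoted as a sign of congestion


def thrashing_suspected(context_switches: float, system_calls: float) -> bool:
    """Both counters must be excessive at the same time, as the table describes."""
    return context_switches > THRASH_LIMIT and system_calls > THRASH_LIMIT


def processor_congested(queue_lengths: Sequence[int]) -> bool:
    """Assumption: 'consistently above 2' means every collected sample exceeds the limit."""
    return bool(queue_lengths) and all(q > QUEUE_LIMIT for q in queue_lengths)
```

Requiring both rates to be high keeps a busy but healthy server, with plenty of system calls and modest context switching, from being flagged.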
Be careful how you interpret the Logical Disk:% Disk Time counter, especially when using hardware RAID sets. Because an array can handle multiple I/Os simultaneously, having some I/Os queued and waiting for the array is not bad. This counter shows 100 percent busy in that scenario, however, when in fact the disk subsystem might not be 100 percent busy. A more reliable indicator is Disk Reads/sec and Disk Writes/sec. You should be able to calculate how many reads and writes per second your array is capable of and compare that number with what PerfMon reports.
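One rough way to do that calculation is sketched below. It rests on assumptions that are not from this chapter: each drive sustains a fixed number of random I/Os per second (a figure you must supply from your drive specifications or your own measurements), and each logical write costs a fixed number of physical I/Os depending on the RAID level. The penalty values are common rules of thumb, so treat the result as an estimate only.

```python
# Assumed physical I/Os generated per logical write; rules of thumb, not Exchange or NT values.
WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid5": 4}


def array_utilization(reads_per_sec: float, writes_per_sec: float,
                      disks: int, iops_per_disk: float, level: str) -> float:
    """Estimate the fraction of the array's I/O capacity in use.

    reads_per_sec and writes_per_sec are the Disk Reads/sec and Disk Writes/sec
    values reported by PerfMon; iops_per_disk is a figure you provide.
    """
    physical_ios = reads_per_sec + writes_per_sec * WRITE_PENALTY[level]
    capacity = disks * iops_per_disk
    return physical_ios / capacity
```

A result that stays near or above 1.0 suggests the array really is saturated; a low result alongside a % Disk Time of 100 suggests the counter is overstating how busy the array is.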
Now you're ready to move on from the PerfMon tool. Although the Performance Monitor is quite powerful, other resources available in Exchange can help you monitor how the server is doing.
Because the Exchange Server documentation covers this subject nicely in the Administrator's Guide, Chapter 16, "Monitoring Your Organization," I'm going to keep this discussion brief. I do, however, want you to know what monitors are so that you can dig into them further.
You can use two basic types of monitors: Server monitors and Link monitors.
You can find monitors in the Exchange Administrator, under the organization\site\Configuration\Monitors object.
You can use monitors to configure notifications, synchronize Exchange Servers' clocks, monitor NT services, and so on.
After you configure a monitor, it does not automatically start working; you must activate it. You do so by highlighting the monitor and then choosing Tools | Start Monitor from the Exchange Administrator. A child window within the Administrator then appears, showing the status of the monitor you just activated. Monitors can also be started automatically if you place the Administrator program in the NT Startup group and use the /m parameter with ADMIN.EXE. Refer to Appendix B, "Command Reference," for details on using parameters with ADMIN.EXE.
You can use a Server monitor to do the following:
To create a new Server monitor, highlight the Monitors object, and choose File | New Other | Server Monitor from the Exchange Administrator. A properties dialog box similar to the one shown in Figure 21.15 then appears.
Figure 21.15. The Server monitor is configured with both a directory name and a display name.
From here, configuring what you want the monitor to do is easy. On the General tab, you specify the monitor's name, log file, and polling intervals.
On the Notification tab, you specify the type of notification to occur when a server goes into a warning or alert state. The following are the three types of notification:
On the Servers tab, you specify which Exchange Servers to monitor.
On the Actions tab, you specify actions to take when a service is stopped on a monitored server. You can have the monitor do one of three things:
Finally, on the Clock tab, you specify the tolerances for the monitored clock. This setting determines how far out of sync the monitored computer's system clock can be with the monitor. You can also optionally have the monitored server's clock forced in sync with the monitor.
If you have several Exchange server computers, it is useful to keep the clocks synchronized. One reason is that if something happens anywhere on your Exchange network, you can easily track down the sequence of events knowing the clocks are all in sync between the various servers. Another reason is that if you have time-sensitive tasks to be executed on the servers, such as backup or other types of maintenance, you can be sure all the servers have the correct time on them.
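To make the idea of clock tolerances concrete, here is a minimal sketch of the comparison a monitor has to make: measure how far the monitored server's clock is from the monitor's own clock and classify the difference. The function name and the 15-second and 60-second defaults are placeholders I chose for illustration; they are not the values the Clock tab uses.

```python
from datetime import datetime, timedelta


def clock_state(monitor_time: datetime, server_time: datetime,
                warning: timedelta = timedelta(seconds=15),
                alert: timedelta = timedelta(seconds=60)) -> str:
    """Classify clock skew; a clock that runs fast or slow is treated the same way."""
    skew = abs(server_time - monitor_time)
    if skew > alert:
        return "alert"
    if skew > warning:
        return "warning"
    return "ok"
```

Taking the absolute difference matters: a server running five minutes behind confuses an event timeline just as much as one running five minutes ahead.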
You can use a Link monitor to ensure that the message transport mechanism between two servers is working properly. It does so by sending a message via the MTA to the other server. Then it checks to see whether the message made the trip successfully.
A Link monitor can monitor round-trip time for messages sent from one Exchange server to another or from an Exchange server to a foreign messaging system. More often, it is for the latter case.
If the message takes longer than expected to make the round trip (called the bounce duration), a notification occurs. The type of notification depends on how you configure notifications. Basically, they're just like Server monitor notifications.
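The bounce check boils down to timing a round trip and comparing it with limits you set. The sketch below shows that logic under my own assumptions: times are plain seconds on a common clock, the limits of 30 and 60 minutes are placeholders, and a message that has not come back yet is judged by how long it has been outstanding.

```python
from typing import Optional


def bounce_state(sent_at: float, returned_at: Optional[float], now: float,
                 warning_after: float = 1800.0, alert_after: float = 3600.0) -> str:
    """Classify one test message; all times are seconds, limits are assumed values."""
    # A message still in transit is measured against the current time.
    elapsed = (returned_at if returned_at is not None else now) - sent_at
    if elapsed > alert_after:
        return "alert"
    if elapsed > warning_after:
        return "warning"
    return "ok"
```

Measuring outstanding messages against the current time means a dead link is caught once the limit passes, even though the test message never returns.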
To create a new Link monitor, highlight the Monitors object and then choose File | New Other | Link Monitor from the Exchange Administrator. A properties dialog box similar to the one in Figure 21.16 then appears.
Figure 21.16. The Link monitor is configured with both a directory name and a display name.
As with the Server monitor, configuring what you want the monitor to do is easy.
Wouldn't it be handy if the IS could automatically perform its own maintenance and defragmentation? Turns out you're in luck this time: it can be done. Any Exchange server can be configured to perform online maintenance at specified intervals. You can schedule the maintenance to occur at whatever time of day or night you prefer.
This capability is useful because as the IS gets pounded, over time it might start to exhibit diminished response time. Running daily automatic maintenance on the IS helps prevent this situation from happening because the database is kept fresh and defragmented.
You access this feature via the Exchange Administrator program. The configuration is located on the property page for \organization\site\Configuration\Servers\servername. Figure 21.17 shows the IS Maintenance tab.
Figure 21.17. The IS Maintenance server feature is useful for automatically keeping the database store in good shape.
In Figure 21.17, the maintenance is scheduled to start sometime between 2 a.m. and 7 a.m. every day of the week. Using this schedule is a good idea for two reasons. First, the maintenance is scheduled for a slow time of the day. Second, the maintenance is done once daily, which is about how often you should perform this maintenance anyway.
Unlike EDBUTIL's defragmenter, with IS Maintenance the entire store (public and private databases) is defragmented while still online; that is, the MSExchangeIS service runs the entire time. The following are two caveats:
In this chapter, you learned about major tools and system resources to assist in monitoring Exchange Server computers, such as Windows NT Performance Monitor; auto-installed Exchange PerfMon workspaces; Exchange-specific PerfMon counters; Windows NT PerfMon counters; Microsoft Exchange Server Monitors; and Exchange Server Information Store maintenance.
These items help you monitor servers that have exceeded thresholds, that have services which have stopped running, that have failed connections, that have server clocks which are out of sync, and much more.
The topic of monitoring an Exchange server is large, as you can see, but you have covered a significant chunk of it here. I hope that this information is enough to give you some useful ideas about how to proceed with monitoring your own servers.
With this information as your foundation, you are better prepared to move into the next chapter, which covers a related topic, "Diagnosing the Cause of a Problem." There I extend some of the ideas presented in this chapter to assist you in figuring out what has happened when something goes wrong with an Exchange server system.