Ask the Wizard Questions
The Question is:
How to deal with high interrupt mode time?
Both DECamds and DECPS Performance Analyzer are great for warning
one about too much time being spent in interrupt mode, but neither
they nor most of the documentation I've seen suggest how to pinpoint
the cause. I would assume that using monitor to look at buffered IO,
SCS and DLOCK would be good places to start, but, again, the docs do
not seem to reveal what numbers are considered too high for a VAX 6410
serving HSC disks to satellites. Is there a good performance cookbook
or DSNLINK article that explains a procedure to track down the
specific cause for this? If not, perhaps this would be a good place
to put one. Thanks,
The Answer is:
The Guide to OpenVMS Performance Management is the best place to
look for help with questions like this. Its section on interpreting
MONITOR MODES data is a good place to start.
Statements like "too much time" and "too high" are highly dependent on
*your* typical application workload, and on your plans for future
growth. If your 6410 system's primary job is to serve HSC disks to
satellites, the time in interrupt mode reported by DECamds and DECPS
may be typical for your workload.
Disk serving by the MSCP server happens at high IPL. The more
disk I/O to served disks your satellites do, the more time the
server will spend on the interrupt stack. Looking at buffered I/O,
SCS, DLOCK and MSCP with MONITOR will give you a very good idea
where the interrupt stack time is spent.
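As a rough sketch of that first pass, the MONITOR classes below cover
the areas just mentioned. The 10-second interval is only an illustrative
choice, not a recommendation; let each display run across a busy period
and exit it with Ctrl/Z before starting the next.

```
$! Time spent in each processor mode, including the interrupt stack
$ MONITOR MODES /INTERVAL=10
$! I/O system rates, including the buffered I/O rate
$ MONITOR IO /INTERVAL=10
$! System Communications Services (SCS) traffic
$ MONITOR SCS /INTERVAL=10
$! Distributed lock management activity
$ MONITOR DLOCK /INTERVAL=10
$! MSCP server activity for the disks being served
$ MONITOR MSCP_SERVER /INTERVAL=10
```

Comparing these displays during a period of high interrupt stack time
against a quiet period should show which class of activity rises along
with it.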
Once you know where the time is going, then you have to ask yourself
if it's "excessive". There's no "rule" here; it depends on what other
uses you have or have planned for the server system. Does the server
have other work that's not getting done? Do you plan to add more
satellites, and does the server have the capacity to handle any more?
You may find that you don't have to do anything if your satellites
are happy and you don't have any other problems.
On the other hand, you may need to add capacity (another server, another
CPU, etc.) if you plan to grow. You may want to track down "hot files"
and figure out a way to move them closer to the satellites (which will
improve performance for both the server *and* the satellites). These
are just some suggestions, and without knowing more about your
configuration, workload and plans I can't give more details.
Since you have the POLYCENTER Performance Data Collector, you can use the
System PC Collector to get an idea of what pieces of code are being
exercised during the times of high interrupt stack usage.
While the system is experiencing the problem, start the collector:
$ ADVISE COLLECT SYSTEM_PC
This will collect for 15 minutes into the file SYSTEM.PCS. Then generate
the report:
$ ADVISE COLLECT REPORT SYSTEM_PC SYSTEM.PCS/OUTPUT=file.ext
Now examine the section of the report titled 'PC Samples by System Image'
and pay particular attention to the Filter Samples column. This will be
the number of samples in that System Image while the system was on the