Oracle Solaris Studio 12.3: Performance Analyzer

Estimating Storage Requirements
This section gives some guidelines for estimating the amount of disk space needed to record an experiment. The size of the experiment depends directly on the size of the data packets and the rate at which they are recorded, the number of LWPs used by the program, and the execution time of the program.
The data packets contain event-specific data and data that depends on the program structure (the call stack). The event-specific portion is approximately 50 to 100 bytes, depending on the data type. The call stack data consists of return addresses for each call, at 4 bytes per address, or 8 bytes per address for 64-bit executables. Data packets are recorded for each thread in the experiment. Note that for Java programs there are two call stacks of interest, the Java call stack and the machine call stack, so Java programs write more data to disk.
The rate at which profiling data packets are recorded is controlled by the profiling interval for clock data, by the overflow value for hardware counter data, and, for tracing, by the rate at which the traced functions are called. The choice of these parameters affects both the quality of the data and the distortion of program performance caused by the collection overhead. Smaller values give better statistics but also increase the overhead and the volume of data. The default profiling interval and overflow value were chosen as a compromise between obtaining good statistics and minimizing the overhead.
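You can trade statistical quality against data volume by changing the clock-profiling interval with the -p option of the collect command. The following sketch assumes the interval keywords documented for collect (on is the default of approximately 10 milliseconds; hi and lo select roughly 1 and 100 milliseconds, respectively); check the collect(1) man page for your release:

% collect -p hi a.out      (shorter interval: better statistics, more data)
% collect -p lo a.out      (longer interval: less data, less overhead)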
For a clock-based profiling or hardware counter overflow profiling experiment recording at a rate of about 100 packets per second per thread, with packet sizes ranging from 80 bytes for a small call stack up to 120 bytes for a large call stack, data is recorded at about 10 kbytes per second per thread. Applications whose call stacks are hundreds of calls deep can easily record data at ten times that rate.
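As a rough worked example (the numbers are illustrative, not measured): at 100 packets per second and 100 bytes per packet, each thread writes about 10 kbytes per second, so a 10-minute run of a 16-thread program would produce on the order of

100 packets/sec x 100 bytes x 16 threads x 600 sec = 96 Mbytes

of profile data, before counting the archive files.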
For MPI tracing experiments, the data volume is 100 to 150 bytes per traced MPI call, depending on the number of messages sent and the depth of the call stack. In addition, clock profiling is enabled by default when you use the -M option of the collect command, so add the estimate for a clock-profiling experiment to the MPI tracing estimate. You can reduce the data volume for MPI tracing by disabling clock profiling with the -p off option, as in the example following the note below.
Note - The Collector stores MPI tracing data in its own format (mpview.dat3) and also in the VampirTrace OTF format (a.otf, a.*.z). You can remove the OTF format files without affecting the Analyzer.
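The following sketch shows one way to collect an MPI tracing experiment with clock profiling disabled; the MPI version name (OMPT, for the Oracle Message Passing Toolkit) and the process count are assumptions to adapt to your installation:

% collect -M OMPT -p off mpirun -np 16 -- a.out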
Your estimate of the size of the experiment should also take into account the disk space used by the archive files, which is usually a small fraction of the total disk space requirement (see the previous section). If you are not sure how much space you need, try running your experiment for a short time. From this test you can obtain the size of the archive files, which are independent of the data collection time, and scale the size of the profile files to obtain an estimate of the size for the full-length experiment.
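A minimal trial run might look like the following, assuming the default experiment name test.1.er; the 10-second limit is arbitrary:

% collect -t 10 a.out
% du -k test.1.er

Because the archive files are independent of the data collection time, only the profile data from the trial needs to be scaled up to the full run length.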
As well as allocating disk space, the Collector allocates buffers in memory to store the profile data before writing it to disk. Currently no way exists to specify the size of these buffers. If the Collector runs out of memory, try to reduce the amount of data collected.
If your estimate of the space required to store the experiment is larger than the space you have available, consider collecting data for part of the run rather than the whole run. You can do so with the -y or -t options of the collect command, with the dbx collector subcommands, or by inserting calls to the Collector API in your program. You can also limit the total amount of profiling and tracing data collected with the -L option of the collect command, or with the dbx collector subcommands.
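For example (the time window and size limit here are illustrative values):

% collect -t 60-120 a.out    (record data only between 60 and 120 seconds into the run)
% collect -L 500 a.out       (stop recording profiling data after about 500 Mbytes)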