15 Performance Considerations
WARNING:
Oracle Linux 7 is now in Extended Support. See Oracle Linux Extended Support and Oracle Open Source Support Policies for more information.
Migrate applications and data to Oracle Linux 8 or Oracle Linux 9 as soon as possible.
For more information about DTrace, see Oracle Linux: DTrace Release Notes and Oracle Linux: Using DTrace for System Tracing.
DTrace creates additional work in the system, so enabling DTrace always affects system performance in some way. Often this effect is negligible, but it can become substantial if many probes with costly enablings are enabled. This chapter describes some techniques for minimizing the performance effect of DTrace.
Limit Enabled Probes
      Dynamic instrumentation techniques enable DTrace to provide
      unparalleled tracing coverage of the kernel and arbitrary user
      processes. While this coverage provides revolutionary new insight
      into system behavior, it can also cause an enormous probe effect. If
      tens of thousands or hundreds of thousands of probes are enabled,
      the effect on the system can easily be substantial. Therefore,
      enable only as many probes as you need to solve a problem.
      For example, do not enable all syscall
      probes if a more concise enabling can answer your question. Your
      question might require that you concentrate on a specific module
      of interest or a specific function.
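As a rough illustration of narrowing an enabling, the following sketch counts only read and write system calls made by one program instead of enabling every syscall probe. The program name myprogram and the aggregation name @calls are placeholders chosen for this example, not names taken from a real workload.

```d
/*
 * Two probes instead of the full syscall:::entry set.
 * "myprogram" is a hypothetical process name; substitute your own.
 */
syscall::read:entry,
syscall::write:entry
/execname == "myprogram"/
{
  /* Count calls per system call name. */
  @calls[probefunc] = count();
}
```

Saved to a file, a script like this could be run with dtrace -s; the aggregation is printed when tracing ends.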
    
                  
Caution:
        When using the pid provider, be especially
        careful. Because the pid provider can
        instrument every instruction, you could enable millions of
        probes in an application and therefore slow the target process
        to a crawl.
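For instance, a pid provider enabling can be limited to the boundaries of a single function. The sketch below assumes that the target process contains a function named main and that the process is supplied with the dtrace -p or -c option so that $target is defined; both are assumptions made for illustration.

```d
/*
 * Two probe names only: entry to and return from main.
 * Leaving the probe name empty instead would also match any
 * per-instruction offset probes that the pid provider offers,
 * which is the kind of enabling this caution warns about.
 */
pid$target::main:entry,
pid$target::main:return
{
  /* Count how often each probe fires. */
  @[probefunc, probename] = count();
}
```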
      
                     
You can also use DTrace in situations where large numbers of probes must be enabled to answer a question. Enabling a large number of probes might slow down the system significantly, but it never induces fatal failure on the system. Therefore, do not hesitate to enable many probes if your question requires it.
Using Aggregations
As discussed in Aggregations, DTrace aggregations provide a scalable way to aggregate data. Associative arrays might appear to offer similar functionality, but because they are general-purpose variables and therefore global, they cannot offer the linear scalability of aggregations. Therefore, prefer aggregations over associative arrays whenever possible. For example, the following D program uses an associative array to aggregate data:
syscall:::entry
{
  totals[execname]++;
}
syscall::rexit:entry
{
  printf("%40s %d\n", execname, totals[execname]);
  totals[execname] = 0;
}
The following D program is preferred, as it uses an aggregation to achieve the same result:
syscall:::entry
{
  @totals[execname] = count();
}
END
{
  printa("%40s %@d\n", @totals);
}
Using Cacheable Predicates
      You use DTrace predicates to filter unwanted data from the
      experiment by tracing data only if a specified condition is found
      to be true. When enabling many probes, you generally use
      predicates of a form that identifies a specific thread, or threads
      of interest, such as /self->traceme/ or
      /pid == 12345/. Although many of these
      predicates evaluate to a false value for most threads in most
      probes, the evaluation itself can become costly when done for many
      thousands of probes. To reduce this cost, DTrace caches the
      evaluation of a predicate if it includes only thread-local
      variables, such as /self->traceme/, or
      immutable variables, such as /pid == 12345/.
      The cost of evaluating a cached predicate is much less than the
      cost of evaluating a non-cached predicate, especially if the
      predicate involves thread-local variables, string comparisons, or
      other relatively costly operations. Predicate caching is
      transparent to the user, but you get the benefit only if you
      construct predicates from cacheable expressions. Guidelines for
      constructing such predicates are outlined in the following table.
    
                  
| Cacheable | Uncacheable |
|---|---|
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
The following example uses an associative array in the predicate and is not cacheable:
syscall::read:entry
{
  follow[pid, tid] = 1;
}
lockstat:::
/follow[pid, tid]/
{}
syscall::read:return
/follow[pid, tid]/
{
  follow[pid, tid] = 0;
}
Using a cacheable, thread-local variable, per the following example, is preferable:
syscall::read:entry
{
  self->follow = 1;
}
lockstat:::
/self->follow/
{}
syscall::read:return
/self->follow/
{
  self->follow = 0;
}
For a predicate to be cacheable, it must consist exclusively of cacheable expressions. All of the following predicates are cacheable:
/execname == "myprogram"/
/execname == $$1/
/pid == 12345/
/pid == $1/
/self->traceme == 1/
The following examples, which use global variables, are not cacheable:
/execname == one_to_watch/
/traceme[execname]/
/pid == pid_i_care_about/
/self->traceme == my_global/
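One way to turn a global-variable predicate into a cacheable one, consistent with the examples above, is to pass the value as a macro argument rather than storing it in a global. The following is a minimal sketch; the aggregation name @calls and the choice of syscall probes are illustrative assumptions.

```d
/*
 * $1 is replaced with the first script argument when the script is
 * compiled, so the predicate compares pid with a constant rather
 * than with a global variable.
 */
syscall:::entry
/pid == $1/
{
  /* Count system calls made by the selected process. */
  @calls[probefunc] = count();
}
```

A script like this might be invoked as dtrace -s script.d 12345, where script.d is a hypothetical file name and 12345 is the process ID of interest.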