2 The D Programming Language
The D systems programming language enables you to interface with
operating system APIs and with the hardware. This chapter formally
describes the overall structure of a D program and the various
features for constructing probe descriptions that match more than
one probe. The chapter also discusses the use of the C
preprocessor, cpp, with D programs.
D Program Structure
A D program, also known as a script, consists of a set of clauses that describe the probes to enable and the predicates and actions to bind to these probes. D programs can also contain declarations of variables and definitions of new types. See Variables and Type and Constant Definitions for more details.
Probe Clauses and Declarations
As shown in the examples in this guide thus far, a D program source file consists of one or more probe clauses that describe the instrumentation to be enabled by DTrace. Each probe clause uses the following general form:

    probe descriptions
    / predicate /
    {
      action statements
    }

Note that the predicate and list of action statements may be
omitted. Any directives that are found outside of probe clauses
are referred to as declarations.
Declarations may only be used outside of probe clauses. No
declarations are permitted inside of the enclosing braces
({}). Also, declarations may not be interspersed between the
elements of the probe clause in the previous example. You can use
white space to separate any D program elements and to indent
action statements.
Declarations can be used to declare D variables and external C
symbols or to define new types for use in D. For more details,
see Variables and
Type and Constant Definitions. Special D
compiler directives, called pragmas, may
also appear anywhere in a D program, including outside of probe
clauses. D pragmas are specified on lines beginning with a #
character. For example, D pragmas are used to set DTrace runtime
options. See Options and Tunables for more details.
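As a minimal sketch of a pragma in use, the following script sets the quiet option, which has the same effect as running the dtrace command with the -q option. The message text is only an illustration.

```d
/* The pragma sits outside of any probe clause. */
#pragma D option quiet

dtrace:::BEGIN
{
  printf("tracing started\n");
  exit(0);
}
```

Because the option is set inside the script, the same behavior applies no matter how the script is invoked.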
Probe Descriptions
Every program clause begins with a list of one or more probe descriptions, each taking the following usual form:
provider:module:function:name
If one or more fields of the probe description are omitted, the
specified fields are interpreted from right to left by the D
compiler. For example, the probe description foo:bar would match
a probe with the function foo and name bar, regardless of the
value of the probe's provider and module fields. Therefore, a
probe description is more accurately viewed as a pattern that can
be used to match one or more probes based on their names.
You should write your D probe descriptions with all three field
delimiters (the colons that separate the four fields) so that you
can specify the desired provider on the left-hand side. If you
don't specify the provider, you might obtain unexpected results
if multiple providers publish probes with the same name.
Similarly, subsequent versions of DTrace might include new
providers with probes that unintentionally match your partially
specified probe descriptions. You can specify a provider but
match any of its probes by leaving any of the module, function,
and name fields blank. For example, the description syscall:::
can be used to match every probe that is published by the DTrace
syscall provider.
Probe descriptions also support a pattern-matching syntax
similar to the shell globbing syntax that is described in the
sh(1) manual page. Before matching a probe to a description,
DTrace scans each description field for the characters *, ?,
and [. If one of these characters appears in a probe description
field and is not preceded by a \, the field is regarded as a
pattern. The
description pattern must match the entire corresponding field of
a given probe. To successfully match and enable a probe, the
complete probe description must match on every field. A probe
description field that is not a pattern must exactly match the
corresponding field of the probe. Note that a description field
that is empty matches any probe.
The special characters in the following table are recognized in probe name patterns.
Table 2-1 Probe Name Pattern Matching Characters
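Symbol | Description |
---|---|
`*` | Matches any string, including the null string. |
`?` | Matches any single character. |
`[ ]` | Matches any one of the enclosed characters. A pair of characters separated by `-` matches any character between the pair, inclusive. If the first character after the `[` is `!`, any character not enclosed in the set is matched. |
`\` | Interpret the next character as itself, without any special meaning. |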
Pattern match characters can be used in any or all of the four
fields of your probe descriptions. You can also use patterns to
list matching probes by specifying them on the command line with
the dtrace -l command. For example, the dtrace -l -f kmem_*
command lists all of the DTrace probes in functions with names
that begin with the prefix kmem_.
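The pattern characters can be sketched in a short script. The two descriptions below are illustrative assumptions, not taken from this guide: the first uses a bracket set and a wildcard, the second uses single-character wildcards.

```d
/* Entry to any system call whose name begins with r or w. */
syscall::[rw]*:entry
{
  trace(probefunc);
}

/* Entry to any system call whose name is exactly four characters long. */
syscall::????:entry
{
  trace(probefunc);
}
```

Each pattern must match the whole function field, so the second clause matches read but not readv.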
If you want to specify the same predicate and actions for more
than one probe description, or description pattern, you can
place the descriptions in a comma-separated list. For example,
the following D program would trace a timestamp each time probes
associated with entry to system calls containing the strings
“read” or “write” fire:

    syscall::*read*:entry,
    syscall::*write*:entry
    {
      trace(timestamp);
    }

A probe description can also specify a probe by using its
integer probe ID. For example, the following clause could be
used to enable probe ID 12345, as reported by
dtrace -l -i 12345:

    12345
    {
      trace(timestamp);
    }

Note:
You should always write your D programs using human-readable probe descriptions. Integer probe IDs are not guaranteed to remain consistent as DTrace provider kernel modules are loaded and unloaded or following a reboot.
Clause Predicates
Predicates are expressions that are enclosed in a pair of
slashes (//) and that are evaluated at probe firing time to
determine whether the associated actions should be executed.
Predicates are the primary conditional construct that is used
for building more complex control flow in a D program. You can
omit the predicate section of the probe clause entirely for any
probe, in which case the actions are always executed when the
probe fires.
Predicate expressions can use any of the D operators and can refer to any D data objects such as variables and constants. The predicate expression must evaluate to a value of integer or pointer type so that it can be considered as true or false. As with all D expressions, a zero value is interpreted as false and any non-zero value is interpreted as true.
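The following sketch shows the two common predicate styles: relying on the zero or non-zero value of an integer, and combining comparisons with a logical operator. The process ID 2860 and the 4096-byte threshold are arbitrary assumptions for illustration.

```d
/* Integer truthiness: fires only when the byte count (arg2) is non-zero. */
syscall::write:entry
/arg2/
{
  trace(arg2);
}

/* Combined comparison: fires only for large reads by process 2860. */
syscall::read:entry
/pid == 2860 && arg2 > 4096/
{
  trace(arg2);
}
```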
Probe Actions
Probe actions are described by a list of statements that are
separated by semicolons (;) and enclosed in braces ({}). An empty
set of braces with no statements leads to the default actions,
which are to print the CPU and the probe.
Order of Execution
The actions for a probe are executed in program order, regardless of whether those actions are in the same clause or in different clauses.
No other ordering constraints are imposed. It is not uncommon for the output from two distinct probes to appear interspersed or in the opposite order from that in which the probes fired. Also, output might appear misordered if it came from different CPUs.
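As a small sketch of program-order execution, the following script attaches two clauses to the same probe. The strings are placeholders chosen for this illustration.

```d
/* Both clauses are bound to the same probe. */
dtrace:::BEGIN
{
  trace("first");
}

/* This clause runs after the one above because it appears later. */
dtrace:::BEGIN
{
  trace("second");
  exit(0);
}
```

When the BEGIN probe fires, the actions of the first clause run before those of the second, because that is the order in which the clauses appear in the program.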
Use of the C Preprocessor
The C programming language that is used for defining Linux system interfaces includes a preprocessor that performs a set of initial steps in C program compilation. The C preprocessor is commonly used to define macro substitutions, where one token in a C program is replaced with another predefined set of tokens, or to include copies of system header files. You can use the C preprocessor in conjunction with your D programs by specifying the -C option to the dtrace command. This option causes the dtrace command to execute the cpp preprocessor on your program source file and then pass the results to the D compiler. The C preprocessor is described in more detail in The C Programming Language by Kernighan and Ritchie, details of which are referenced in the Preface.
The D compiler automatically loads the set of C type descriptions that is associated with the operating system implementation. However, you can use the preprocessor to include other type definitions such as the types that are used in your own C programs. You can also use the preprocessor to perform other tasks such as creating macros that expand to chunks of D code and other program elements. If you use the preprocessor with your D program, you may only include files that contain valid D declarations. The D compiler can correctly interpret C header files that include only external declarations of types and symbols. However, the D compiler cannot parse C header files that include additional program elements, such as C function source code, and produces an appropriate error message if it encounters them.
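A minimal sketch of a preprocessor macro in a D program follows. The macro name TARGET_PID, its value, and the file name rwmacro.d are hypothetical; the script only compiles when the preprocessor is enabled with the -C option and cpp is installed.

```d
/* Run with: dtrace -C -s rwmacro.d */
#define TARGET_PID 2860   /* hypothetical process ID */

syscall::read:entry,
syscall::write:entry
/pid == TARGET_PID/
{
  trace(probefunc);
}
```

The preprocessor replaces TARGET_PID with 2860 before the D compiler sees the predicate, so the process ID needs to be changed in only one place.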
Compilation and Instrumentation
When you write traditional programs, you often use a compiler to
convert your program from source code into object code that you
can execute. When you use the dtrace command, you are invoking
the compiler for the D language that was used in a previous
example to write the hello.d program. When your program is
compiled, it is sent into the
operating system kernel for execution by DTrace. There, the probes
named in your program are enabled and the corresponding provider
performs whatever instrumentation is required in order to activate
them.
All of the instrumentation in DTrace is completely dynamic: probes are enabled discretely only when you are using them. No instrumented code is present for inactive probes, so your system does not experience any kind of performance degradation when you are not using DTrace. After your experiment is complete and the dtrace command exits, all of the probes that you used are automatically disabled and their instrumentation is removed, returning your system to its exact original state. No effective difference exists between a system where DTrace is not active and a system where the DTrace software is not installed, other than a few megabytes of disk space that is required for type information and for DTrace itself.
The instrumentation for each probe is performed dynamically on the live, running operating system or on user processes that you select. The system is not quiesced or paused in any way and instrumentation code is added only for the probes that you enable. As a result, the probe effect of using DTrace is limited to exactly what you direct DTrace to do: no extraneous data is traced and no one, big “tracing switch” is turned on in the system. All of the DTrace instrumentation is designed to be as efficient as possible. These features enable you to use DTrace in production to solve real problems in real time.
The DTrace framework also provides support for an arbitrary number of virtual clients. You can run as many simultaneous DTrace experiments and commands as you like, limited only by your system's memory capacity. The commands all operate independently using the same underlying instrumentation. This same capability also permits any number of distinct users on the system to take advantage of DTrace simultaneously: developers, administrators, and service personnel can all work together, or on distinct problems, using DTrace on the same system without interfering with one another.
Unlike programs that are written in C and C++, and similar to programs that are written in the Java programming language, DTrace D programs are compiled into a safe, intermediate form that is used for execution when your probes fire. This intermediate form is validated for safety when your program is first examined by the DTrace kernel software. The DTrace execution environment also handles any runtime errors that might occur during your D program's execution, including dividing by zero, dereferencing invalid memory, and so on, and reports them to you. As a result, you can never construct an unsafe program that would cause DTrace to inadvertently damage the operating system kernel or one of the processes running on your system. These safety features enable you to use DTrace in a production environment without being concerned about crashing or corrupting your system. If you make a programming mistake, DTrace reports the error to you and disables your instrumentation, enabling you to correct the mistake and try again. The DTrace error reporting and debugging features are described later in this guide.
The following figure shows the different components of the DTrace architecture.
[Figure: Overview of the DTrace Architecture and Components]

Now that you understand how DTrace works, let us return to the tour of the D programming language and start writing some more interesting programs.
Variables and Arithmetic Expressions
Our next example program makes use of the DTrace profile
provider to implement a simple time-based counter. The profile
provider is able to create new probes based on the descriptions
found in your D program. If you create a probe named
profile:::tick-nsec for some integer n, the profile provider
creates a probe that fires every n seconds. Type the following
source code and save it in a file named counter.d:

    /*
     * Count off and report the number of seconds elapsed
     */
    dtrace:::BEGIN
    {
      i = 0;
    }

    profile:::tick-1sec
    {
      i = i + 1;
      trace(i);
    }

    dtrace:::END
    {
      trace(i);
    }

When executed, the program counts off the number of elapsed
seconds until you press Ctrl-C, and then prints the total at the
end:

    # dtrace -s counter.d
    dtrace: script 'counter.d' matched 3 probes
    CPU     ID                    FUNCTION:NAME
      1    638                       :tick-1sec         1
      1    638                       :tick-1sec         2
      1    638                       :tick-1sec         3
      1    638                       :tick-1sec         4
      1    638                       :tick-1sec         5
      1    638                       :tick-1sec         6
      1    638                       :tick-1sec         7
    ^C
      1    638                       :tick-1sec         8
      0      2                             :END         8

The first three lines of the program are a comment to explain what
the program does. Similar to C, C++, and the Java programming
language, the D compiler ignores any characters between the
/* and */ symbols. Comments can be used anywhere in a D program,
including both inside and outside your probe clauses.
The BEGIN probe clause defines a new variable named i and assigns
it the integer value zero using the statement:

    i = 0;

In counter.d, the variable i is first assigned the integer
constant zero, so its type is set to int. D provides the same
basic integer data types as C, including those in the following
table.
Data Type | Description |
---|---|
`char` | Character or single byte integer |
`int` | Default integer |
`short` | Short integer |
`long` | Long integer |
`long long` | Extended long integer |
The sizes of these types are dependent on the operating system kernel's data model, described in Types, Operators, and Expressions. D also provides built-in friendly names for signed and unsigned integer types of various fixed sizes, as well as thousands of other types that are defined by the operating system.
The central part of counter.d is the probe clause that increments
the counter i:

    profile:::tick-1sec
    {
      i = i + 1;
      trace(i);
    }

This clause names the probe profile:::tick-1sec, which tells the
profile provider to create a new probe that fires once per second
on an available processor. The clause contains two statements,
the first incrementing i, and the second tracing (printing) the
new value of i. All the usual C arithmetic operators are
available in D. For the complete list, see Types, Operators, and
Expressions. The trace function takes any D expression as its
argument, so you could write counter.d more concisely as follows:

    profile:::tick-1sec
    {
      trace(++i);
    }

If you want to explicitly control the type of the variable i, you
can surround the desired type in parentheses when you assign it
in order to cast the integer zero to a specific type. For
example, if you wanted to determine the maximum size of a char
in D, you could change the BEGIN clause as follows:

    dtrace:::BEGIN
    {
      i = (char)0;
    }

After running counter.d for a while, you should see the traced
value grow and then wrap around back to zero. If you grow
impatient waiting for the value to wrap, try changing the profile
probe name to profile:::tick-100msec to make a counter that
increments once every 100 milliseconds, or 10 times per second.
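Putting both suggested changes together gives the following sketch. It combines the char cast with the faster tick rate and uses the concise trace(++i) form; saving it under a separate name such as wrap.d is an assumption made here so that counter.d stays intact.

```d
/*
 * Variant of counter.d: a char-sized counter that ticks 10 times
 * per second, so the wraparound shows up quickly.
 */
dtrace:::BEGIN
{
  i = (char)0;
}

profile:::tick-100msec
{
  trace(++i);
}
```

Press Ctrl-C to stop the script once you have seen the value wrap.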
Predicate Examples
For runtime safety, one major difference between D and other
programming languages such as C, C++, and the Java programming
language is the absence of control-flow constructs such as
if-statements and loops. D program clauses are
written as single straight-line statement lists that trace an
optional, fixed amount of data. D does provide the ability to
conditionally trace data and modify control flow using logical
expressions called predicates. A predicate
expression is evaluated at probe firing time prior to executing
any of the statements associated with the corresponding clause. If
the predicate evaluates to true, represented by any non-zero
value, the statement list is executed. If the predicate is false,
represented by a zero value, none of the statements are executed
and the probe firing is ignored.
Type the following source code for the next example and save it in
a file named countdown.d:

    dtrace:::BEGIN
    {
      i = 10;
    }

    profile:::tick-1sec
    /i > 0/
    {
      trace(i--);
    }

    profile:::tick-1sec
    /i == 0/
    {
      trace("blastoff!");
      exit(0);
    }

This D program implements a 10-second countdown timer using predicates. When executed, countdown.d counts down from 10 and then prints a message and exits:

    # dtrace -s countdown.d
    dtrace: script 'countdown.d' matched 3 probes
    CPU     ID                    FUNCTION:NAME
      0    638                       :tick-1sec        10
      0    638                       :tick-1sec         9
      0    638                       :tick-1sec         8
      0    638                       :tick-1sec         7
      0    638                       :tick-1sec         6
      0    638                       :tick-1sec         5
      0    638                       :tick-1sec         4
      0    638                       :tick-1sec         3
      0    638                       :tick-1sec         2
      0    638                       :tick-1sec         1
      0    638                       :tick-1sec   blastoff!
    #

This example uses the BEGIN probe to initialize an integer i to
10 to begin the countdown. Next, as in the previous example, the
program uses the tick-1sec probe to implement a timer that fires
once per second. Notice that in countdown.d, the tick-1sec probe
description is used in two different clauses, each with a
different predicate and action list. The predicate is a logical
expression surrounded by enclosing slashes // that appears after
the probe name and before the braces {} that surround the clause
statement list.
The first predicate tests whether i is greater than zero,
indicating that the timer is still running:

    profile:::tick-1sec
    /i > 0/
    {
      trace(i--);
    }

The relational operator > means greater than and returns the
integer value zero for false and one for true. All of the C
relational operators are supported in D. For the complete list,
see Types, Operators, and Expressions. If i is not yet zero, the
script traces i and then decrements it by one using the --
operator.
The second predicate uses the == operator to return true when i
is exactly equal to zero, indicating that the countdown is
complete:

    profile:::tick-1sec
    /i == 0/
    {
      trace("blastoff!");
      exit(0);
    }

Similar to the first example, hello.d, countdown.d uses a
sequence of characters enclosed in double quotes, called a string
constant, to print a final message when the countdown is
complete. The exit function is then used to exit dtrace and
return to the shell prompt.
If you look back at the structure of countdown.d, you will see
that by creating two clauses with the same probe description but
different predicates and actions, we effectively created the
logical flow:

    i = 10
    once per second,
      if i is greater than zero
        trace(i--);
      if i is equal to zero
        trace("blastoff!");
        exit(0);

When you wish to write complex programs using predicates, try to first visualize your algorithm in this manner, and then transform each path of your conditional constructs into a separate clause and predicate.
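The same transformation can be sketched for a simple two-way branch. The script below is an illustration, not part of the original examples: it reports whether a counter is even or odd once per second and exits when the counter reaches 6. The two predicates are mutually exclusive, and the counter is updated in a separate unconditional clause so that neither predicate is affected by the other branch's actions.

```d
dtrace:::BEGIN
{
  i = 0;
}

/* "if" branch: counter is even */
profile:::tick-1sec
/i % 2 == 0/
{
  trace("even");
}

/* "else" branch: counter is odd */
profile:::tick-1sec
/i % 2 != 0/
{
  trace("odd");
}

/* Runs on every firing, after both branches have been considered. */
profile:::tick-1sec
{
  i++;
}

profile:::tick-1sec
/i == 6/
{
  exit(0);
}
```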
Now let us combine predicates with a new provider, the syscall
provider, and create our first real D tracing program. The
syscall provider permits you to enable probes on entry to or
return from any Oracle Linux system call. The next example uses
DTrace to observe every time your shell performs a read() or
write() system call. First, open two windows, one to use for
DTrace and the other containing the shell process that you are
going to watch. In the second window, type the following command
to obtain the process ID of this shell:

    # echo $$
    2860

Now go back to your first window and type the following D program
and save it in a file named rw.d. As you type in the program,
replace the integer constant 2860 with the process ID of the
shell that was printed in response to your echo command.

    syscall::read:entry,
    syscall::write:entry
    /pid == 2860/
    {
    }

Notice that the body of rw.d's probe clause is left empty because
the program is only intended to trace notification of probe
firings and not to trace any additional data. Once you have typed
in rw.d, use dtrace to start your experiment and then go to your
second shell window and type a few commands, pressing return
after each command. As you type, you should see dtrace report
probe firings in your first window, similar to the following
example:

    # dtrace -s rw.d
    dtrace: script 'rw.d' matched 2 probes
    CPU     ID                    FUNCTION:NAME
      1      7                      write:entry
      1      5                       read:entry
      0      7                      write:entry
      0      5                       read:entry
      0      7                      write:entry
      0      5                       read:entry
      0      7                      write:entry
      0      5                       read:entry
      0      7                      write:entry
      1      7                      write:entry
      1      7                      write:entry
      1      5                       read:entry
    ...^C

You are now watching your shell perform read() and write() system
calls to read a character from your terminal window and echo back
the result. This example includes many of the concepts described
so far and a few new ones as well. First, to instrument read()
and write() in the same manner, the script uses a single probe
clause with multiple probe descriptions by separating the
descriptions with commas like this:

    syscall::read:entry,
    syscall::write:entry

For readability, each probe description appears on its own line. This arrangement is not strictly required, but it makes for a more readable script. Next the script defines a predicate that matches only those system calls that are executed by your shell process:
/pid == 2860/
The predicate uses the predefined DTrace variable pid, which
always evaluates to the process ID associated with the thread
that fired the corresponding probe. DTrace provides many built-in
variable definitions for useful things like the process ID. The
following table lists a few DTrace variables you can use to write
your first D programs.
Variable Name | Data Type | Meaning |
---|---|---|
`errno` | `int` | Current `errno` value for system calls |
`execname` | `string` | Name of the current process's executable file |
`pid` | `pid_t` | Process ID of the current process |
`tid` | `id_t` | Thread ID of the current thread |
`probeprov` | `string` | Current probe description's provider field |
`probemod` | `string` | Current probe description's module field |
`probefunc` | `string` | Current probe description's function field |
`probename` | `string` | Current probe description's name field |
Now that you've written a real instrumentation program, try
experimenting with it on different processes running on your
system by changing the process ID and the system call probes that
are instrumented. Then, you can make one more simple change and
turn rw.d into a very simple version of a system call tracing
tool like strace. An empty probe description field acts as a
wildcard, matching any probe, so change your program to the
following new source code to trace any system call executed by
your shell:

    syscall:::entry
    /pid == 2860/
    {
    }

Try typing a few commands in the shell such as cd, ls, and date and see what your DTrace program reports.
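The built-in variables can also select processes by name rather than by process ID. The following sketch assumes you want to watch the ls command open files; the choice of ls and of the open* pattern is illustrative only.

```d
/* Entry to any open-style system call made by a process named "ls". */
syscall::open*:entry
/execname == "ls"/
{
  trace(pid);
}
```

Matching on execname is convenient for short-lived commands whose process ID is not known in advance.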
Output Formatting Examples
System call tracing is a powerful way to observe the behavior of
many user processes. The following example improves upon the
earlier rw.d program by formatting its output so that it is
easier to understand. Type the following program and save it in
a file called stracerw.d:

    syscall::read:entry,
    syscall::write:entry
    /pid == $1/
    {
      printf("%s(%d, 0x%x, %4d)", probefunc, arg0, arg1, arg2);
    }

    syscall::read:return,
    syscall::write:return
    /pid == $1/
    {
      printf("\tt = %d\n", arg1);
    }

In this example, the constant 2860 is replaced with the label $1
in each predicate. This label enables you to specify the process
of interest as an argument to the script: $1 is replaced by the
value of the first argument when the script is compiled. To
execute stracerw.d, use the dtrace options -q and -s, followed by
the process ID of your shell as the final argument. The -q option
indicates that dtrace should be quiet and suppress the header
line and the CPU and ID columns shown in the preceding examples.
As a result, you only see the output for the data that you
explicitly trace. Type the following command, replacing 2860 with
the process ID of a shell process, and then press return a few
times in the specified shell:

    # dtrace -q -s stracerw.d 2860
            t = 1
    write(2, 0x7fa621b9b000,    1)      t = 1
    write(1, 0x7fa621b9c000,   22)      t = 22
    write(2, 0x7fa621b9b000,   20)      t = 20
    read(0, 0x7fff60f74b8f,    1)       t = 1
    write(2, 0x7fa621b9b000,    1)      t = 1
    write(1, 0x7fa621b9c000,   22)      t = 22
    write(2, 0x7fa621b9b000,   20)      t = 20
    read(0, 0x7fff60f74b8f,    1)       t = 1
    write(2, 0x7fa621b9b000,    1)      t = 1
    write(1, 0x7fa621b9c000,   22)      t = 22
    write(2, 0x7fa621b9b000,   20)      t = 20
    read(0, 0x7fff60f74b8f,    1)^C
    #

Now let us examine your D program and its output in more detail.
First, a clause similar to the earlier program instruments each of
the shell's calls to read() and write(). But for this example, we
use a new function, printf, to trace the data and print it out in
a specific format:

    syscall::read:entry,
    syscall::write:entry
    /pid == $1/
    {
      printf("%s(%d, 0x%x, %4d)", probefunc, arg0, arg1, arg2);
    }

The printf function combines the ability to trace data, as if by
the trace function used earlier, with the ability to output the
data and other text in a specific format that you describe. The
printf function tells DTrace to trace the data associated with
each argument after the first argument, and then to format the
results using the rules described by the first printf argument,
known as a format string.
The format string is a regular string that contains any number of
format conversions, each beginning with the % character, that
describe how to format the corresponding argument. The first
conversion in the format string corresponds to the second printf
argument, the second conversion to the third argument, and so on.
All of the text between conversions is printed verbatim. The
character following the % conversion character describes the
format to use for the corresponding argument. Here are the
meanings of the three format conversions used in stracerw.d.
Format Conversion | Description |
---|---|
`%d` | Print the corresponding value as a decimal integer |
`%s` | Print the corresponding value as a string |
`%x` | Print the corresponding value as a hexadecimal integer |
DTrace printf works just like the C printf() library routine or
the shell printf utility. If you have never seen printf before,
the formats and options are explained in detail in Output
Formatting. You should read that chapter carefully even if you
are already familiar with printf from another language. In D,
printf is provided as a built-in, and some new format conversions
designed specifically for DTrace are available to you.
To help you write correct programs, the D compiler validates each
printf format string against its argument list. Try changing
probefunc in the clause above to the integer 123. If you run the
modified program, you will see an error message telling you that
the string format conversion %s is not appropriate for use with
an integer argument:

    # dtrace -q -s stracerw.d
    dtrace: failed to compile script stracerw.d: line 5: printf( )
            argument #2 is incompatible with conversion #1 prototype:
            conversion: %s
            prototype: char [] or string (or use stringof)
            argument: int
    #

To print the name of the read or write system call and its
arguments, the script uses the printf statement:

    printf("%s(%d, 0x%x, %4d)", probefunc, arg0, arg1, arg2);

This statement traces the name of the current probe function and
the first three integer arguments to the system call, available
in the DTrace variables arg0, arg1, and arg2. For more
information about probe arguments, see Built-In Variables. The
first argument to read() and write() is a file descriptor,
printed in decimal. The second argument is a buffer address,
formatted as a hexadecimal value. The final argument is the
buffer size, formatted as a decimal value. The format specifier
%4d is used for the third argument to indicate that the value
should be printed using the %d format conversion with a minimum
field width of 4 characters. If the integer is less than 4
characters wide, printf inserts extra blanks to align the
output.
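The effect of the conversions and the field width can be checked in isolation with a short sketch that does not depend on any running process. The literal values are arbitrary and the brackets only make the padding visible.

```d
/* Run with: dtrace -q -s <file> so that only the printf output appears. */
dtrace:::BEGIN
{
  /* %d, %4d, %x, and %s applied to constant arguments. */
  printf("[%d] [%4d] [0x%x] [%s]\n", 7, 7, 255, "seven");
  exit(0);
}
```

The script prints [7] [   7] [0xff] [seven]: the %4d conversion pads the single digit with three leading blanks.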
To print the result of the system call and complete each line of output, use the following clause:

    syscall::read:return,
    syscall::write:return
    /pid == $1/
    {
      printf("\tt = %d\n", arg1);
    }

Notice that the syscall provider also publishes a probe named
return for each system call in addition to entry. The DTrace
variable arg1 for the syscall return probes evaluates to the
system call's return value. The return value is formatted as a
decimal integer. The character sequences beginning with
backslashes in the format string expand to tab (\t) and newline
(\n) respectively. These escape sequences help you print or
record characters that are difficult to type. D supports the same
set of escape sequences as C, C++, and the Java programming
language. For a complete list of escape sequences, see
Constants.
Array Overview
D permits you to define variables that are integers, as well as other types to represent strings and composite types called structs and unions. If you are familiar with C programming, you will be happy to know you can use any type in D that you can in C. If you are not a C expert, do not worry: the different kinds of data types are all described in Types, Operators, and Expressions.
D also supports arrays. Linearly indexed scalar arrays, familiar to C programmers, are discussed in Array Declarations and Storage.
More powerful and commonly used are associative arrays, which are indexed with tuples. Each associative array has a particular type signature. That is, its tuples all have the same number of elements, those elements of consistent type and in the same order, and its values are all of the same type. D associative arrays are described further in Associative Arrays.
Associative Array Example
For example, the following D statements access an associative
array, whose values must all be type int and whose tuples must
all have signature string,int, setting an element to 456 and
then incrementing it to 457:

    a["hello", 123] = 456;
    a["hello", 123]++;

Now let us use an associative array in a D program. Type the
following program and save it in a file named rwtime.d:

    syscall::read:entry,
    syscall::write:entry
    /pid == $1/
    {
      ts[probefunc] = timestamp;
    }

    syscall::read:return,
    syscall::write:return
    /pid == $1 && ts[probefunc] != 0/
    {
      printf("%d nsecs", timestamp - ts[probefunc]);
    }

As with stracerw.d, specify the ID of the shell process when you execute rwtime.d. If you type a few shell commands, you will see the time elapsed during each system call. Type in the following command and then press return a few times in your other shell:

    # dtrace -s rwtime.d `/usr/bin/pgrep -n bash`
    dtrace: script 'rwtime.d' matched 4 probes
    CPU     ID                    FUNCTION:NAME
      0      8                     write:return 51962 nsecs
      0      8                     write:return 45257 nsecs
      0      8                     write:return 40787 nsecs
      1      6                      read:return 925959305 nsecs
      1      8                     write:return 46934 nsecs
      1      8                     write:return 41626 nsecs
      1      8                     write:return 176839 nsecs
    ...
    ^C
    #

To trace the elapsed time for each system call, you must
instrument both the entry to and return from read() and write()
and measure the time at each point. Then, on return from a given
system call, you must compute the difference between the first
and second timestamps. You could use separate variables for each
system call, but this would make the program annoying to extend
to additional system calls. Instead, it is easier to use an
associative array indexed by the probe function name. The
following is the first probe clause:

    syscall::read:entry,
    syscall::write:entry
    /pid == $1/
    {
      ts[probefunc] = timestamp;
    }

This clause defines an array named ts and assigns the appropriate member the value of the DTrace variable timestamp. This variable returns the value of an always-incrementing nanosecond counter. After the entry timestamp is saved, the corresponding return probe samples timestamp again and reports the difference between the current time and the saved value:
syscall::read:return,
syscall::write:return
/pid == $1 && ts[probefunc] != 0/
{
    printf("%d nsecs", timestamp - ts[probefunc]);
}
The predicate on the return probe requires that DTrace is tracing the appropriate process and that the corresponding entry probe has already fired and assigned ts[probefunc] a non-zero value. This trick eliminates invalid output when DTrace first starts. If your shell is already waiting in a read() system call for input when you execute dtrace, the read:return probe fires without a preceding read:entry for this first read() and ts[probefunc] will evaluate to zero because it has not yet been assigned.
External Symbols and Types
DTrace instrumentation executes inside the Oracle Linux operating system kernel. So, in addition to accessing special DTrace variables and probe arguments, you can also access kernel data structures, symbols, and types. These capabilities enable advanced DTrace users, administrators, service personnel, and driver developers to examine low-level behavior of the operating system kernel and device drivers. The reading list at the start of this guide includes books that can help you learn more about Oracle Linux operating system internals.
D uses the back quote character (`) as a special scoping operator for accessing symbols that are defined in the operating system and not in your D program. For example, the Oracle Linux kernel contains a C declaration of a system variable named max_pfn. This variable is declared in C in the kernel source code as follows:
unsigned long max_pfn
To trace the value of this variable in a D program, you can write the following D statement:
trace(`max_pfn);
DTrace associates each kernel symbol with the type that is used for the symbol in the corresponding operating system C code, which provides easy source-based access to the native operating system data structures.
To use external operating system variables, you will need access to the corresponding operating system source code.
Kernel symbol names are kept in a separate namespace from D variable and function identifiers, so you do not need to be concerned about these names conflicting with your D variables. When you prefix a variable with a back quote, the D compiler searches the known kernel symbols and uses the list of loaded modules to find a matching variable definition. Because the Oracle Linux kernel supports dynamically loaded modules with separate symbol namespaces, the same variable name might be used more than once in the active operating system kernel. You can resolve these name conflicts by specifying the name of the kernel module that contains the variable to be accessed prior to the back quote in the symbol name. For example, you would refer to the address of the _bar function that is provided by a kernel module named foo as follows:
foo`_bar
You can apply any of the D operators to external variables, subject to the usual rules for operand types, except for operators that modify values, such as = or +=. For safety reasons, DTrace prevents you from damaging or corrupting the state of the software that you are observing. When required, the D compiler loads the variable names that correspond to active kernel modules, so you do not need to declare these variables.
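As a minimal sketch of reading an external variable, the following script prints the value of max_pfn once and then exits. It assumes that the running kernel exposes the max_pfn symbol described earlier; the commented-out line shows the kind of assignment that is not permitted.

```d
BEGIN
{
    /* Read-only access to a kernel variable through the back quote operator. */
    printf("max_pfn = %d\n", `max_pfn);

    /* `max_pfn = 0;   not permitted: external variables cannot be modified. */

    exit(0);
}
```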
When you access external variables from a D program, you are accessing the internal implementation details of another program, such as the operating system kernel or its device drivers. These implementation details do not form a stable interface upon which you can rely. Any D programs you write that depend on these details might cease to work when you next upgrade the corresponding piece of software. For this reason, external variables are typically used to debug performance or functionality problems by using DTrace. To learn more about the stability of your D programs, see DTrace Stability Features.
You have now completed a whirlwind tour of DTrace and have learned many of the basic DTrace building blocks that are necessary to build larger and more complex D programs. The remaining portions of this chapter describe the complete set of rules for D and demonstrate how DTrace can make complex performance measurements and functional analysis of the system easy. Later, you will learn how to use DTrace to connect user application behavior to system behavior, which provides you with the capability to analyze your entire software stack.
Types, Operators, and Expressions
D provides the ability to access and manipulate a variety of data objects: variables and data structures can be created and modified, data objects that are defined in the operating system kernel and user processes can be accessed, and integer, floating-point, and string constants can be declared. D provides a superset of the ANSI C operators that are used to manipulate objects and create complex expressions. This section describes the detailed set of rules for types, operators, and expressions.
Identifier Names and Keywords
D identifier names are composed of uppercase and lowercase letters, digits, and underscores, where the first character must be a letter or underscore. All identifier names beginning with an underscore (_) are reserved for use by the D system libraries. You should avoid using these names in your D programs. By convention, D programmers typically use mixed-case names for variables and all uppercase names for constants.
D language keywords are special identifiers that are reserved for use in the programming language syntax itself. These names are always specified in lowercase and must not be used for the names of D variables. The following table lists the keywords that are reserved for use by the D language.
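Table 2-2 D Keywords
auto* | goto* | sizeof |
break* | if* | static* |
case* | import*+ | string+ |
char | inline | stringof+ |
const | int | struct |
continue* | long | switch* |
counter*+ | offsetof+ | this+ |
default* | probe*+ | translator+ |
do* | provider*+ | typedef |
double | register* | union |
else* | restrict* | unsigned |
enum | return* | void |
extern | self+ | volatile |
float | short | while* |
for* | signed | xlate+ |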
D reserves for use as keywords a superset of the ANSI C keywords. The keywords reserved for future use by the D language are marked with “*”. The D compiler produces a syntax error if you attempt to use a keyword that is reserved for future use. The keywords that are defined by D but not defined by ANSI C are marked with “+”. D provides the complete set of types and operators found in ANSI C. The major difference in D programming is the absence of control-flow constructs. Note that keywords associated with control-flow in ANSI C are reserved for future use in D.
Data Types and Sizes
D provides fundamental data types for integers and floating-point constants. Arithmetic may only be performed on integers in D programs. Floating-point constants may be used to initialize data structures, but floating-point arithmetic is not permitted in D. In Oracle Linux, D provides only a 64-bit data model for use in writing programs; a 32-bit data model is not supported. The data model used when executing your program is the native data model that is associated with the active operating system kernel, which must also be 64-bit.
The names of the integer types and their sizes in the 64-bit data model are shown in the following table. Integers are always represented in twos-complement form in the native byte-encoding order of your system.
Table 2-3 D Integer Data Types
Type Name | 64-bit Size |
---|---|
char | 1 byte |
short | 2 bytes |
int | 4 bytes |
long | 8 bytes |
long long | 8 bytes |
Integer types can be prefixed with the signed or unsigned qualifier. If no sign qualifier is present, it is assumed that the type is signed. The D compiler also provides the type aliases that are listed in the following table.
Table 2-4 D Integer Type Aliases
Type Name | Description |
---|---|
int8_t | 1-byte signed integer |
int16_t | 2-byte signed integer |
int32_t | 4-byte signed integer |
int64_t | 8-byte signed integer |
intptr_t | Signed integer of size equal to a pointer |
uint8_t | 1-byte unsigned integer |
uint16_t | 2-byte unsigned integer |
uint32_t | 4-byte unsigned integer |
uint64_t | 8-byte unsigned integer |
uintptr_t | Unsigned integer of size equal to a pointer |
These type aliases are equivalent to using the name of the corresponding base type listed in the previous table and are appropriately defined for each data model. For example, the uint8_t type name is an alias for the type unsigned char. See Type and Constant Definitions for information about how to define your own type aliases for use in D programs.
Note:
The predefined type aliases cannot be used in files that are included by the preprocessor.
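The following sketch shows the type aliases in declarations and with the sizeof operator. The variable names flag and total are hypothetical. Under the 64-bit data model, the script is expected to print 1 8 8.

```d
uint8_t flag;     /* 1-byte unsigned integer */
int64_t total;    /* 8-byte signed integer */

BEGIN
{
    /* Sizes of two declared variables and of the pointer-sized alias. */
    printf("%d %d %d\n", sizeof (flag), sizeof (total), sizeof (uintptr_t));
    exit(0);
}
```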
D provides floating-point types for compatibility with ANSI C declarations and types. Floating-point operators are not supported in D, but floating-point data objects can be traced and formatted with the printf function. You can use the floating-point types that are listed in the following table.
Table 2-5 D Floating-Point Data Types
Type Name | 64-bit Size |
---|---|
float | 4 bytes |
double | 8 bytes |
long double | 16 bytes |
D also provides the special type string to represent ASCII strings. Strings are discussed in more detail in DTrace Support for Strings.
Constants
Integer constants can be written in decimal (12345), octal (012345), or hexadecimal (0x12345) format. Octal (base 8) constants must be prefixed with a leading zero. Hexadecimal (base 16) constants must be prefixed with either 0x or 0X. Integer constants are assigned the smallest type among int, long, and long long that can represent their value. If the value is negative, the signed version of the type is used. If the value is positive and too large to fit in the signed type representation, the unsigned type representation is used. You can apply one of the suffixes listed in the following table to any integer constant to explicitly specify its D type.
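Suffix | D type |
---|---|
u or U | unsigned version of the type selected by the compiler |
l or L | long |
ul or UL | unsigned long |
ll or LL | long long |
ull or ULL | unsigned long long |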
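As a short illustration of the three integer constant formats and of an explicit suffix, consider the following sketch. The variable name big is hypothetical. The first line of output should be 12345 5349 74565, and the second should be 1099511627776. The ULL suffix forces a 64-bit type so that the shift by 40 bits is well defined.

```d
BEGIN
{
    /* The same digits read as decimal, octal, and hexadecimal. */
    printf("%d %d %d\n", 12345, 012345, 0x12345);

    /* Without a suffix, the constant 1 would be a 4-byte int. */
    big = 1ULL << 40;
    printf("%d\n", big);

    exit(0);
}
```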
Floating-point constants are always written in decimal format and must contain either a decimal point (12.345), an exponent (123e45), or both (123.34e-5). Floating-point constants are assigned the type double by default. You can apply one of the suffixes listed in the following table to any floating-point constant to explicitly specify its D type.
Suffix | D type |
---|---|
f or F | float |
l or L | long double |
Character constants are written as a single character or escape sequence that is enclosed in a pair of single quotes ('a'). Character constants are assigned the int type rather than char and are equivalent to an integer constant with a value that is determined by that character's value in the ASCII character set. See the ascii(7) manual page for a list of characters and their values. You can also use any of the special escape sequences that are listed in the following table in your character constants. D supports the same escape sequences as those found in ANSI C.
Table 2-6 Character Escape Sequences
Escape Sequence | Represents | Escape Sequence | Represents |
---|---|---|---|
\a | alert | \\ | backslash |
\b | backspace | \? | question mark |
\f | form feed | \' | single quote |
\n | newline | \" | double quote |
\r | carriage return | \0oo | octal value 0oo |
\t | horizontal tab | \xhh | hexadecimal value 0xhh |
\v | vertical tab | \0 | null character |
You can include more than one character specifier inside single quotes to create integers with individual bytes that are initialized according to the corresponding character specifiers. The bytes are read left-to-right from your character constant and assigned to the resulting integer in the order corresponding to the native endianness of your operating environment. Up to eight character specifiers can be included in a single character constant.
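Because character constants have type int, they can be printed either as numbers or as characters. The following sketch should print 97 a 10, which are the ASCII values of 'a' and of the newline escape sequence.

```d
BEGIN
{
    /* 'a' is the integer 97; %c formats the same value as a character. */
    printf("%d %c %d\n", 'a', 'a', '\n');
    exit(0);
}
```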
String constants of any length can be composed by enclosing them in a pair of double quotes ("hello"). A string constant may not contain a literal newline character. To create strings containing newlines, use the \n escape sequence instead of a literal newline. String constants can contain any of the special character escape sequences that are shown for character constants previously. Similar to ANSI C, strings are represented as arrays of characters terminated by a null character (\0) that is implicitly added to each string constant you declare. String constants are assigned the special D type string. The D compiler provides a set of special features for comparing and tracing character arrays that are declared as strings. See DTrace Support for Strings for more information.
Arithmetic Operators
D provides the binary arithmetic operators that are described in the following table for use in your programs. These operators all have the same meaning for integers that they do in ANSI C.
Table 2-7 Binary Arithmetic Operators
Operator | Description |
---|---|
+ | Integer addition |
- | Integer subtraction |
* | Integer multiplication |
/ | Integer division |
% | Integer modulus |
Arithmetic in D may only be performed on integer operands or on pointers. See Pointers and Scalar Arrays. Arithmetic may not be performed on floating-point operands in D programs. The DTrace execution environment does not take any action on integer overflow or underflow. You must specifically check for these conditions in situations where overflow and underflow can occur.
However, the DTrace execution environment does automatically check for and report division by zero errors resulting from improper use of the / and % operators. If a D program executes an invalid division operation, DTrace automatically disables the affected instrumentation and reports the error. Errors that are detected by DTrace have no effect on other DTrace users or on the operating system kernel. You therefore do not need to be concerned about causing any damage if your D program inadvertently contains one of these errors.
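Although DTrace reports division by zero safely, a predicate can avoid the error altogether. The following sketch uses the hypothetical variables n and d, and relies on clauses for the same probe being evaluated in program order. With d set to zero, only the message from the third clause is expected.

```d
BEGIN
{
    n = 100;
    d = 0;
}

/* Divide only when the divisor is known to be non-zero. */
BEGIN
/d != 0/
{
    printf("%d\n", n / d);
}

BEGIN
/d == 0/
{
    printf("divisor is zero; skipping division\n");
}

BEGIN
{
    exit(0);
}
```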
In addition to these binary operators, the + and - operators can also be used as unary operators, and these operators have higher precedence than any of the binary arithmetic operators. The order of precedence and associativity properties for all of the D operators is presented in Table 2-12. You can control precedence by grouping expressions in parentheses (()).
Relational Operators
D provides the binary relational operators that are described in the following table for use in your programs. These operators all have the same meaning that they do in ANSI C.
Table 2-8 D Relational Operators
Operator | Description |
---|---|
< | Left-hand operand is less than right-hand operand |
<= | Left-hand operand is less than or equal to right-hand operand |
> | Left-hand operand is greater than right-hand operand |
>= | Left-hand operand is greater than or equal to right-hand operand |
== | Left-hand operand is equal to right-hand operand |
!= | Left-hand operand is not equal to right-hand operand |
Relational operators are most frequently used to write D predicates. Each operator evaluates to a value of type int, which is equal to one if the condition is true, or zero if it is false.
Relational operators can be applied to pairs of integers, pointers, or strings. If pointers are compared, the result is equivalent to an integer comparison of the two pointers interpreted as unsigned integers. If strings are compared, the result is determined as if by performing a strcmp() on the two operands. The following table shows some example D string comparisons and their results.
D string comparison | Result |
---|---|
"coffee" < "espresso" | Returns 1 (true) |
"coffee" == "coffee" | Returns 1 (true) |
"coffee" >= "mocha" | Returns 0 (false) |
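String comparison is most often seen in predicates. The following sketch enables a probe only for processes whose name is exactly bash. The process name is an arbitrary choice for illustration.

```d
syscall::read:entry
/execname == "bash"/
{
    /* Reached only when the string comparison evaluates to 1 (true). */
    printf("%s (pid %d) called read()\n", execname, pid);
}
```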
Relational operators can also be used to compare a data object associated with an enumeration type with any of the enumerator tags defined by the enumeration. Enumerations are a facility for creating named integer constants and are described in more detail in Type and Constant Definitions.
Logical Operators
D provides the binary logical operators that are listed in the following table for use in your programs. The first two operators are equivalent to the corresponding ANSI C operators.
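Table 2-9 D Logical Operators
Operator | Description |
---|---|
&& | Logical AND: true if both operands are true |
\|\| | Logical OR: true if one or both operands are true |
^^ | Logical XOR: true if exactly one operand is true |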
Logical operators are most frequently used in writing D predicates. The logical AND operator performs the following short-circuit evaluation: if the left-hand operand is false, the right-hand expression is not evaluated. The logical OR operator also performs the following short-circuit evaluation: if the left-hand operand is true, the right-hand expression is not evaluated. The logical XOR operator does not short-circuit. Both expression operands are always evaluated.
In addition to the binary logical operators, the unary ! operator can be used to perform a logical negation of a single operand: it converts a zero operand into a one and a non-zero operand into a zero. By convention, D programmers use ! when working with integers that are meant to represent boolean values and == 0 when working with non-boolean integers, although the expressions are equivalent.
The logical operators may be applied to operands of integer or pointer types. The logical operators interpret pointer operands as unsigned integer values. As with all logical and relational operators in D, operands are true if they have a non-zero integer value and false if they have a zero integer value.
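The following sketch evaluates each logical operator on one true and one false operand. The variable names a and b are hypothetical. The expected output is 0 1 1 0.

```d
BEGIN
{
    a = 1;    /* non-zero: true */
    b = 0;    /* zero: false */

    /* AND, OR, XOR, and logical negation, in that order. */
    printf("%d %d %d %d\n", a && b, a || b, a ^^ b, !a);
    exit(0);
}
```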
Bitwise Operators
D provides the binary operators that are listed in the following table for manipulating individual bits inside of integer operands. These operators all have the same meaning as in ANSI C.
Table 2-10 D Bitwise Operators
Operator | Description |
---|---|
& | Bitwise AND |
\| | Bitwise OR |
^ | Bitwise XOR |
<< | Shift the left-hand operand left by the number of bits specified by the right-hand operand |
>> | Shift the left-hand operand right by the number of bits specified by the right-hand operand |
The binary & operator is used to clear bits from an integer operand. The binary | operator is used to set bits in an integer operand. The binary ^ operator returns one in each bit position where exactly one of the corresponding operand bits is set.
The shift operators are used to move bits left or right in a given integer operand. Shifting left fills empty bit positions on the right-hand side of the result with zeroes. Shifting right using an unsigned integer operand fills empty bit positions on the left-hand side of the result with zeroes. Shifting right using a signed integer operand fills empty bit positions on the left-hand side with the value of the sign bit, also known as an arithmetic shift operation.
Shifting an integer value by a negative number of bits or by a number of bits larger than the number of bits in the left-hand operand itself produces an undefined result. The D compiler produces an error message if the compiler can detect this condition when you compile your D program.
In addition to the binary bitwise operators, the unary ~ operator may be used to perform a bitwise negation of a single operand: it converts each zero bit in the operand into a one bit, and each one bit in the operand into a zero bit.
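As a brief sketch of the binary bitwise operators, the following script applies each of them to the hypothetical value 0x0f and prints the results in hexadecimal. The expected output is 3 3f f0 f0 3.

```d
BEGIN
{
    flags = 0x0f;

    /* AND keeps bits, OR sets bits, XOR toggles bits, then two shifts. */
    printf("%x %x %x %x %x\n",
        flags & 0x3, flags | 0x30, flags ^ 0xff, flags << 4, flags >> 2);
    exit(0);
}
```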
Assignment Operators
D provides the binary assignment operators that are listed in the following table for modifying D variables. You can only modify D variables and arrays. Kernel data objects and constants may not be modified using the D assignment operators. The assignment operators have the same meaning as they do in ANSI C.
Table 2-11 D Assignment Operators
Operator | Description |
---|---|
= | Set the left-hand operand equal to the right-hand expression value. |
+= | Increment the left-hand operand by the right-hand expression value. |
-= | Decrement the left-hand operand by the right-hand expression value. |
*= | Multiply the left-hand operand by the right-hand expression value. |
/= | Divide the left-hand operand by the right-hand expression value. |
%= | Modulo the left-hand operand by the right-hand expression value. |
\|= | Bitwise OR the left-hand operand with the right-hand expression value. |
&= | Bitwise AND the left-hand operand with the right-hand expression value. |
^= | Bitwise XOR the left-hand operand with the right-hand expression value. |
<<= | Shift the left-hand operand left by the number of bits specified by the right-hand expression value. |
>>= | Shift the left-hand operand right by the number of bits specified by the right-hand expression value. |
Aside from the assignment operator =, the other assignment operators are provided as shorthand for using the = operator with one of the other operators that were described earlier. For example, the expression x = x + 1 is equivalent to the expression x += 1, except that the expression x is evaluated one time. These assignment operators adhere to the same rules for operand types as the binary forms described earlier.
The result of any assignment operator is an expression equal to the new value of the left-hand expression. You can use the assignment operators or any of the operators described thus far in combination to form expressions of arbitrary complexity. You can use parentheses () to group terms in complex expressions.
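The following sketch chains several compound assignments on the hypothetical variable x and then uses the result of an assignment as a value. The comments track the value of x after each statement. The expected output is 17 17.

```d
BEGIN
{
    x = 10;
    x += 5;          /* x is 15 */
    x *= 2;          /* x is 30 */
    x %= 7;          /* x is 2 */
    x <<= 3;         /* x is 16 */
    y = (x |= 1);    /* x is 17, and the assignment itself yields 17 */

    printf("%d %d\n", x, y);
    exit(0);
}
```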
Increment and Decrement Operators
D provides the special unary ++ and -- operators for incrementing and decrementing pointers and integers. These operators have the same meaning as they do in ANSI C. These operators can only be applied to variables and they may be applied either before or after the variable name. If the operator appears before the variable name, the variable is first modified and then the resulting expression is equal to the new value of the variable. For example, the following two code fragments produce identical results:
x += 1; y = x;
y = ++x;
If the operator appears after the variable name, the variable is modified after its current value is returned for use in the expression. For example, the following two code fragments produce identical results:
y = x; x -= 1;
y = x--;
You can use the increment and decrement operators to create new variables without declaring them. If a variable declaration is omitted and the increment or decrement operator is applied to a variable, the variable is implicitly declared to be of type int64_t.
The increment and decrement operators can be applied to integer or pointer variables. When applied to integer variables, the operators increment or decrement the corresponding value by one. When applied to pointer variables, the operators increment or decrement the pointer address by the size of the data type that is referenced by the pointer. Pointers and pointer arithmetic in D are discussed in Pointers and Scalar Arrays.
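The difference between the prefix and postfix forms can be seen in the following sketch, which uses the hypothetical variables x, y, and z. The expected output is 5 6 6.

```d
BEGIN
{
    x = 5;
    y = ++x;    /* x is incremented to 6 first, so y is 6 */
    z = x--;    /* z receives the current value 6, then x becomes 5 */

    printf("%d %d %d\n", x, y, z);
    exit(0);
}
```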
Conditional Expressions
Although D does not provide support for if-then-else constructs, it does provide support for simple conditional expressions by using the ? and : operators. These operators enable a triplet of expressions to be associated, where the first expression is used to conditionally evaluate one of the other two.
For example, the following D statement could be used to set a variable x to one of two strings, depending on the value of i:
x = i == 0 ? "zero" : "non-zero";
In the previous example, the expression i == 0 is first evaluated to determine whether it is true or false. If the expression is true, the second expression is evaluated and its value is returned. If the expression is false, the third expression is evaluated and its value is returned.
As with any D operator, you can use multiple ?: operators in a single expression to create more complex expressions. For example, the following expression would take a char variable c containing one of the characters 0-9, a-f, or A-F, and return the value of this character when interpreted as a digit in a hexadecimal (base 16) integer:
hexval = (c >= '0' && c <= '9') ? c - '0' :
    (c >= 'a' && c <= 'f') ? c + 10 - 'a' : c + 10 - 'A';
To be evaluated for its truth value, the first expression that is used with ?: must be a pointer or integer. The second and third expressions can be of any compatible types. You may not construct a conditional expression where, for example, one path returns a string and another path returns an integer. The second and third expressions also may not invoke a tracing function such as trace or printf. If you want to conditionally trace data, use a predicate instead. See Predicate Examples for more information.
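The hexadecimal digit expression can be tried in a complete script such as the following sketch. For simplicity, c is created here by assignment, which gives it the int type of the character constant rather than char. With c set to 'b', the expected output is 11.

```d
BEGIN
{
    c = 'b';

    /* Nested conditional expressions select one of three conversions. */
    hexval = (c >= '0' && c <= '9') ? c - '0' :
        (c >= 'a' && c <= 'f') ? c + 10 - 'a' : c + 10 - 'A';

    printf("%d\n", hexval);
    exit(0);
}
```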
Type Conversions
When expressions are constructed by using operands of different but compatible types, type conversions are performed to determine the type of the resulting expression. The D rules for type conversions are the same as the arithmetic conversion rules for integers in ANSI C. These rules are sometimes referred to as the usual arithmetic conversions.
A simple way to describe the conversion rules is as follows: each integer type is ranked in the order char, short, int, long, long long, with each unsigned type assigned a rank higher than its signed equivalent, but below the next integer type. When you construct an expression using two integer operands such as x + y and the operands are of different integer types, the operand type with the highest rank is used as the result type.
If a conversion is required, the operand with the lower rank is first promoted to the type of the higher rank. Promotion does not actually change the value of the operand: it simply extends the value to a larger container according to its sign. If an unsigned operand is promoted, the unused high-order bits of the resulting integer are filled with zeroes. If a signed operand is promoted, the unused high-order bits are filled by performing sign extension. If a signed type is converted to an unsigned type, the signed type is first sign-extended and then assigned the new, unsigned type that is determined by the conversion.
Integers and other types can also be explicitly cast from one type to another. In D, pointers and integers can be cast to any integer or pointer types, but not to other types. Rules for casting and promoting strings and character arrays are discussed in DTrace Support for Strings.
An integer or pointer cast is formed using an expression such as the following:
y = (int)x;
In this example, the destination type is enclosed in parentheses and used to prefix the source expression. Integers are cast to types of higher rank by performing promotion. Integers are cast to types of lower rank by zeroing the excess high-order bits of the integer.
Because D does not permit floating-point arithmetic, no floating-point operand conversion or casting is permitted and no rules for implicit floating-point conversion are defined.
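The following sketch exercises the casting rules with hypothetical variable names. A signed value is cast to a type of higher rank, which sign-extends it, and a constant is cast to a type of lower rank, which discards the excess high-order bits. No output is shown here; run the script to observe the values on your system.

```d
BEGIN
{
    small = (short)(-2);               /* 2-byte signed value */
    wide = (long long)small;           /* higher rank: promoted with sign extension */
    narrow = (unsigned char)0x1234;    /* lower rank: high-order bits are zeroed */

    printf("%d %d %x\n", small, wide, narrow);
    exit(0);
}
```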
Operator Precedence
Table 2-12 lists the D rules for operator precedence and associativity. These rules are somewhat complex, but they are necessary to provide precise compatibility with the ANSI C operator precedence rules. The entries in the table are listed in order from highest precedence to lowest precedence.
Table 2-12 D Operator Precedence and Associativity
Operators | Associativity |
---|---|
() [] -> . | Left to right |
! ~ ++ -- + - * & (type) sizeof stringof offsetof xlate | Right to left |
* / % | Left to right |
+ - | Left to right |
<< >> | Left to right |
< <= > >= | Left to right |
== != | Left to right |
& | Left to right |
^ | Left to right |
\| | Left to right |
&& | Left to right |
^^ | Left to right |
\|\| | Left to right |
?: | Right to left |
= += -= *= /= %= &= ^= \|= <<= >>= | Right to left |
, | Left to right |
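Two consequences of these rules, which match ANSI C, are sketched in the following script with hypothetical variable names: multiplication binds more tightly than addition, and the equality operator binds more tightly than bitwise AND. The expected output is 7 9 on the first line and 0 1 on the second.

```d
BEGIN
{
    a = 1;
    b = 2;
    c = 3;
    flags = 6;

    /* b * c is evaluated before the addition unless parentheses are used. */
    printf("%d %d\n", a + b * c, (a + b) * c);

    /* Without parentheses, this is flags & (4 == 4), which is 6 & 1. */
    printf("%d %d\n", flags & 4 == 4, (flags & 4) == 4);

    exit(0);
}
```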
Several operators listed in the previous table have not been discussed yet. These operators are described in subsequent chapters. The following table lists several miscellaneous operators that are provided by the D language.
Operators | Description | For More Information |
---|---|---|
sizeof | Computes the size of an object. | |
offsetof | Computes the offset of a type member. | |
stringof | Converts the operand to a string. | |
xlate | Translates a data type. | |
unary & | Computes the address of an object. | |
unary * | Dereferences a pointer to an object. | |
-> and . | Accesses a member of a structure or union type. | |
The comma (,) operator that is listed in the table is for compatibility with the ANSI C comma operator. It can be used to evaluate a set of expressions in left-to-right order and return the value of the rightmost expression. This operator is provided strictly for compatibility with C and should generally not be used.
The () entry listed in the table of operator precedence represents a function call. For examples of calls to functions, such as printf and trace, see Output Formatting. A comma is also used in D to list arguments to functions and to form lists of associative array keys. Note that this comma is not the same as the comma operator and does not guarantee left-to-right evaluation. The D compiler provides no guarantee regarding the order of evaluation of arguments to a function or keys to an associative array. Avoid using expressions with interacting side effects, such as the pair of expressions i and i++, in these contexts.
The [] entry listed in the table of operator precedence represents an array or associative array reference. Examples of associative arrays are presented in Associative Arrays. A special kind of associative array, called an aggregation, is described in Aggregations. The [] operator can also be used to index into fixed-size C arrays. See Pointers and Scalar Arrays.
Variables
D provides two basic types of variables for use in your tracing programs: scalar variables and associative arrays. An aggregation is a special kind of array variable. See Aggregations for more information about aggregations.
To understand the scope of variables, consider the following figure.
[Figure: Scope of Variables]
In the figure, system execution is illustrated, showing elapsed time along the horizontal axis and thread number along the vertical axis. D probes fire at different times on different threads, and each time a probe fires, the D script is run. Any D variable would have one of the scopes that are described in the following table.
Scope | Syntax | Initial Value | Thread-safe? | Description |
---|---|---|---|---|
Global | myname | 0 | No | Any probe that fires on any thread accesses the same instance of the variable. |
Thread-local | self->myname | 0 | Yes | Any probe that fires on a thread accesses the thread-specific instance of the variable. |
Clause-local | this->myname | Not defined | Yes | Any probe that fires accesses an instance of the variable specific to that particular firing of the probe. |
Note:
Note the following additional information:
- Scalar variables and associative arrays have a global scope and are not multi-processor safe (MP-safe). Because the value of such variables can be changed by more than one processor, there is a chance that a variable can become corrupted if more than one probe modifies it.
- Aggregations are MP-safe even though they have a global scope because independent copies are updated locally before a final aggregation produces the global result.
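The three scopes can be combined in one script, as in the following sketch. The variable names total, start, and elapsed are hypothetical. D marks thread-local variables with the self-> prefix and clause-local variables with the this-> prefix. The script sums the time spent in read() system calls across all threads.

```d
uint64_t total;    /* global: one instance shared by every probe firing */

syscall::read:entry
{
    /* Thread-local: each thread records its own entry time. */
    self->start = timestamp;
}

syscall::read:return
/self->start != 0/
{
    /* Clause-local: exists only for this firing of the probe. */
    this->elapsed = timestamp - self->start;
    total += this->elapsed;
    self->start = 0;
}

END
{
    printf("total time in read(): %d nsecs\n", total);
}
```

Because total is a global variable, concurrent updates from several processors are not MP-safe, so the sum might be inexact on a busy system. An aggregation is the MP-safe alternative for this kind of accumulation.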
Scalar Variables
Scalar variables are used to represent individual, fixed-size data objects, such as integers and pointers. Scalar variables can also be used for fixed-size objects that are composed of one or more primitive or composite types. D provides the ability to create arrays of objects, as well as composite structures. DTrace also represents strings as fixed-size scalars by permitting them to grow to a predefined maximum length. Control over string length in your D program is discussed further in DTrace Support for Strings.
Scalar variables are created automatically the first time you assign a value to a previously undefined identifier in your D program. For example, to create a scalar variable named x of type int, you can simply assign it a value of type int in any probe clause, for example:
BEGIN
{
    x = 123;
}
Scalar variables that are created in this manner are global variables: each one is defined once and is visible in every clause of your D program. Any time that you reference the x identifier, you are referring to a single storage location associated with this variable.
Unlike ANSI C, D does not require explicit variable declarations. If you do want to declare a global variable and assign its name and type explicitly before using it, you can place a declaration outside of the probe clauses in your program, as shown in the following example:
    int x; /* declare an integer x for later use */

    BEGIN
    {
      x = 123;
      ...
    }
Explicit variable declarations are not necessary in most D programs, but sometimes are useful when you want to carefully control your variable types or when you want to begin your program with a set of declarations and comments documenting your program's variables and their meanings.
Unlike ANSI C declarations, D variable declarations may not assign initial values. You must use a BEGIN probe clause to assign any initial values. All global variable storage is filled with zeroes by DTrace before you first reference the variable.
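The following sketch illustrates these rules. The variable names ticks and limit are hypothetical: both are declared outside the clauses, limit receives its initial value in a BEGIN clause, and ticks relies on the zero-filled storage that DTrace provides.

```d
int ticks;      /* hypothetical name; storage starts zero-filled */
int limit;      /* hypothetical name; no initializer is permitted here */

BEGIN
{
        limit = 5;      /* initial values must be assigned in a clause */
}

tick-1sec
/ticks < limit/
{
        ticks++;
        printf("tick %d of %d\n", ticks, limit);
}

tick-1sec
/ticks >= limit/
{
        exit(0);
}
```

Writing int limit = 5; in the declaration instead would be rejected, because D declarations cannot carry initial values.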
The D language definition places no limit on the size and number of D variables. Limits are defined by the DTrace implementation and by the memory that is available on your system. The D compiler enforces any of the limitations that can be applied at the time you compile your program. See Options and Tunables for more about how to tune options related to program limits.
Associative Arrays
Associative arrays are used to represent collections of data elements that can be retrieved by specifying a name, which is called a key. D associative array keys are formed by a list of scalar expression values, called a tuple. You can think of the array tuple as an imaginary parameter list to a function that is called to retrieve the corresponding array value when you reference the array. Each D associative array has a fixed key signature consisting of a fixed number of tuple elements, where each element has a given, fixed type. You can define different key signatures for each array in your D program.
Associative arrays differ from normal, fixed-size arrays in that they have no predefined limit on the number of elements: the elements can be indexed by any tuple, as opposed to just using integers as keys, and the elements are not stored in preallocated, consecutive storage locations. Associative arrays are useful in situations where you would use a hash table or other simple dictionary data structure in a C, C++, or Java language program. Associative arrays provide the ability to create a dynamic history of events and state captured in your D program, which you can use to create more complex control flows.
To define an associative array, you write an assignment expression of the following form:
name [ key ] = expression ;
where name is any valid D identifier and key is a comma-separated list of one or more expressions.
For example, the following statement defines an associative array a with key signature [ int, string ] and stores the integer value 456 in a location named by the tuple [123, "hello"]:
    a[123, "hello"] = 456;
The type of each object that is contained in the array is also fixed for all elements in a given array. Because it was first assigned by using the integer 456, every subsequent value that is stored in the array will also be of type int. You can use any of the assignment operators that are defined in Types, Operators, and Expressions to modify associative array elements, subject to the operand rules defined for each operator. The D compiler produces an appropriate error message if you attempt an incompatible assignment. You can use any type with an associative array key or value that can be used with a scalar variable.
You can reference an associative array by using any tuple that is compatible with the array key signature. The rules for tuple compatibility are similar to those for function calls and variable assignments. That is, the tuple must be of the same length, and each type in the list of actual parameters must be compatible with the corresponding type in the formal key signature. For example, consider an associative array x that is defined as follows:
    x[123ull] = 0;
The key signature is of type unsigned long long and the values are of type int. This array can also be referenced by using the expression x['a'] because the tuple consisting of the character constant 'a', of type int and length one, is compatible with the key signature unsigned long long, according to the arithmetic conversion rules. These rules are described in Type Conversions.
If you need to explicitly declare a D associative array before using it, you can create a declaration of the array name and key signature outside of the probe clauses in your program source code, for example:
    int x[unsigned long long, char];

    BEGIN
    {
      x[123ull, 'a'] = 456;
    }
Storage is allocated only for array elements with a nonzero value.
Note:
When an associative array is defined, references to any tuple of a compatible key signature are permitted, even if the tuple in question has not been previously assigned. Accessing an unassigned associative array element is defined to return a zero-filled object. A consequence of this definition is that underlying storage is not allocated for an associative array element until a non-zero value is assigned to that element. Conversely, assigning an associative array element to zero causes DTrace to deallocate the underlying storage.
This behavior is important because the dynamic variable space out of which associative array elements are allocated is finite; if it is exhausted when an allocation is attempted, the allocation fails and an error message indicating a dynamic variable drop is generated. Always assign zero to associative array elements that are no longer in use. See Options and Tunables for information about techniques that you can use to eliminate dynamic variable drops.
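As a minimal sketch of this allocate-and-release pattern, the following script stores a timestamp in an associative array and assigns zero to the element as soon as it is no longer needed. The array name start and the choice of the [pid, tid] key tuple are assumptions made for illustration only.

```d
syscall::read:entry
{
        /* Assigning a non-zero value allocates storage for this element. */
        start[pid, tid] = timestamp;
}

syscall::read:return
/start[pid, tid] != 0/
{
        printf("%s (%d/%d): read() took %d ns\n",
            execname, pid, tid, timestamp - start[pid, tid]);

        /* Assigning zero lets DTrace deallocate the element's storage. */
        start[pid, tid] = 0;
}
```

Without the final assignment, every thread that ever called read() would keep an element allocated, which makes dynamic variable drops more likely on a busy system.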
Thread-Local Variables
DTrace provides the ability to declare variable storage that is local to each operating system thread, as opposed to the global variables demonstrated earlier in this chapter. Thread-local variables are useful in situations where you want to enable a probe and mark every thread that fires the probe with some tag or other data. Creating a program to solve this problem is easy in D because thread-local variables share a common name in your D code, but refer to separate data storage that is associated with each thread.
Thread-local variables are referenced by applying the -> operator to the special identifier self, for example:
    syscall::read:entry
    {
      self->read = 1;
    }
This D fragment example enables the probe on the read() system call and associates a thread-local variable named read with each thread that fires the probe. Similar to global variables, thread-local variables are created automatically on their first assignment and assume the type that is used on the right-hand side of the first assignment statement, which is int in this example.
Each time the self->read variable is referenced in your D program, the data object that is referenced is the one associated with the operating system thread that was executing when the corresponding DTrace probe fired. You can think of a thread-local variable as an associative array that is implicitly indexed by a tuple that describes the thread's identity in the system. A thread's identity is unique over the lifetime of the system: if the thread exits and the same operating system data structure is used to create a new thread, this thread does not reuse the same DTrace thread-local storage identity.
When you have defined a thread-local variable, you can reference it for any thread in the system, even if the variable in question has not been previously assigned for that particular thread. If a thread's copy of the thread-local variable has not yet been assigned, the data storage for the copy is defined to be filled with zeroes. As with associative array elements, underlying storage is not allocated for a thread-local variable until a non-zero value is assigned to it. Also, as with associative array elements, assigning zero to a thread-local variable causes DTrace to deallocate the underlying storage. Always assign zero to thread-local variables that are no longer in use. For other techniques to fine-tune the dynamic variable space from which thread-local variables are allocated, see Options and Tunables.
Thread-local variables of any type can be defined in your D program, including associative arrays. The following are some example thread-local variable definitions:
    self->x = 123;              /* integer value */
    self->s = "hello";          /* string value */
    self->a[123, 'a'] = 456;    /* associative array */
Like any D variable, you do not need to explicitly declare thread-local variables prior to using them. If you want to create a declaration anyway, you can place one outside of your program clauses by prepending the keyword self, for example:
    self int x; /* declare int x as a thread-local variable */

    syscall::read:entry
    {
      self->x = 123;
    }
Thread-local variables are kept in a separate namespace from global variables so that you can reuse names. Remember that x and self->x are not the same variable if you overload names in your program.
The following example shows how to use thread-local variables. In an editor, type the following program and save it in a file named rtime.d:
    syscall::read:entry
    {
      self->t = timestamp;
    }

    syscall::read:return
    /self->t != 0/
    {
      printf("%d/%d spent %d nsecs in read()\n", pid, tid, timestamp - self->t);
      /*
       * We are done with this thread-local variable; assign zero to it
       * to allow the DTrace runtime to reclaim the underlying storage.
       */
      self->t = 0;
    }
Next, in your shell, start the program running. Wait a few seconds and you should begin to see some output. If no output appears, try running a few commands:
    # dtrace -q -s rtime.d
    3987/3987 spent 12786263 nsecs in read()
    2183/2183 spent 13410 nsecs in read()
    2183/2183 spent 12850 nsecs in read()
    2183/2183 spent 10057 nsecs in read()
    3583/3583 spent 14527 nsecs in read()
    3583/3583 spent 12571 nsecs in read()
    3583/3583 spent 9778 nsecs in read()
    3583/3583 spent 9498 nsecs in read()
    3583/3583 spent 9778 nsecs in read()
    2183/2183 spent 13968 nsecs in read()
    2183/2183 spent 72076 nsecs in read()
    ...
    ^C
    #
The rtime.d program uses a thread-local variable named t to capture a timestamp on entry to read() by any thread. Then, in the return clause, the program prints the amount of time spent in read() by subtracting self->t from the current timestamp. The built-in D variables pid and tid report the process ID and thread ID of the thread that is performing the read(). Because self->t is no longer needed after this information is reported, it is then assigned 0 to enable DTrace to reuse the underlying storage that is associated with t for the current thread.
Typically, you see many lines of output without doing anything because server processes and daemons are executing read() all the time behind the scenes. Try changing the second clause of rtime.d to use the execname variable to print out the name of the process performing a read(), for example:
    printf("%s/%d spent %d nsecs in read()\n", execname, tid, timestamp - self->t);
If you find a process that is of particular interest, add a predicate to learn more about its read() behavior, as shown in the following example:
    syscall::read:entry
    /execname == "Xorg"/
    {
      self->t = timestamp;
    }
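Putting the two suggested changes together gives a variant of rtime.d along the following lines. This is only a sketch: the process name "Xorg" is the example value used above and might not exist on your system, so substitute any name that execname reports.

```d
syscall::read:entry
/execname == "Xorg"/
{
        self->t = timestamp;
}

syscall::read:return
/self->t != 0/
{
        printf("%s/%d spent %d nsecs in read()\n",
            execname, tid, timestamp - self->t);

        /* Release the thread-local storage once the value is reported. */
        self->t = 0;
}
```

Only the entry clause needs the execname predicate. The return clause fires for every thread, but self->t is non-zero only for threads that passed the entry predicate.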
Clause-Local Variables
The value of a D variable can be accessed whenever a probe fires. As described in Variables, variables can have different scopes. For a global variable, the same instance of the variable is accessed from every thread. For a thread-local variable, the instance of the variable is thread-specific.
For a clause-local variable, the instance of the variable is specific to that particular firing of the probe. Clause-local is the narrowest scope. When a probe fires on a CPU, the D script is executed in program order. Each clause-local variable is instantiated with an undefined value the first time it is used in the script. The same instance of the variable is used in all clauses until the D script has completed execution for that particular firing of the probe.
Clause-local variables can be referenced and assigned by prefixing with this->:
    BEGIN
    {
      this->secs = timestamp / 1000000000;
      ...
    }
If you want to declare a clause-local variable explicitly before using it, you can do so by using the this keyword:
    this int x;  /* an integer clause-local variable */
    this char c; /* a character clause-local variable */

    BEGIN
    {
      this->x = 123;
      this->c = 'D';
    }
Note that if your program contains multiple clauses for a single probe, any clause-local variables remain intact as the clauses are executed, as shown in the following example. Type the following source code and save it in a file named clause.d:
    int me;       /* an integer global variable */
    this int foo; /* an integer clause-local variable */

    tick-1sec
    {
      /*
       * Set foo to be 10 if and only if this is the first clause executed.
       */
      this->foo = (me % 3 == 0) ? 10 : this->foo;
      printf("Clause 1 is number %d; foo is %d\n", me++ % 3, this->foo++);
    }

    tick-1sec
    {
      /*
       * Set foo to be 20 if and only if this is the first clause executed.
       */
      this->foo = (me % 3 == 0) ? 20 : this->foo;
      printf("Clause 2 is number %d; foo is %d\n", me++ % 3, this->foo++);
    }

    tick-1sec
    {
      /*
       * Set foo to be 30 if and only if this is the first clause executed.
       */
      this->foo = (me % 3 == 0) ? 30 : this->foo;
      printf("Clause 3 is number %d; foo is %d\n", me++ % 3, this->foo++);
    }
Because the clauses are always executed in program order, and because clause-local variables are persistent across different clauses that are enabling the same probe, running the preceding program always produces the same output:
    # dtrace -q -s clause.d
    Clause 1 is number 0; foo is 10
    Clause 2 is number 1; foo is 11
    Clause 3 is number 2; foo is 12
    Clause 1 is number 0; foo is 10
    Clause 2 is number 1; foo is 11
    Clause 3 is number 2; foo is 12
    Clause 1 is number 0; foo is 10
    Clause 2 is number 1; foo is 11
    Clause 3 is number 2; foo is 12
    Clause 1 is number 0; foo is 10
    Clause 2 is number 1; foo is 11
    Clause 3 is number 2; foo is 12
    ^C
While clause-local variables are persistent across clauses that are enabling the same probe, their values are undefined in the first clause executed for a given probe. Be sure to assign each clause-local variable an appropriate value before using it or your program might have unexpected results.
Clause-local variables can be defined using any scalar variable type, but associative arrays may not be defined using clause-local scope. The scope of clause-local variables only applies to the corresponding variable data, not to the name and type identity defined for the variable. When a clause-local variable is defined, this name and type signature can be used in any subsequent D program clause.
You can use clause-local variables to accumulate intermediate results of calculations or as temporary copies of other variables. Access to a clause-local variable is much faster than access to an associative array. Therefore, if you need to reference an associative array value multiple times in the same D program clause, it is more efficient to copy it into a clause-local variable first and then reference the local variable repeatedly.
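The following sketch shows the copy-then-reuse pattern described above. The associative array name requested and the clause-local name total are hypothetical. The script assumes that arg2 of the read() entry probe carries the requested byte count, which is the third argument of the system call.

```d
syscall::read:entry
{
        /* Update the associative array element, then look it up once. */
        requested[pid] += arg2;
        this->total = requested[pid];

        /* Every later use reads the cheaper clause-local copy. */
        printf("%s (pid %d): %d bytes requested, about %d KB\n",
            execname, pid, this->total, this->total / 1024);
}
```

Because this->total is assigned before it is read, the clause never depends on the undefined initial value of a clause-local variable. A longer-running script would also assign zero to elements of requested that are no longer needed.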
Built-In Variables
The following table provides a complete list of built-in D variables. All of these variables are scalar global variables.
Table 2-13 DTrace Built-In Variables
Variable | Description |
---|---|
args[] | The typed arguments, if any, to the current probe. The … |
arg0, ..., arg9 | The first ten input arguments to a probe, represented as raw 64-bit integers. Values are meaningful only for arguments defined for the current probe. |
caller | The program counter location of the current kernel thread at the time the probe fired. |
chip | The CPU chip identifier for the current physical chip. |
cpu | The CPU identifier for the current CPU. See sched Provider for more information. |
curcpu | The CPU information for the current CPU. See sched Provider. |
curlwpsinfo | The process state of the current thread. See proc Provider. |
curpsinfo | The process state of the process associated with the current thread. See proc Provider. |
curthread | Is a … |
cwd | The name of the current working directory of the process associated with the current thread. |
epid | The enabled probe ID (EPID) for the current probe. This integer uniquely identifies a particular probe that is enabled with a specific predicate and set of actions. |
errno | The error value returned by the last system call executed by this thread. |
execname | The name that was passed to … |
fds[] | The files that the current process has opened in an … Note: You must load the … |
gid | The real group ID of the current process. |
id | The probe ID for the current probe. This ID is the system-wide unique identifier for the probe, as published by DTrace and listed in the output of dtrace -l. |
ipl | The interrupt priority level (IPL) on the current CPU at probe firing time. Note: This value is non-zero if interrupts are firing and zero otherwise. The non-zero value depends on whether preemption is active, as well as other factors, and can vary between kernel releases and kernel configurations. |
lgrp | The latency group ID for the latency group of which the current CPU is a member. This value is always zero. |
pid | The process ID of the current process. |
ppid | The parent process ID of the current process. |
probefunc | The function name portion of the current probe's description. |
probemod | The module name portion of the current probe's description. |
probename | The name portion of the current probe's description. |
probeprov | The provider name portion of the current probe's description. |
pset | The processor set ID for the processor set containing the current CPU. This value is always zero. |
root | The name of the … |
stackdepth | The current thread's stack frame depth at probe firing time. |
tid | The task ID of the current thread. |
timestamp | The current value of a nanosecond timestamp counter. This counter increments from an arbitrary point in the past and should only be used for relative computations. |
ucaller | The program counter location of the current user thread at the time the probe fired. |
uid | The real user ID of the current process. |
uregs[] | The current thread's saved user-mode register values at probe firing time. Use of the … |
vtimestamp | The current value of a nanosecond timestamp counter that is virtualized to the amount of time that the current thread has been running on a CPU, minus the time spent in DTrace predicates and actions. This counter increments from an arbitrary point in the past and should only be used for relative time computations. |
walltimestamp | The current number of nanoseconds since 00:00 Universal Coordinated Time, January 1, 1970. |
Functions that are built into the D language, such as trace, are discussed in Actions and Subroutines.
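As an illustration, the following sketch prints several built-in variables each time a thread enters read(). The choice of probe is arbitrary. The variables used are probeprov, probemod, probefunc, probename, execname, pid, ppid, uid, and cpu.

```d
syscall::read:entry
{
        /* The four probe* strings together spell out the probe description. */
        printf("%s:%s:%s:%s fired by %s (pid %d, ppid %d, uid %d) on CPU %d\n",
            probeprov, probemod, probefunc, probename,
            execname, pid, ppid, uid, cpu);
}
```

Because built-in variables are read-only views of the current probe firing, the clause needs no declarations or assignments before using them.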
External Variables
The D language uses the back quote character (`) as a special scoping operator for accessing variables that are defined in the operating system and not in your D program. For more information, see External Symbols and Types.
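For instance, the following sketch reads the kernel's max_pfn variable, which this guide also uses in its pointer examples. It assumes that the running kernel exposes a symbol with that name.

```d
BEGIN
{
        /* The back quote tells D to look for max_pfn in the kernel. */
        printf("max_pfn = %d\n", `max_pfn);
        exit(0);
}
```

Without the back quote, D would treat max_pfn as a new, zero-filled D global variable rather than as the kernel symbol.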
Pointers and Scalar Arrays
Pointers are memory addresses of data objects in the operating system kernel or in the address space of a user process. D provides the ability to create and manipulate pointers and store them in variables and associative arrays. This section describes the D syntax for pointers, operators that can be applied to create or access pointers, and the relationship between pointers and fixed-size scalar arrays. Also discussed are issues relating to the use of pointers in different address spaces.
Note:
If you are an experienced C or C++ programmer, you can skim most of this section as the D pointer syntax is the same as the corresponding ANSI C syntax. However, you should read Pointers and Addresses and Pointers to DTrace Objects, as these sections describe features and issues that are specific to DTrace.
Pointers and Addresses
The Linux operating system uses a technique called virtual memory to provide each user process with its own virtual view of the memory resources on your system. A virtual view of memory resources is referred to as an address space. An address space associates a range of address values, either [0 ... 0xffffffff] for a 32-bit address space or [0 ... 0xffffffffffffffff] for a 64-bit address space, with a set of translations that the operating system and hardware use to convert each virtual address to a corresponding physical memory location. Pointers in D are data objects that store an integer virtual address value and associate it with a D type that describes the format of the data stored at the corresponding memory location.
You can explicitly declare a D variable to be of pointer type by first specifying the type of the referenced data and then appending an asterisk (*) to the type name. Doing so indicates you want to declare a pointer type, as shown in the following statement:
    int *p;
This statement declares a D global variable named p that is a pointer to an integer. The declaration means that p is a 64-bit integer with a value that is the address of another integer located somewhere in memory. Because the compiled form of your D code is executed at probe firing time inside the operating system kernel itself, D pointers are typically pointers associated with the kernel's address space. You can use the arch command to determine the number of bits that are used for pointers by the active operating system kernel.
If you want to create a pointer to a data object inside of the kernel, you can compute its address by using the & operator. For example, the operating system kernel source code declares an unsigned long max_pfn variable. You could trace the address of this variable by tracing the result of applying the & operator to the name of that object in D:
    trace(&`max_pfn);
The * operator can be used to refer to the object addressed by the pointer, and acts as the inverse of the & operator. For example, the following two D code fragments are equivalent in meaning:
    q = &`max_pfn;
    trace(*q);

    trace(`max_pfn);
In this example, the first fragment creates a D global variable pointer q. Because the max_pfn object is of type unsigned long, the type of &`max_pfn is unsigned long * (that is, pointer to unsigned long), implicitly setting the type of q. Tracing the value of *q follows the pointer back to the data object max_pfn. This fragment is therefore the same as the second fragment, which directly traces the value of the data object by using its name.
Pointer Safety
If you are a C or C++ programmer, you might be a bit apprehensive after reading the previous section because you know that misuse of pointers in your programs can cause your programs to crash. DTrace, however, is a robust, safe environment for executing your D programs. Take note that these types of mistakes cannot cause program crashes. You might write a buggy D program, but invalid D pointer accesses do not cause DTrace or the operating system kernel to fail or crash in any way. Instead, the DTrace software detects any invalid pointer accesses, disables your instrumentation, and reports the problem back to you for debugging.
If you have previously programmed in the Java programming language, you are probably aware that the Java language does not support pointers for precisely the same reasons of safety. Pointers are needed in D because they are an intrinsic part of the operating system's implementation in C, but DTrace implements the same kind of safety mechanisms that are found in the Java programming language to prevent buggy programs from damaging themselves or each other. DTrace's error reporting is similar to the runtime environment for the Java programming language that detects a programming error and reports an exception.
To observe DTrace's error handling and reporting, you could write a deliberately bad D program using pointers. For example, in an editor, type the following D program and save it in a file named badptr.d:
    BEGIN
    {
      x = (int *)NULL;
      y = *x;
      trace(y);
    }
The badptr.d program creates a D pointer named x that is a pointer to int. The program assigns this pointer the special invalid pointer value NULL, which is a built-in alias for address 0. By convention, address 0 is always defined as invalid so that NULL can be used as a sentinel value in C and D programs. The program uses a cast expression to convert NULL to be a pointer to an integer. The program then dereferences the pointer by using the expression *x, assigns the result to another variable y, and then attempts to trace y. When the D program is executed, DTrace detects an invalid pointer access when the statement y = *x is executed and reports the following error:
    # dtrace -s badptr.d
    dtrace: script 'badptr.d' matched 1 probe
    dtrace: error on enabled probe ID 1 (ID 1: dtrace:::BEGIN): invalid address (0x0) in action #2 at DIF offset 4
    ^C
    #
Notice that the D program moves past the error and continues to execute; the system and all observed processes remain unperturbed. You can also add an ERROR probe to your script to handle D errors. For details about the DTrace error mechanism, see ERROR Probe.
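As a hedged sketch of such a handler, the following variant of badptr.d adds an ERROR clause. It assumes the ERROR probe argument layout in which arg2 is the index of the failing action, arg4 is the fault type, and arg5 is the faulting address. Check the ERROR Probe section before relying on these positions.

```d
BEGIN
{
        x = (int *)NULL;
        y = *x;         /* invalid dereference: triggers the ERROR probe */
        trace(y);
}

ERROR
{
        /* Argument positions are an assumption; see the ERROR Probe section. */
        printf("fault type %d at address 0x%x in action %d\n",
            arg4, arg5, arg2);
}
```

The handler runs in addition to the default error report, so it is mainly useful for recording extra context or for deciding whether the script should stop.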
Array Declarations and Storage
In addition to the dynamic associative arrays that are described in Variables, D supports scalar arrays. Scalar arrays are a fixed-length group of consecutive memory locations that each store a value of the same type. Scalar arrays are accessed by referring to each location with an integer, starting from zero. Scalar arrays correspond directly in concept and syntax with arrays in C and C++. Scalar arrays are not used as frequently in D as associative arrays and their more advanced counterparts, aggregations. You might, however, need to use scalar arrays to access existing operating system array data structures that are declared in C. Aggregations are described in Aggregations.
A D scalar array of 5 integers is declared by using the type int and suffixing the declaration with the number of elements in square brackets, for example:
    int a[5];
The following figure shows a visual representation of the array storage:
Scalar Array Representation
![The figure illustrates the elements a[0] through a[4] of the array, declared as a[5], arranged side-by-side in memory.](img/array.png)
The D expression a[0] refers to the first array element, a[1] refers to the second, and so on. From a syntactic perspective, scalar arrays and associative arrays are very similar. You can declare an associative array of integers referenced by an integer key as follows:
    int a[int];
You can also reference this array using the expression a[0]. But, from a storage and implementation perspective, the two arrays are very different. The scalar array a consists of five consecutive memory locations numbered from zero, and the index refers to an offset in the storage that is allocated for the array. On the other hand, an associative array has no predefined size and does not store elements in consecutive memory locations. You can access associative array elements a[0] and a[-5] and only two words of storage are allocated by DTrace, and these might or might not be consecutive. Associative array keys are abstract names for the corresponding values and have no relationship to the value storage locations.
If you create an array using an initial assignment and use a single integer expression as the array index, for example, a[0] = 2, the D compiler always creates a new associative array, even though in this expression a could also be interpreted as an assignment to a scalar array. Scalar arrays must be predeclared in this situation so that the D compiler can recognize the definition of the array size and infer that the array is a scalar array.
Pointer and Array Relationship
Pointers and scalar arrays have a special relationship in D, just as they do in ANSI C. A scalar array is represented by a variable that is associated with the address of its first storage location. A pointer is also the address of a storage location with a defined type. Thus, D permits the use of the array [] index notation with both pointer variables and array variables. For example, the following two D fragments are equivalent in meaning:
    p = &a[0];
    trace(p[2]);

    trace(a[2]);
In the first fragment, the pointer p is assigned to the address of the first element in scalar array a by applying the & operator to the expression a[0]. The expression p[2] traces the value of the third array element (index 2). Because p now contains the same address associated with a, this expression yields the same value as a[2], shown in the second fragment. One consequence of this equivalence is that C and D permit you to access any index of any pointer or array. Array bounds checking is not performed for you by the compiler or the DTrace runtime environment. If you access memory beyond the end of a scalar array's predefined size, you either get an unexpected result or DTrace reports an invalid address error, as shown in the previous example. As always, you cannot damage DTrace itself or your operating system, but you do need to debug your D program.
The difference between pointers and arrays is that a pointer variable refers to a separate piece of storage that contains the integer address of some other storage, whereas an array variable names the array storage itself, not the location of an integer that in turn contains the location of the array. The following figure illustrates this difference.
Pointer and Array Storage
![The diagram illustrates the pointer p with the value 0x12345678, which is the address of the first element (a[0]) of the array declared as a[5].](img/arrptr.png)
This difference is manifested in the D syntax if you attempt to assign pointers and scalar arrays. If x and y are pointer variables, the expression x = y is legal; it copies the pointer address in y to the storage location that is named by x. If x and y are scalar array variables, the expression x = y is not legal. Arrays may not be assigned as a whole in D. However, an array variable or symbol name can be used in any context where a pointer is permitted. If p is a pointer and a is a scalar array, the statement p = a is permitted. This statement is equivalent to the statement p = &a[0].
Pointer Arithmetic
Because pointers are just integers that are used as addresses of other objects in memory, D provides a set of features for performing arithmetic on pointers. However, pointer arithmetic is not identical to integer arithmetic. Pointer arithmetic implicitly adjusts the underlying address by multiplying or dividing the operands by the size of the type referenced by the pointer.
The following D fragment illustrates this property:
    int *x;

    BEGIN
    {
      trace(x);
      trace(x + 1);
      trace(x + 2);
    }
This fragment creates an integer pointer x and then traces its value, its value incremented by one, and its value incremented by two. If you create and execute this program, DTrace reports the integer values 0, 4, and 8.
Since x is a pointer to an int (size 4 bytes), incrementing x adds 4 to the underlying pointer value. This property is useful when using pointers to refer to consecutive storage locations such as arrays. For example, if x was assigned to the address of an array a, similar to what is shown in the Pointer and Array Storage figure, the expression x + 1 would be equivalent to the expression &a[1]. Similarly, the expression *(x + 1) would refer to the value a[1]. Pointer arithmetic is implemented by the D compiler whenever a pointer value is incremented by using the +, ++, or += operators. Pointer arithmetic is also applied when an integer is subtracted from a pointer on the left-hand side, when a pointer is subtracted from another pointer, or when the -- operator is applied to a pointer.
For example, the following D program would trace the result 2:
    int *x, *y;
    int a[5];

    BEGIN
    {
      x = &a[0];
      y = &a[2];
      trace(y - x);
    }
Generic Pointers
Sometimes it is useful to represent or manipulate a generic pointer address in a D program without specifying the type of data referred to by the pointer. Generic pointers can be specified by using the type void *, where the keyword void represents the absence of specific type information, or by using the built-in type alias uintptr_t, which is an alias for an unsigned integer type whose size is appropriate for a pointer in the current data model. You may not apply pointer arithmetic to an object of type void *, and these pointers cannot be dereferenced without casting them to another type first. You can cast a pointer to the uintptr_t type when you need to perform integer arithmetic on the pointer value.
Pointers to void can be used in any context where a pointer to another data type is required, such as an associative array tuple expression or the right-hand side of an assignment statement. Similarly, a pointer to any data type can be used in a context where a pointer to void is required. To use a pointer to a non-void type in place of another non-void pointer type, an explicit cast is required. You must always use explicit casts to convert pointers to integer types, such as uintptr_t, or to convert these integers back to the appropriate pointer type.
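The round trip through uintptr_t can be illustrated in ANSI C, which uses the same type name for the same purpose. The following C sketch is an analogue, not D code, and the function name advance_bytes is hypothetical. It casts a generic pointer to an integer, performs unscaled integer arithmetic, and casts the result back.

```c
#include <stdint.h>

/* Advance a generic pointer by a byte count using integer arithmetic. */
void *advance_bytes(void *p, uintptr_t nbytes)
{
    uintptr_t addr = (uintptr_t)p;  /* explicit pointer-to-integer cast */

    addr += nbytes;                 /* plain integer addition, no scaling */
    return (void *)addr;            /* explicit integer-to-pointer cast */
}
```

Because the addition is performed on an integer, the caller supplies the byte count explicitly; advancing from &a[0] to &a[1] of an int array requires adding sizeof (int), not 1.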
Multi-Dimensional Arrays
Multi-dimensional scalar arrays are used infrequently in D, but are provided for compatibility with ANSI C and for observing and accessing operating system data structures that are created by using this capability in C. A multi-dimensional array is declared as a consecutive series of scalar array sizes enclosed in square brackets [] following the base type. For example, to declare a fixed-size, two-dimensional rectangular array of integers with 12 rows and 34 columns, you would write the following declaration:
int a[12][34];
A multi-dimensional scalar array is accessed by using similar notation. For example, to access the value stored at row 0 and column 1, you would write the following D expression:
a[0][1]
Storage locations for multi-dimensional scalar array values are computed by multiplying the row number by the total number of columns declared and then adding the column number.
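The storage computation can be written out for the 12 by 34 array declared above. The following C sketch assumes the same row-major layout that ANSI C uses; the names ROWS, COLS, and flat_index are hypothetical and appear only in this illustration.

```c
#include <stddef.h>

enum { ROWS = 12, COLS = 34 };

/*
 * Flat element index of a[row][col] in an array declared as
 * a[ROWS][COLS]: the row number times the number of columns,
 * plus the column number.
 */
size_t flat_index(size_t row, size_t col)
{
    return row * COLS + col;
}
```

With this layout, a[0][1] is element 1 and a[1][1] is element 35, counting from the start of the array.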
Be careful not to confuse the multi-dimensional array syntax with the D syntax for associative array accesses (that is, a[0][1] is not the same as a[0,1]). If you use an incompatible tuple with an associative array or attempt an associative array access of a scalar array, the D compiler reports an appropriate error message and refuses to compile your program.
Pointers to DTrace Objects
The D compiler prohibits you from using the & operator to obtain pointers to DTrace objects such as associative arrays, built-in functions, and variables. You are prohibited from obtaining the address of these variables so that the DTrace runtime environment is free to relocate them as needed between probe firings. In this way, DTrace can more efficiently manage the memory required for your programs. If you create composite structures, it is possible to construct expressions that do retrieve the kernel address of your DTrace object storage. You should avoid creating such expressions in your D programs. If you need to use such an expression, do not rely on the address being the same across probe firings.
In ANSI C, pointers can also be used to perform indirect function calls or to perform assignments, such as placing an expression using the unary * dereference operator on the left-hand side of an assignment operator. In D, these types of expressions using pointers are not permitted. You may only assign values directly to D variables by specifying their name or by applying the array index operator [] to a D scalar or associative array. You may only call functions that are defined by the DTrace environment by name, as specified in Actions and Subroutines. Indirect function calls using pointers are not permitted in D.
Pointers and Address Spaces
A pointer is an address that provides a translation within some virtual address space to a piece of physical memory. DTrace executes your D programs within the address space of the operating system kernel itself. The Linux system manages many address spaces: one for the operating system kernel and one for each user process. Because each address space provides the illusion that it can access all of the memory on the system, the same virtual address pointer value can be reused across address spaces, but translate to different physical memory. Therefore, when writing D programs that use pointers, you must be aware of the address space corresponding to the pointers you intend to use.
For example, if you use the syscall provider to instrument entry to a system call that takes a pointer to an integer or array of integers as an argument, for example, pipe(), it would not be valid to dereference that pointer or array using the * or [] operators because the address in question is an address in the address space of the user process that performed the system call. Applying the * or [] operators to this address in D would result in a kernel address space access, which would either produce an invalid address error or return unexpected data to your D program, depending on whether the address happened to match a valid kernel address.
To access user-process memory from a DTrace probe, you must apply one of the copyin, copyinstr, or copyinto functions that are described in Actions and Subroutines to the user address space pointer. To avoid confusion, take care when writing your D programs to name and comment variables storing user addresses appropriately. You can also store user addresses as uintptr_t so that you do not accidentally compile D code that dereferences them. Techniques for using DTrace on user processes are described in User Process Tracing.
DTrace Support for Strings
DTrace provides support for tracing and manipulating strings. This section describes the complete set of D language features for declaring and manipulating strings. Unlike ANSI C, strings in D have their own built-in type and operator support to enable you to easily and unambiguously use them in your tracing programs.
String Representation
In DTrace, strings are represented as an array of characters terminated by a null byte (that is, a byte whose value is zero, usually written as '\0'). The visible part of the string is of variable length, depending on the location of the null byte, but DTrace stores each string in a fixed-size array so that each probe traces a consistent amount of data. Strings cannot exceed the length of the predefined string limit. However, the limit can be modified in your D program or on the dtrace command line by tuning the strsize option. See Options and Tunables for more information about tunable DTrace options. The default string limit is 256 bytes.
The D language provides an explicit string type rather than using the type char * to refer to strings. The string type is equivalent to char *, in that it is the address of a sequence of characters, but the D compiler and D functions such as trace provide enhanced capabilities when applied to expressions of type string. For example, the string type removes the ambiguity of type char * when you need to trace the actual bytes of a string.
In the following D statement, if s is of type char *, DTrace traces the value of the pointer s, which means it traces an integer address value:
trace(s);
In the following D statement, by the definition of the * operator, the D compiler dereferences the pointer s and traces the single character at that location:
trace(*s);
These behaviors enable you to manipulate character pointers that refer to either single characters, or to arrays of byte-sized integers that are not strings and do not end with a null byte.
In the next D statement, if s is of type string, the string type indicates to the D compiler that you want DTrace to trace a null-terminated string of characters whose address is stored in the variable s:
trace(s);
You can also perform lexical comparison of expressions of type string. See String Comparison.
String Constants
String constants are enclosed in pairs of double quotes ("") and are automatically assigned the type string by the D compiler. You can define string constants of any length, limited only by the amount of memory DTrace is permitted to consume on your system. The terminating null byte (\0) is added automatically by the D compiler to any string constants that you declare. The size of a string constant object is the number of bytes associated with the string, plus one additional byte for the terminating null byte.
A string constant may not contain a literal newline character. To create strings containing newlines, use the \n escape sequence instead of a literal newline. String constants can also contain any of the special character escape sequences that are defined for character constants. See Table 2-6.
String Assignment
Unlike the assignment of char * variables, strings are copied by value and not by reference. The string assignment operator = copies the actual bytes of the string from the source operand up to and including the null byte to the variable on the left-hand side, which must be of type string. You can create a new string variable by assigning it an expression of type string.
For example, the D statement:
s = "hello";
would create a new variable s of type string and copy the six bytes of the string "hello" into it (five printable characters, plus the null byte). String assignment is analogous to the C library function strcpy(), with the exception that if the source string exceeds the limit of the storage of the destination string, the resulting string is automatically truncated by a null byte at this limit.
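The difference from strcpy() can be sketched in C as a bounded copy. The following is an analogy under stated assumptions, not the DTrace implementation: the function name bounded_assign is hypothetical, and the dstsize parameter stands in for the string limit of the destination.

```c
#include <stddef.h>

/*
 * Copy src into a fixed-size buffer dst of dstsize bytes.
 * Copies at most dstsize - 1 bytes of src and always writes a
 * terminating null byte, truncating the source if it is too long.
 */
void bounded_assign(char *dst, size_t dstsize, const char *src)
{
    size_t i = 0;

    if (dstsize == 0)
        return;
    while (i < dstsize - 1 && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';
}
```

Copying "hello" into an 8-byte buffer stores all six bytes, whereas copying it into a 4-byte buffer stores "hel" followed by the null byte.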
You can also assign to a string variable an expression of a type that is compatible with strings. In this case, the D compiler automatically promotes the source expression to the string type and performs a string assignment. The D compiler permits any expression of type char * or of type char[n], that is, a scalar array of char of any size, to be promoted to a string.
String Conversion
Expressions of other types can be explicitly converted to type string by using a cast expression or by applying the special stringof operator. The following two statements are equivalent:
s = (string) expression;
s = stringof (expression);
The expression is interpreted as the address of the string.
The stringof operator binds very tightly to the operand on its right-hand side. Parentheses are typically used to surround the expression for clarity, although they are not strictly necessary.
Any expression that is a scalar type, such as a pointer or integer, or a scalar array address may be converted to string. Expressions of other types such as void may not be converted to string. If you erroneously convert an invalid address to a string, the DTrace safety features prevent you from damaging the system or DTrace, but you might end up tracing a sequence of undecipherable characters.
String Comparison
D overloads the binary relational operators and permits them to be used for string comparisons, as well as integer comparisons. The relational operators perform string comparison whenever both operands are of type string or when one operand is of type string and the other operand can be promoted to type string. See String Assignment for a detailed description. See also Table 2-14, which lists the relational operators that can be used to compare strings.
Table 2-14 D Relational Operators for Strings
Operator | Description |
---|---|
< | Left-hand operand is less than right-hand operand. |
<= | Left-hand operand is less than or equal to right-hand operand. |
> | Left-hand operand is greater than right-hand operand. |
>= | Left-hand operand is greater than or equal to right-hand operand. |
== | Left-hand operand is equal to right-hand operand. |
!= | Left-hand operand is not equal to right-hand operand. |
As with integers, each operator evaluates to a value of type int, which is equal to one if the condition is true or zero if it is false.
The relational operators compare the two input strings byte-by-byte, similarly to the C library routine strcmp(). Each byte is compared by using its corresponding integer value in the ASCII character set until a null byte is read or the maximum string length is reached. See the ascii(7) manual page for more information. Some example D string comparisons and their results are shown in the following table.
D string comparison | Result |
---|---|
"coffee" < "espresso" | Returns 1 (true) |
"coffee" == "coffee" | Returns 1 (true) |
"coffee" >= "mocha" | Returns 0 (false) |
Note:
Seemingly identical Unicode strings might compare as being different if one or the other of the strings is not normalized.
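Because the comparison is described as similar to strcmp(), the behavior of the relational operators can be approximated in C. The following sketch is an analogue, not D code; the function names str_lt, str_eq, and str_ge are hypothetical, and the sketch does not model the maximum string length.

```c
#include <string.h>

/*
 * Each function mirrors one relational operator applied to two
 * strings and returns 1 (true) or 0 (false), as the D operators do.
 */
int str_lt(const char *a, const char *b) { return strcmp(a, b) < 0; }
int str_eq(const char *a, const char *b) { return strcmp(a, b) == 0; }
int str_ge(const char *a, const char *b) { return strcmp(a, b) >= 0; }
```

For example, str_lt("coffee", "espresso") returns 1 because the byte 'c' has a smaller ASCII value than 'e', and str_ge("coffee", "mocha") returns 0 because 'c' is smaller than 'm'.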
Structs and Unions
Collections of related variables can be grouped together into composite data objects called structs and unions. You define these objects in D by creating new type definitions for them. You can use your new types for any D variables, including associative array values. This section explores the syntax and semantics for creating and manipulating these composite types and the D operators that interact with them.
Structs
The D keyword struct, short for structure, is used to introduce a new type that is composed of a group of other types. The new struct type can be used as the type for D variables and arrays, enabling you to define groups of related variables under a single name. D structs are the same as the corresponding construct in C and C++. If you have programmed in the Java programming language previously, think of a D struct as a class that contains only data members and no methods.
Suppose you want to create a more sophisticated system call tracing program in D that records a number of things about each read() and write() system call that is executed by your shell, for example, the elapsed time, number of calls, and the largest byte count passed as an argument.
You could write a D clause to record these properties in three separate associative arrays, as shown in the following example:
int maxbytes[string]; /* declare maxbytes */

syscall::read:entry, syscall::write:entry
/pid == 12345/
{
  ts[probefunc] = timestamp;
  calls[probefunc]++;
  maxbytes[probefunc] = arg2 > maxbytes[probefunc] ?
      arg2 : maxbytes[probefunc];
}
This clause, however, is inefficient because DTrace must create three separate associative arrays and store separate copies of the identical tuple values corresponding to probefunc for each one. Instead, you can conserve space and make your program easier to read and maintain by using a struct.
First, declare a new struct type at the top of the D program source file:
struct callinfo {
  uint64_t ts;      /* timestamp of last syscall entry */
  uint64_t elapsed; /* total elapsed time in nanoseconds */
  uint64_t calls;   /* number of calls made */
  size_t maxbytes;  /* maximum byte count argument */
};
The struct keyword is followed by an optional identifier that is used to refer back to the new type, which is now known as struct callinfo. The struct members are then enclosed in a set of braces {} and the entire declaration is terminated by a semicolon (;). Each struct member is defined by using the same syntax as a D variable declaration, with the type of the member listed first followed by an identifier naming the member and another semicolon (;).
The struct declaration simply defines the new type. It does not create any variables or allocate any storage in DTrace. Once declared, you can use struct callinfo as a type throughout the remainder of your D program. Each variable of type struct callinfo stores a copy of the four variables that are described by our structure template. The members are arranged in memory in order, according to the member list, with padding space introduced between members, as required for data object alignment purposes.
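Padding is easiest to see in a struct whose members have different alignment requirements. The following C sketch relies on D structs being laid out like their C counterparts, as stated earlier in this section; struct sample and the function padding_before_value are hypothetical names used only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct sample {
    char     tag;    /* one byte */
    uint64_t value;  /* must start on a suitably aligned address */
};

/* Bytes of padding that the compiler placed between tag and value. */
size_t padding_before_value(void)
{
    return offsetof(struct sample, value) - sizeof(char);
}
```

The exact amount of padding depends on the alignment that the platform requires for uint64_t, so the result can differ between architectures.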
You can use the member identifier names to access the individual member values using the “.” operator by writing an expression of the following form:
variable-name.member-name
The following example is an improved program that uses the new structure type. In a text editor, type the following D program and save it in a file named rwinfo.d:
struct callinfo {
  uint64_t ts;      /* timestamp of last syscall entry */
  uint64_t elapsed; /* total elapsed time in nanoseconds */
  uint64_t calls;   /* number of calls made */
  size_t maxbytes;  /* maximum byte count argument */
};

struct callinfo i[string]; /* declare i as an associative array */

syscall::read:entry, syscall::write:entry
/pid == $1/
{
  i[probefunc].ts = timestamp;
  i[probefunc].calls++;
  i[probefunc].maxbytes = arg2 > i[probefunc].maxbytes ?
      arg2 : i[probefunc].maxbytes;
}

syscall::read:return, syscall::write:return
/i[probefunc].ts != 0 && pid == $1/
{
  i[probefunc].elapsed += timestamp - i[probefunc].ts;
}

END
{
  printf("        calls  max bytes  elapsed nsecs\n");
  printf("------  -----  ---------  -------------\n");
  printf("  read  %5d  %9d  %d\n",
      i["read"].calls, i["read"].maxbytes, i["read"].elapsed);
  printf(" write  %5d  %9d  %d\n",
      i["write"].calls, i["write"].maxbytes, i["write"].elapsed);
}
When you have typed the program, run the dtrace -q -s rwinfo.d command, specifying one of your shell processes. Then, type a few commands in your shell. When you have finished typing the shell commands, type Ctrl-C to fire the END probe and print the results:
# dtrace -q -s rwinfo.d `pgrep -n bash`
^C
        calls  max bytes  elapsed nsecs
------  -----  ---------  -------------
  read     25       1024  8775036488
 write     33         22  1859173
Pointers to Structs
Referring to structs by using pointers is very common in C and D. You can use the operator -> to access struct members through a pointer. If struct s has a member m, and you have a pointer to this struct named sp, where sp is a variable of type struct s *, you can use the * operator to first dereference the sp pointer and then access the member:
struct s *sp;
(*sp).m
Or, you can use the -> operator as shorthand for this notation. The following two D fragments are equivalent if sp is a pointer to a struct:
(*sp).m
sp->m
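The two notations are also equivalent in ANSI C, where they can be compared directly. The following C sketch is an analogue, not D code; the function names read_with_star and read_with_arrow are hypothetical.

```c
struct s {
    int m;
};

/* Dereference the pointer first, then select the member. */
int read_with_star(const struct s *sp)
{
    return (*sp).m;
}

/* Shorthand form: select the member through the pointer. */
int read_with_arrow(const struct s *sp)
{
    return sp->m;
}
```

Both functions return the same value for any valid pointer to a struct s.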
DTrace provides several built-in variables that are pointers to structs. For example, the pointer curpsinfo refers to struct psinfo and its content provides a snapshot of information about the state of the process associated with the thread that fired the current probe. The following table lists a few example expressions that use curpsinfo, including their types and their meanings.
Example Expression | Type | Meaning |
---|---|---|
curpsinfo->pr_pid | pid_t | Current process ID |
curpsinfo->pr_fname | char [] | Executable file name |
curpsinfo->pr_psargs | char [] | Initial command-line arguments |
For more information, see psinfo_t.
The next example uses the pr_fname member to identify a process of interest. In an editor, type the following script and save it in a file named procfs.d:
syscall::write:entry
/ curpsinfo->pr_fname == "date" /
{
  printf("%s run by UID %d\n", curpsinfo->pr_psargs, curpsinfo->pr_uid);
}
This clause uses the expression curpsinfo->pr_fname to access and match the command name so that the script selects the correct write() requests before tracing the arguments. Notice that by using the == operator with a left-hand argument that is an array of char and a right-hand argument that is a string, the D compiler infers that the left-hand argument should be promoted to a string and a string comparison should be performed. Type the command dtrace -q -s procfs.d in one shell and then type the date command several times in another shell. The output that is displayed by DTrace is similar to the following:
# dtrace -q -s procfs.d
date run by UID 500
/bin/date run by UID 500
date -R run by UID 500
...
^C
#
Complex data structures are used frequently in C programs, so the ability to describe and reference structs from D also provides a powerful capability for observing the inner workings of the Oracle Linux operating system kernel and its system interfaces.
Unions
Unions are another kind of composite type that is supported by ANSI C and D and are closely related to structs. A union is a composite type where a set of members of different types are defined and the member objects all occupy the same region of storage. A union is therefore an object of variant type, where only one member is valid at any given time, depending on how the union has been assigned. Typically, some other variable or piece of state is used to indicate which union member is currently valid. The size of a union is the size of its largest member. The memory alignment that is used for the union is the maximum alignment required by the union members.
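The usual pattern of pairing a union with a separate piece of state can be sketched in C, which shares this construct with D. All of the names below (value_kind, value, tagged_value, and get_addr) are hypothetical and serve only to illustrate the description above.

```c
#include <stdint.h>

enum value_kind { KIND_INT, KIND_ADDR };

/* Both members occupy the same region of storage. */
union value {
    int32_t  as_int;
    uint64_t as_addr;
};

struct tagged_value {
    enum value_kind kind;  /* records which union member is valid */
    union value     v;
};

/* Returns 1 and stores the address if as_addr is the valid member. */
int get_addr(const struct tagged_value *t, uint64_t *out)
{
    if (t->kind != KIND_ADDR)
        return 0;
    *out = t->v.as_addr;
    return 1;
}
```

In this sketch, sizeof (union value) equals sizeof (uint64_t), the size of its largest member.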
Member Sizes and Offsets
You can determine the size in bytes of any D type or expression, including a struct or union, by using the sizeof operator. The sizeof operator can be applied either to an expression or to the name of a type surrounded by parentheses, as illustrated in the following two examples:
sizeof expression
sizeof (type-name)
For example, the expression sizeof (uint64_t) would return the value 8, and the expression sizeof (callinfo.ts) would also return 8, if inserted into the source code of the previous example program. The formal return type of the sizeof operator is the type alias size_t, which is defined as an unsigned integer that is the same size as a pointer in the current data model and is used to represent byte counts. When the sizeof operator is applied to an expression, the expression is validated by the D compiler, but the resulting object size is computed at compile time and no code for the expression is generated. You can use sizeof anywhere an integer constant is required.
You can use the companion operator offsetof to determine the offset in bytes of a struct or union member from the start of the storage that is associated with any object of the struct or union type. The offsetof operator is used in an expression of the following form:
offsetof (type-name, member-name)
Here, type-name is the name of any struct or union type or type alias, and member-name is the identifier naming a member of that struct or union. Similar to sizeof, offsetof returns a size_t and you can use it anywhere in a D program that an integer constant can be used.
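ANSI C provides sizeof and offsetof with the same meaning, so the values for struct callinfo can be checked with a C compiler. The following sketch is an analogue, not D code; the function names ts_size and elapsed_offset are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct callinfo {
    uint64_t ts;       /* timestamp of last syscall entry */
    uint64_t elapsed;  /* total elapsed time in nanoseconds */
    uint64_t calls;    /* number of calls made */
    size_t   maxbytes; /* maximum byte count argument */
};

/* sizeof applied to an expression; the expression is not evaluated. */
size_t ts_size(void)
{
    return sizeof(((struct callinfo *)0)->ts);
}

/* offsetof applied to a type name and a member name. */
size_t elapsed_offset(void)
{
    return offsetof(struct callinfo, elapsed);
}
```

Because ts is the first member and occupies 8 bytes, both functions return 8.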
Bit-Fields
D also permits the definition of integer struct and union members of arbitrary numbers of bits, known as bit-fields. A bit-field is declared by specifying a signed or unsigned integer base type, a member name, and a suffix indicating the number of bits to be assigned for the field, as shown in the following example:
struct s {
  int a : 1;
  int b : 3;
  int c : 12;
};
The bit-field width is an integer constant that is separated from the member name by a trailing colon. The bit-field width must be positive and must be of a number of bits not larger than the width of the corresponding integer base type. Bit-fields that are larger than 64 bits may not be declared in D. D bit-fields provide compatibility with and access to the corresponding ANSI C capability. Bit-fields are typically used in situations when memory storage is at a premium or when a struct layout must match a hardware register layout.
A bit-field is a compiler construct that automates the layout of an integer and a set of masks to extract the member values. The same result can be achieved by simply defining the masks yourself and using the & operator. The C and D compilers attempt to pack bits as efficiently as possible, but they are free to do so in any order or fashion they desire. Therefore, bit-fields are not guaranteed to produce identical bit layouts across differing compilers or architectures. If you require a stable bit layout, you should construct the bit masks yourself and extract the values by using the & operator.
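The manual approach can be sketched in C as follows. The bit positions are an assumption made for this illustration (a in bit 0, b in bits 1 to 3, c in bits 4 to 15); they are not a layout that any compiler is guaranteed to choose for struct s, and the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed layout: a in bit 0, b in bits 1-3, c in bits 4-15. */
#define A_SHIFT 0
#define A_MASK  0x1u
#define B_SHIFT 1
#define B_MASK  0x7u
#define C_SHIFT 4
#define C_MASK  0xfffu

/* Shift the field down to bit 0, then mask off the other fields. */
uint32_t extract_field(uint32_t word, unsigned shift, uint32_t mask)
{
    return (word >> shift) & mask;
}
```

With this layout, the word 0xabcb holds a = 1, b = 5, and c = 0xabc, and the result is the same on every compiler and architecture because the layout is fixed by the masks.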
A bit-field member is accessed by simply specifying its name in combination with the “.” or -> operators, like any other struct or union member. The bit-field is automatically promoted to the next largest integer type for use in any expressions. Because bit-field storage might not be aligned on a byte boundary or be a round number of bytes in size, you may not apply the sizeof or offsetof operators to a bit-field member. The D compiler also prohibits you from taking the address of a bit-field member by using the & operator.
Type and Constant Definitions
This section describes how to declare type aliases and named constants in D. It also discusses D type and namespace management for program and operating system types and identifiers.
typedefs
The typedef keyword is used to declare an identifier as an alias for an existing type. Like all D type declarations, typedef is used outside of probe clauses in a declaration of the following form:
typedef existing-type new-type ;
where existing-type is any type declaration and new-type is an identifier to be used as the alias for this type. For example, the D compiler uses the following declaration internally to create the uint8_t type alias:
typedef unsigned char uint8_t;
You can use type aliases anywhere that a normal type can be used, such as the type of a variable or associative array value or tuple member. You can also combine typedef with more elaborate declarations such as the definition of a new struct, as shown in the following example:
typedef struct foo {
  int x;
  int y;
} foo_t;
In the previous example, struct foo is defined as the same type as its alias, foo_t. Linux C system headers often use the suffix _t to denote a typedef alias.
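That the struct name and its alias denote one type can be demonstrated in ANSI C, where the declaration has the same form. In the following C sketch, the function sum_foo is a hypothetical helper that is declared with struct foo but accepts an object declared with foo_t without a cast.

```c
typedef struct foo {
    int x;
    int y;
} foo_t;

/* Declared in terms of struct foo; a foo_t object is accepted as-is. */
int sum_foo(const struct foo *p)
{
    return p->x + p->y;
}
```

A variable declared as foo_t v can be passed as sum_foo(&v) because foo_t and struct foo are the same type.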
Enumerations
Defining symbolic names for constants in a program eases readability and simplifies the process of maintaining the program in the future. One method is to define an enumeration, which associates a set of integers with a set of identifiers called enumerators that the compiler recognizes and replaces with the corresponding integer value. An enumeration is defined by using a declaration such as the following:
enum colors { RED, GREEN, BLUE };
The first enumerator in the enumeration, RED, is assigned the value zero and each subsequent identifier is assigned the next integer value.
You can also specify an explicit integer value for any enumerator by suffixing it with an equal sign and an integer constant, as shown in the following example:
enum colors { RED = 7, GREEN = 9, BLUE };
The enumerator BLUE is assigned the value 10 by the compiler because it has no value specified and the previous enumerator is set to 9. When an enumeration is defined, the enumerators can be used anywhere in a D program that an integer constant is used. In addition, the enumeration enum colors is also defined as a type that is equivalent to an int. The D compiler allows a variable of enum type to be used anywhere an int can be used and will allow any integer value to be assigned to a variable of enum type. You can also omit the enum name in the declaration, if the type name is not needed.
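Because the D enumeration syntax matches ANSI C, the numbering rule can be checked with a C compiler. The following sketch is an analogue, not D code; the function name color_value is hypothetical.

```c
enum colors { RED = 7, GREEN = 9, BLUE };

/* An enum value can be used wherever an int is expected. */
int color_value(enum colors c)
{
    return (int)c;
}
```

Here BLUE takes the value 10 because it follows GREEN, and a variable of type enum colors can also hold an integer such as 42 that has no corresponding enumerator.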
Enumerators are visible in all subsequent clauses and declarations in your program. Therefore, you cannot define the same enumerator identifier in more than one enumeration. However, you can define more than one enumerator with the same value in either the same or different enumerations. You may also assign integers that have no corresponding enumerator to a variable of the enumeration type.
The D enumeration syntax is the same as the corresponding syntax in ANSI C. D also provides access to enumerations that are defined in the operating system kernel and its loadable modules. Note that these enumerators are not globally visible in your D program. Kernel enumerators are only visible if you specify one as an argument in a comparison with an object of the corresponding enumeration type. This feature protects your D programs against inadvertent identifier name conflicts with the large collection of enumerations that are defined in the operating system kernel.
The following example D program displays information about I/O requests. The program uses the enumerators B_READ and B_WRITE to differentiate between read and write operations:
io:::done,
io:::start,
io:::wait-done,
io:::wait-start
{
  printf("%8s %10s: %d %16s (%s size %d @ sect %d)\n",
      args[1]->dev_statname, probename, timestamp, execname,
      args[0]->b_flags & B_READ ? "R" :
      args[0]->b_flags & B_WRITE ? "W" : "?",
      args[0]->b_bcount, args[0]->b_blkno);
}
Inlines
D named constants can also be defined by using inline directives, which provide a more general means of creating identifiers that are replaced by predefined values or expressions during compilation. Inline directives are a more powerful form of lexical replacement than the #define directive provided by the C preprocessor because the replacement is assigned an actual type and is performed by using the compiled syntax tree and not simply a set of lexical tokens. An inline directive is specified by using a declaration of the following form:
inline type name = expression;
where type is a type declaration of an existing type, name is any valid D identifier that is not previously defined as an inline or global variable, and expression is any valid D expression. After the inline directive is processed, the D compiler substitutes the compiled form of expression for each subsequent instance of name in the program source.
For example, the following D program would trace the string "hello" and integer value 123:
inline string hello = "hello";
inline int number = 100 + 23;

BEGIN
{
  trace(hello);
  trace(number);
}
An inline name can be used anywhere a global variable of the corresponding type is used. If the inline expression can be evaluated to an integer or string constant at compile time, then the inline name can also be used in contexts that require constant expressions, such as scalar array dimensions.
The inline expression is validated for syntax errors as part of evaluating the directive. The expression result type must be compatible with the type that is defined by the inline, according to the same rules used for the D assignment operator (=). An inline expression may not reference the inline identifier itself: recursive definitions are not permitted.
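The contrast with #define drawn earlier in this section can be illustrated in C. C has no direct equivalent of a D inline, so the following sketch uses a static const int as a stand-in for a typed replacement that is evaluated as a unit; this substitution, and the names NUMBER_MACRO, number, doubled_macro, and doubled_typed, are assumptions made only for this illustration.

```c
/* Lexical replacement: the tokens 100 + 23 are pasted into the text. */
#define NUMBER_MACRO 100 + 23

/* Typed value: the expression is evaluated as a unit. */
static const int number = 100 + 23;

/* Expands to 100 + 23 * 2, so operator precedence changes the result. */
int doubled_macro(void)
{
    return NUMBER_MACRO * 2;
}

/* Multiplies the complete value 123 by 2. */
int doubled_typed(void)
{
    return number * 2;
}
```

The token-based version returns 146, whereas the typed version returns 246, which is why replacing a name with a compiled expression is less error-prone than replacing it with raw tokens.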
The DTrace software packages install a number of D source files in the system directory /usr/lib64/dtrace/installed-version, which contain inline directives that you can use in your D programs.
For example, the signal.d library includes directives of the following form:
inline int SIGHUP = 1;
inline int SIGINT = 2;
inline int SIGQUIT = 3;
...
These inline definitions provide you with access to the current set of Oracle Linux signal names, as described in the sigaction(2) manual page. Similarly, the errno.d library contains inline directives for the C errno constants that are described in the errno(3) manual page.
By default, the D compiler includes all of the provided D library files automatically so that you can use these definitions in any D program.
Type Namespaces
In traditional languages such as ANSI C, type visibility is determined by whether a type is nested inside of a function or other declaration. Types declared at the outer scope of a C program are associated with a single global namespace and are visible throughout the entire program. Types that are defined in C header files are typically included in this outer scope. Unlike these languages, D provides access to types from multiple outer scopes.
D is a language that facilitates dynamic observability across multiple layers of a software stack, including the operating system kernel, an associated set of loadable kernel modules, and user processes that are running on the system. A single D program can instantiate probes to gather data from multiple kernel modules or other software entities that are compiled into independent binary objects. Therefore, more than one data type of the same name, perhaps with different definitions, might be present in the universe of types that are available to DTrace and the D compiler. To manage this situation, the D compiler associates each type with a namespace, which is identified by the containing program object. Types from a particular program object can be accessed by specifying the object name and the back quote (`) scoping operator in any type name.
For example, for a kernel module named foo that contains the following C type declaration:
typedef struct bar {
  int x;
} bar_t;
The types struct bar and bar_t could be accessed from D using the following type names:
struct foo`bar
foo`bar_t
The back quote operator can be used in any context where a type name is appropriate, including when specifying the type for D variable declarations or cast expressions in D probe clauses.
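The following sketch illustrates both uses with the hypothetical foo module from the previous example. The probe description is an assumption made for illustration only: it presumes that foo provides a function named foo_func whose first argument is a pointer to a struct bar, and that the fbt provider can instrument it. The variable name bp is also illustrative.

```d
/* Global D variable whose type comes from the hypothetical module foo. */
struct foo`bar *bp;

/* Hypothetical probe: foo_func() is assumed to take a struct bar pointer. */
fbt:foo:foo_func:entry
{
	/* Cast the raw argument to the scoped type, then read a member. */
	bp = (struct foo`bar *)arg0;
	trace(bp->x);
}
```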
The D compiler also provides two special, built-in type
namespaces that use the names C and D, respectively. The C type
namespace is initially populated with the standard ANSI C
intrinsic types, such as int. In addition, type definitions that
are acquired through the C preprocessor (cpp), by running the
dtrace command with the -C option, are processed by and added to
the C scope. As a result, you can include C header files
containing type declarations that are already visible in another
type namespace without causing a compilation error.
The D type namespace is initially populated with the D type
intrinsics, such as int and string, as well as the built-in D
type aliases, such as uint64_t. Any new type declarations that
appear in the D program source are automatically added to the D
type namespace. If you create a complex type such as a struct in
a D program consisting of member types from other namespaces, the
member types are copied into the D namespace by the declaration.
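As a brief sketch, the following program declares a new struct type in D source, which places the type in the D type namespace, and then uses it as the value type of an associative array. The names callinfo and info are illustrative, and the choice of the syscall::read:entry probe is an assumption made only to give the clause something to record.

```d
/* New type declared in D source; it is added to the D type namespace. */
struct callinfo {
	uint64_t ts;	/* built-in D type alias */
	int calls;	/* intrinsic type */
};

/* Associative array of callinfo structs, indexed by process name. */
struct callinfo info[string];

syscall::read:entry
{
	info[execname].ts = timestamp;	/* time of the most recent call */
	info[execname].calls++;		/* running call count */
}
```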
When the D compiler encounters a type declaration that does not specify an explicit namespace using the back quote operator, the compiler searches the set of active type namespaces to find a match by using the specified type name. The C namespace is always searched first, followed by the D namespace. If the type name is not found in either the C or D namespace, the type namespaces of the active kernel modules are searched in load address order, which does not guarantee any ordering properties among the loadable modules. To avoid type name conflicts with other kernel modules, you should use the scoping operator when accessing types that are defined in loadable kernel modules.
The D compiler uses the compressed ANSI C debugging information that is provided with the core Linux kernel modules to automatically access the types that are associated with the operating system source code, without the need to access the corresponding C include files. Note that this symbolic debugging information might not be available for all kernel modules on your system. The D compiler reports an error if you attempt to access a type within the namespace of a module that lacks the compressed C debugging information that is intended for use with DTrace.