Pointers

Pointers are memory addresses of data objects and reference memory used by the OS, by the user program, or by the D script. Pointers in D are data objects that store an integer virtual address value and associate it with a D type that describes the format of the data stored at the corresponding memory location.

You can explicitly declare a D variable to be of pointer type by specifying the type of the referenced data and then appending an asterisk (*) to the type name, as shown in the following statement:

int *p;

The statement declares a D global variable named p that's a pointer to an integer. The declaration means that p is a 64-bit integer with a value that's the address of another integer located somewhere in memory. Because the compiled form of the D code is run at probe firing time inside the kernel itself, D pointers are typically pointers associated with the kernel's address space.

To create a pointer to a data object inside the kernel, you can compute its address by using the & operator. For example, the kernel source code declares an unsigned long max_pfn variable. You could trace the address of this variable by tracing the result of applying the & operator to the name of that object in D:

trace(&`max_pfn);

The * operator can be used to specify the object addressed by the pointer, and acts as the inverse of the & operator. For example, the following two D code fragments are equivalent in meaning:

q = &`max_pfn; trace(*q);

trace(`max_pfn); 

In this example, the first fragment creates a D global variable pointer q. Because the max_pfn object is of type unsigned long, the type of &`max_pfn is unsigned long *, a pointer to unsigned long. The type of q is implicit in the declaration. Tracing the value of *q follows the pointer back to the data object max_pfn. This fragment is therefore the same as the second fragment, which directly traces the value of the data object by using its name.

Pointer Safety

DTrace is a robust, safe environment for running D programs. You might write a buggy D program, but invalid D pointer accesses don't cause DTrace or the OS kernel to fail or crash in any way. Instead, the DTrace software detects any invalid pointer accesses, and returns a BADADDR fault; the current clause execution quits, an ERROR probe fires, and tracing continues unless the program called exit for the ERROR probe.

Pointers are required in D because they're an intrinsic part of the OS's implementation in C, but DTrace implements the same kind of safety mechanisms that are found in the Java programming language to prevent buggy programs from affecting themselves or each other. DTrace's error reporting is similar to the runtime environment for the Java programming language that detects a programming error and reports an exception.

To observe DTrace's error handling and reporting, you could write a deliberately bad D program using pointers. For example, in an editor, type the following D program and save it in a file named badptr.d:

BEGIN
{
  x = (int *)NULL;
  y = *x;
  trace(y);
}

The badptr.d program uses a cast expression to convert NULL to be a pointer to an integer. The program then dereferences the pointer by using the expression *x, assigns the result to another variable y, and then tries to trace y. When the D program is run, DTrace detects an invalid pointer access when the statement y = *x is processed and reports the following error:

dtrace: script '/tmp/badptr.d' matched 1 probe
dtrace: error on enabled probe ID 2 (ID 1: dtrace:::BEGIN): invalid address (0x0) in action #1 at BPF pc 156

Notice that the D program moves past the error and continues to run; the system and all observed processes remain unperturbed. You can also add an ERROR probe to any script to handle D errors. For details about the DTrace error mechanism, see ERROR Probe.
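As a minimal sketch of such a handler, the following variant of badptr.d adds an ERROR clause that reports the fault and then stops tracing. It assumes the conventional ERROR probe argument layout, in which arg4 holds the fault type and arg5 holds a fault-specific value (the offending address for an invalid address fault). Check the ERROR Probe documentation for your DTrace version before relying on these arguments.

```d
BEGIN
{
  x = (int *)NULL;
  y = *x;        /* faults here; the rest of this clause is skipped */
  trace(y);
}

ERROR
{
  /* Assumed layout: arg4 = fault type, arg5 = faulting address. */
  printf("fault type %d at address 0x%x\n", arg4, arg5);

  /* Without this call, tracing would continue after the error. */
  exit(1);
}
```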

Pointer and Array Relationship

A scalar array is represented by a variable that's associated with the address of its first storage location. A pointer is also the address of a storage location with a defined type. Thus, D permits the use of the array [] index notation with both pointer variables and array variables. For example, the following two D fragments are equivalent in meaning:

p = &a[0]; trace(p[2]);

trace(a[2]); 

In the first fragment, the pointer p is assigned the address of the first element in scalar array a by applying the & operator to the expression a[0]. The expression p[2] traces the value of the third array element (index 2). Because p now contains the same address associated with a, this expression yields the same value as a[2], shown in the second fragment. One consequence of this equivalence is that D permits you to access any index of any pointer or array. If you access memory beyond the end of a scalar array's predefined size, you either get an unexpected result or DTrace reports an invalid address error.

The difference between pointers and arrays is that a pointer variable refers to a separate piece of storage that contains the integer address of some other storage, whereas an array variable names the array storage itself, not the location of an integer that in turn contains the location of the array.

This difference is manifested in the D syntax if you try to assign pointers and scalar arrays. If x and y are pointer variables, the expression x = y is legal; it copies the pointer address in y to the storage location that's named by x. If x and y are scalar array variables, the expression x = y isn't legal. Arrays can't be assigned as a whole in D. If p is a pointer and a is a scalar array, the statement p = a is permitted. This statement is equivalent to the statement p = &a[0].

Pointer Arithmetic

As in C, pointer arithmetic in D isn't identical to integer arithmetic. Pointer arithmetic implicitly adjusts the underlying address by multiplying or dividing the operands by the size of the type referenced by the pointer.

The following D fragment illustrates this property:

int *x;

BEGIN
{
  trace(x);
  trace(x + 1);
  trace(x + 2);
}

This fragment creates an integer pointer x and then traces its value, its value incremented by one, and its value incremented by two. If you create and run this program, DTrace reports the integer values 0, 4, and 8.

Because x is a pointer to an int (size 4 bytes), incrementing x adds 4 to the underlying pointer value. This property is useful when using pointers to reference consecutive storage locations such as arrays. For example, if x were assigned the address of an array a, the expression x + 1 would be equivalent to the expression &a[1]. Similarly, the expression *(x + 1) would reference the value a[1]. The D compiler applies pointer arithmetic whenever a pointer value is incremented by using the +, ++, or += operators. Pointer arithmetic is also applied when an integer is subtracted from a pointer on the left-hand side, when a pointer is subtracted from another pointer, or when the -- operator is applied to a pointer.

For example, the following D program would trace the result 2:

int *x, *y;
int a[5];

BEGIN
{
  x = &a[0];
  y = &a[2];
  trace(y - x);
}

Generic Pointers

Sometimes it's useful to represent or manipulate a generic pointer address in a D program without specifying the type of data referred to by the pointer. Generic pointers can be specified by using the type void *, where the keyword void represents the absence of specific type information, or by using the built-in type alias uintptr_t, which is aliased to an unsigned integer type of a size that's appropriate for a pointer in the current data model. You can't apply pointer arithmetic to an object of type void *, and these pointers can't be dereferenced without first casting them to another type. You can cast a pointer to the uintptr_t type when you need to perform integer arithmetic on the pointer value.

Pointers to void can be used in any context where a pointer to another data type is required, such as an associative array tuple expression or the right-hand side of an assignment statement. Similarly, a pointer to any data type can be used in a context where a pointer to void is required. To use a pointer to a non-void type in place of another non-void pointer type, an explicit cast is required. You must always use explicit casts to convert pointers to integer types, such as uintptr_t, or to convert these integers back to the appropriate pointer type.
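The casting rules above can be sketched with a short script. This is an illustration only: the variable names v, addr, and p are hypothetical, and x is an uninitialized global pointer, so the traced values are addresses computed from zero rather than addresses of real data.

```d
int *x;
void *v;

BEGIN
{
  /* Pointer arithmetic: the address advances by 2 * sizeof (int) bytes. */
  trace(x + 2);

  /* A pointer to int can be used where a pointer to void is required. */
  v = x;

  /* Integer arithmetic needs an explicit cast; here + 2 adds 2 bytes. */
  addr = (uintptr_t)x + 2;
  trace(addr);

  /* Converting the integer back to a pointer also needs an explicit cast. */
  p = (int *)addr;
  trace(p);

  exit(0);
}
```

None of these statements dereference a pointer, so the script doesn't trigger an invalid address error even though x is never set to a valid address.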

Pointers to DTrace Objects

The D compiler prohibits you from using the & operator to obtain pointers to DTrace objects such as associative arrays, built-in functions, and variables. You're prohibited from obtaining the address of these variables so that the DTrace runtime environment is free to relocate them as needed between probe firings. In this way, DTrace can more efficiently manage the memory required for programs. If you create composite structures, it's possible to construct expressions that retrieve the kernel address of DTrace object storage. Avoid creating such expressions in D programs. If you need to use such an expression, don't rely on the address being the same across probe firings.

Pointers and Address Spaces

A pointer is an address that provides a translation within some virtual address space to a piece of physical memory. DTrace runs D programs within the address space of the OS kernel itself. The Linux system manages many address spaces: one for the OS kernel itself, and one for each user process. Because each address space provides the illusion that it can access all the memory on the system, the same virtual address pointer value can be reused across address spaces, but translate to different physical memory. Therefore, when writing D programs that use pointers, you must be aware of the address space corresponding to the pointers you intend to use.

For example, if you use the syscall provider to instrument entry to a system call that takes a pointer to an integer or array of integers as an argument, such as pipe(), it isn't valid to dereference that pointer or array by using the * or [] operators, because the address in question is in the address space of the user process that performed the system call. Applying the * or [] operators to this address in D results in a kernel address space access, which either causes an invalid address error or returns unexpected data to the D program, depending on whether the address happens to match a valid kernel address.

To access user-process memory from a DTrace probe, you must apply one of the copyin, copyinstr, or copyinto functions. To avoid confusion, take care when writing D programs to name and comment variables storing user addresses appropriately. You can also store user addresses as uintptr_t so that you don't accidentally compile D code that dereferences them.
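As a hedged sketch of this approach, the following script handles the pipe() case described earlier. It saves the user address as a uintptr_t on entry and copies the two descriptors into DTrace-managed memory on return, after the kernel has filled them in. The wildcard pipe* is an assumption intended to match both pipe and pipe2, because which system call a process actually makes depends on the C library and architecture. The name user_fds is hypothetical.

```d
syscall::pipe*:entry
{
  /* arg0 is a user-space address: store it as an integer, not a pointer. */
  self->user_fds = (uintptr_t)arg0;
}

syscall::pipe*:return
/self->user_fds/
{
  /* Copy both ints from the user process; the result can be dereferenced. */
  this->fds = (int *)copyin(self->user_fds, 2 * sizeof (int));
  printf("%s: read fd %d, write fd %d\n", execname,
      this->fds[0], this->fds[1]);

  /* Release the thread-local variable. */
  self->user_fds = 0;
}
```

The copy is made in the return clause rather than the entry clause because the descriptor array isn't populated until the system call completes. If the call fails, the copied values are whatever the process had in that buffer beforehand.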