On almost all 32-bit architectures, the representation of a pointer is indistinguishable from the representation of some fixed-length number whose value is the byte address of the object pointed to. On such machines, the words “pointer” and “address” can be used interchangeably. However, architectures with smaller word sizes are often cramped for address space, so they may choose a pointer representation that breaks this identity, and allows a larger code address space.
For example, the Renesas D10V is a 16-bit VLIW processor whose instructions are 32 bits long [1]. If the D10V used ordinary byte addresses to refer to code locations, then the processor would only be able to address 64 KB of instructions. However, since instructions must be aligned on four-byte boundaries, the low two bits of any valid instruction's byte address are always zero, so byte addresses waste two bits. Instead of byte addresses, the D10V uses word addresses (byte addresses shifted right two bits) to refer to code. Thus, the D10V can use 16-bit words to address 256 KB of code space.
However, this means that code pointers and data pointers have different forms on the D10V. The 16-bit word 0xC020 refers to byte address 0xC020 when used as a data address, but refers to byte address 0x30080 when used as a code address.
(The D10V also uses separate code and data address spaces, which also affects the correspondence between pointers and addresses, but we're going to ignore that here; this example is already too long.)
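The arithmetic in this example can be sketched in a few lines of C. The helper names below are hypothetical and are not part of gdb; they only illustrate how the same 16-bit word maps to two different byte addresses.

```c
#include <stdint.h>

/* Hypothetical helpers, not gdb functions: they only illustrate the
   word-address arithmetic described for the D10V. */

/* A 16-bit code pointer holds a word address; shifting left by two
   bits recovers the byte address of the 32-bit instruction. */
static uint32_t code_pointer_to_byte_address(uint16_t pointer)
{
    return (uint32_t)pointer << 2;
}

/* A 16-bit data pointer already holds a byte address. */
static uint32_t data_pointer_to_byte_address(uint16_t pointer)
{
    return (uint32_t)pointer;
}
```

With these definitions, the word 0xC020 maps to 0x30080 as a code pointer and to 0xC020 as a data pointer, and the largest code pointer, 0xFFFF, reaches byte address 0x3FFFC, just under 256 KB.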
To cope with architectures like this (the D10V is not the only one!), gdb tries to distinguish between addresses, which are byte numbers, and pointers, which are the target's representation of an address of a particular type of data. In the example above, 0xC020 is the pointer, which refers to one of the addresses 0xC020 or 0x30080, depending on the type imposed upon it.
gdb provides functions for turning a pointer into an address
and vice versa, in the appropriate way for the current architecture.
Unfortunately, since addresses and pointers are identical on almost all processors, this distinction tends to bit-rot pretty quickly. Thus, each time you port gdb to an architecture which does distinguish between pointers and addresses, you'll probably need to clean up some architecture-independent code.
Here are functions which convert between pointers and addresses:
extract_typed_address: Treat the bytes at buf as a pointer or reference of type type, and return the address it represents, in a manner appropriate for the current architecture. This yields an address gdb can use to read target memory, disassemble, etc. Note that buf refers to a buffer in gdb's memory, not the inferior's.
For example, if the current architecture is the Intel x86, this function extracts a little-endian integer of the appropriate length from buf and returns it. However, if the current architecture is the D10V, this function will return a 16-bit integer extracted from buf, multiplied by four if type is a pointer to a function.
If type is not a pointer or reference type, then this function will signal an internal error.
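The two cases described above can be sketched as follows. This is a simplified illustration, not gdb's implementation: the `sketch_` names and the type-kind enum are hypothetical stand-ins for gdb's type machinery, `abort` stands in for the internal error, and the D10V-like case assumes a big-endian 2-byte pointer.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for gdb's type information. */
enum sketch_type_kind {
    SKETCH_INT,
    SKETCH_DATA_POINTER,
    SKETCH_FUNCTION_POINTER
};

/* x86-like case: read a 4-byte little-endian pointer and return it
   unmodified, whatever kind of pointer it is. */
static uint32_t sketch_extract_x86(const unsigned char *buf,
                                   enum sketch_type_kind kind)
{
    if (kind == SKETCH_INT)
        abort();  /* stands in for the internal error */
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}

/* D10V-like case: read a 2-byte pointer (big-endian is an assumption
   of this sketch) and multiply by four for function pointers. */
static uint32_t sketch_extract_d10v(const unsigned char *buf,
                                    enum sketch_type_kind kind)
{
    uint32_t raw;

    if (kind == SKETCH_INT)
        abort();  /* stands in for the internal error */
    raw = ((uint32_t)buf[0] << 8) | (uint32_t)buf[1];
    if (kind == SKETCH_FUNCTION_POINTER)
        raw *= 4;  /* word address to byte address */
    return raw;
}
```

Given the bytes 0xC0 0x20, the D10V-like function yields 0x30080 for a function pointer and 0xC020 for a data pointer.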
store_typed_address: Store the address addr in buf, in the proper format for a pointer of type type in the current architecture. Note that buf refers to a buffer in gdb's memory, not the inferior's.
For example, if the current architecture is the Intel x86, this function stores addr unmodified as a little-endian integer of the appropriate length in buf. However, if the current architecture is the D10V, this function divides addr by four if type is a pointer to a function, and then stores it in buf.
If type is not a pointer or reference type, then this function will signal an internal error.
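The D10V case of the store direction might look like the sketch below. As before, this is an illustration under stated assumptions rather than gdb's code: the `sketch_` names and the enum are hypothetical, `abort` stands in for the internal error, and the 2-byte pointer is assumed to be stored big-endian.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for gdb's type information. */
enum sketch_type_kind {
    SKETCH_INT,
    SKETCH_DATA_POINTER,
    SKETCH_FUNCTION_POINTER
};

/* D10V-like case: divide the byte address by four for function
   pointers, then store the result as a 2-byte pointer (big-endian
   is an assumption of this sketch). */
static void sketch_store_d10v(unsigned char *buf,
                              enum sketch_type_kind kind,
                              uint32_t addr)
{
    if (kind == SKETCH_INT)
        abort();  /* stands in for the internal error */
    if (kind == SKETCH_FUNCTION_POINTER)
        addr /= 4;  /* byte address to word address */
    buf[0] = (unsigned char)((addr >> 8) & 0xff);
    buf[1] = (unsigned char)(addr & 0xff);
}
```

Storing the code address 0x30080 as a function pointer and storing the data address 0xC020 as a data pointer both leave the bytes 0xC0 0x20 in buf, which is why the pointer's type is needed to interpret them.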
Assuming that val is a pointer, return the address it represents, as appropriate for the current architecture.
This function actually works on integral values, as well as pointers. For pointers, it performs architecture-specific conversions as described above for extract_typed_address.
Create and return a value representing a pointer of type type to the address addr, as appropriate for the current architecture. This function performs architecture-specific conversions as described above for store_typed_address.
Here are two functions which architectures can define to indicate the relationship between pointers and addresses. These have default definitions, appropriate for architectures on which all pointers are simple unsigned byte addresses.
Assume that buf holds a pointer of type type, in the appropriate format for the current architecture. Return the byte address the pointer refers to.
This function may safely assume that type is either a pointer or a C++ reference type.
Store in buf a pointer of type type representing the address addr, in the appropriate format for the current architecture.
This function may safely assume that type is either a pointer or a C++ reference type.
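The idea of an architecture-supplied conversion with a default fallback can be sketched as below. Everything here is hypothetical: the struct, the hook type, and the function names are invented for illustration and do not match gdb's actual definitions or signatures. The sketch also assumes big-endian pointers and reduces the type argument to a single flag.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook type: convert the pointer bytes in buf to a byte
   address. is_function_pointer stands in for the type argument. */
typedef uint32_t (*sketch_pointer_to_address_fn)(const unsigned char *buf,
                                                 size_t len,
                                                 int is_function_pointer);

/* Hypothetical per-architecture description. */
struct sketch_arch {
    size_t pointer_len;                               /* bytes per pointer */
    sketch_pointer_to_address_fn pointer_to_address;  /* NULL: use default */
};

/* Read len bytes as an unsigned big-endian integer (an assumption). */
static uint32_t sketch_read_unsigned(const unsigned char *buf, size_t len)
{
    uint32_t value = 0;
    size_t i;

    for (i = 0; i < len; i++)
        value = (value << 8) | buf[i];
    return value;
}

/* Default: every pointer is a simple unsigned byte address. */
static uint32_t sketch_default_pointer_to_address(const unsigned char *buf,
                                                  size_t len,
                                                  int is_function_pointer)
{
    (void)is_function_pointer;  /* the default ignores the pointer's type */
    return sketch_read_unsigned(buf, len);
}

/* D10V-like override: function pointers hold word addresses. */
static uint32_t sketch_d10v_pointer_to_address(const unsigned char *buf,
                                               size_t len,
                                               int is_function_pointer)
{
    uint32_t raw = sketch_read_unsigned(buf, len);
    return is_function_pointer ? raw << 2 : raw;
}

/* Architecture-independent entry point: use the hook if one is set. */
static uint32_t sketch_extract(const struct sketch_arch *arch,
                               const unsigned char *buf,
                               int is_function_pointer)
{
    sketch_pointer_to_address_fn hook = arch->pointer_to_address
        ? arch->pointer_to_address
        : sketch_default_pointer_to_address;
    return hook(buf, arch->pointer_len, is_function_pointer);
}
```

An architecture that leaves the hook unset gets the byte-address behavior, while a D10V-like architecture installs its own conversion; the architecture-independent caller does not need to know which one is in effect.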
[1] Some D10V instructions are actually pairs of 16-bit sub-instructions. However, since you can't jump into the middle of such a pair, code addresses can only refer to full 32-bit instructions, which is what matters in this explanation.