When generating a shared library, ld will by default generate import stubs suitable for use with a single sub-space application. The `--multi-subspace' switch causes ld to generate export stubs, and different (larger) import stubs suitable for use with multiple sub-spaces.
Long branch stubs and import/export stubs are placed by ld in stub sections located between groups of input sections. `--stub-group-size=N' specifies the maximum size, `N', of a group of input sections handled by one stub section. Since branch offsets are signed, a stub section may serve two groups of input sections, one group before the stub section, and one group after it. However, when using conditional branches that require stubs, it may be better (for branch prediction) that stub sections only serve one group of input sections. A negative value for `N' chooses this scheme, ensuring that branches to stubs always use a negative offset. Two special values of `N' are recognized, `1' and `-1'. These both instruct ld to automatically size input section groups for the branch types detected, with the same behaviour regarding stub placement as other positive or negative values of `N' respectively.
Note that `--stub-group-size' does not split input sections. A single input section larger than the group size specified will of course create a larger group (of one section). If input sections are too large, it may not be possible for a branch to reach its stub.
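The grouping rule can be illustrated with a small model. The following is only a sketch, assuming a simple greedy packing of consecutive sections; it is not ld's actual algorithm, and the function name `group_sections' and the byte sizes are hypothetical. It shows that sections are never split, so an oversized section ends up alone in a group that exceeds the limit.

```python
def group_sections(sizes, max_group):
    """Greedily pack consecutive section sizes into groups.

    Assumption for illustration only: a group is closed as soon as adding
    the next section would push it past max_group. Sections are never
    split, so a section larger than max_group forms an oversized group
    of one.
    """
    groups, current, total = [], [], 0
    for size in sizes:
        # Close the current group if this section would not fit in it.
        if current and total + size > max_group:
            groups.append(current)
            current, total = [], 0
        current.append(size)
        total += size
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # The 5000-byte section exceeds the 1000-byte limit and stays whole.
    print(group_sections([100, 200, 5000, 50, 60], 1000))
```

In this model the 5000-byte section is kept intact in its own group, which is the situation in which a branch may be unable to reach its stub.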
The `--vfp11-denorm-fix' switch enables a link-time workaround for a bug in certain VFP11 coprocessor hardware, which sometimes allows instructions with denorm operands (which must be handled by support code) to have those operands overwritten by subsequent instructions before the support code can read the intended values.
The bug may be avoided in scalar mode if you allow at least one intervening instruction between a VFP11 instruction which uses a register and another instruction which writes to the same register, or at least two intervening instructions if vector mode is in use. The bug only affects full-compliance floating-point mode: you do not need this workaround if you are using "runfast" mode. Please contact ARM for further details.
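The spacing rule above can be expressed as a small check. This is a simplified sketch, not the linker's scanner: the `Insn' record, the register names and the mnemonics are hypothetical, and the model only counts intervening instructions (one required in scalar mode, two in vector mode) without modelling any other condition of the erratum.

```python
from dataclasses import dataclass


@dataclass
class Insn:
    # Hypothetical, simplified instruction record used only for illustration.
    name: str
    is_vfp: bool = False
    uses: frozenset = frozenset()    # registers the instruction uses
    writes: frozenset = frozenset()  # registers the instruction writes


def find_hazards(insns, vector_mode=False):
    """Return (i, j) pairs where insns[j] writes a register used by the
    VFP instruction insns[i] with too few instructions in between."""
    required = 2 if vector_mode else 1  # intervening instructions needed
    hazards = []
    for i, first in enumerate(insns):
        if not first.is_vfp:
            continue
        # Positions i+1 .. i+required have fewer than `required`
        # instructions between them and insns[i].
        for j in range(i + 1, min(i + 1 + required, len(insns))):
            if first.uses & insns[j].writes:
                hazards.append((i, j))
    return hazards


if __name__ == "__main__":
    mac = Insn("fmacs", is_vfp=True, uses=frozenset({"s0"}))
    nop = Insn("nop")
    load = Insn("flds", writes=frozenset({"s0"}))
    print(find_hazards([mac, load]))                         # adjacent write
    print(find_hazards([mac, nop, load]))                    # one intervening
    print(find_hazards([mac, nop, load], vector_mode=True))  # needs two
```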
This workaround is enabled for scalar code by default for pre-ARMv7 architectures, but disabled by default for later architectures. If you know you are not using buggy VFP11 hardware, you can disable the workaround by specifying the linker option `--vfp11-denorm-fix=none'. If you are using VFP vector mode, you should specify `--vfp11-denorm-fix=vector'.
If the workaround is enabled, instructions are scanned for potentially-troublesome sequences, and a veneer is created for each such sequence which may trigger the erratum. The veneer consists of the first instruction of the sequence and a branch back to the subsequent instruction. The original instruction is then replaced with a branch to the veneer. The extra cycles required to call and return from the veneer are sufficient to avoid the erratum in both the scalar and vector cases.
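The veneer rewrite described above can be sketched on a toy instruction list. This is an illustration under stated assumptions, not ld's implementation: instructions are plain strings of pseudo-assembly, and the function `add_veneer' and the labels `veneer_0' and `insn_1' are hypothetical.

```python
def add_veneer(code, index, veneers):
    """Move code[index] into a new veneer and branch to it.

    The veneer holds the original instruction followed by a branch back
    to the instruction after it; the original slot becomes a branch to
    the veneer. Labels are hypothetical pseudo-assembly.
    """
    label = f"veneer_{len(veneers)}"
    veneers[label] = [code[index], f"b insn_{index + 1}"]
    code[index] = f"b {label}"
    return label


if __name__ == "__main__":
    code = ["fmacs s0, s1, s2", "flds s1, [r0]", "mov r1, r2"]
    veneers = {}
    add_veneer(code, 0, veneers)
    print(code)
    print(veneers)
```

The two extra branches are what separate the first instruction from its successor in time, which is why the same veneer shape serves both the scalar and vector cases.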