GB2639989A - Authentication code setting instruction and authentication code checking instruction

Info

Publication number: GB2639989A
Application number: GB2404673.2A
Authority: GB
Inventors: Eapen Jacob; John Smith Bradley; Christopher Jacques Botman François; Christopher Grocutt Thomas
Original assignee: ARM Ltd; Advanced Risc Machines Ltd
Current assignee: ARM Ltd
Priority date: 2024-04-02
Filing date: 2024-04-02
Publication date: 2025-10-08
Also published as: WO2025210337A1; GB202404673D0

Abstract

In response to an authentication code generating class of instruction, processing circuitry generates an authentication code associated with an authentication target value by applying a code generating function to the authentication target value, a key, a first modifier value and a second modifier value. For an authentication code setting instruction, the generated authentication code is written to a destination register. For an authentication code checking instruction, the processing circuitry checks whether the authentication code generated based on the authentication target value in response to the authentication code checking instruction corresponds to a reference authentication code obtained from a source register and triggers an error response when they do not correspond. The first modifier value comprises a stack-pointer dependent value. For at least one of the authentication code setting/checking instructions, the second modifier value comprises a program-counter-dependent value.

Description

AUTHENTICATION CODE SETTING INSTRUCTION AND AUTHENTICATION CODE CHECKING INSTRUCTION

The present technique relates to the field of data processing.

Some processing architectures support a class of authentication code generating instructions for generating an authentication code associated with a target value (such as an address pointer or function return address). This can help protect against return oriented programming (ROP) attacks or other attacks aimed at causing incorrect program behaviour by tampering with an address pointer or return address while it is stored in a memory system.

At least some examples of the present technique provide an apparatus comprising: instruction decoding circuitry to decode instructions; and processing circuitry to perform a processing operation in response to a decoded instruction decoded by the instruction decoding circuitry; in which: in response to an authentication code generating class of instruction, the processing circuitry is configured to generate an authentication code associated with an authentication target value by applying a code generating function to the authentication target value, a key, a first modifier value and a second modifier value; the authentication code generating class of instruction including: an authentication code setting instruction, for which the processing circuitry is configured to write the generated authentication code to a destination register separate from a register providing the authentication target value for the authentication code setting instruction; and an authentication code checking instruction, for which the processing circuitry is configured to check whether the authentication code generated based on the authentication target value in response to the authentication code checking instruction corresponds to a reference authentication code obtained from a source register separate from a register providing the authentication target value for the authentication code checking instruction, and trigger an error response when the generated authentication code does not correspond to the reference authentication code; wherein: the first modifier value comprises a value dependent on a stack pointer obtained from a stack pointer register; and for at least one of the authentication code setting instruction and the authentication code checking instruction, the second modifier value comprises a value dependent on a program counter address obtained from a program counter register.

At least some examples of the present technique provide computer-readable code for fabrication of an apparatus as discussed above. The code can be stored on a computer-readable storage medium. The storage medium may be a non-transitory storage medium.

At least some examples of the present technique provide a method comprising: performing a processing operation in response to a decoded instruction; in which: in response to the decoded instruction being an authentication code generating class of instruction, the processing operation comprises generating an authentication code associated with an authentication target value by applying a code generating function to the authentication target value, a key, a first modifier value and a second modifier value; in response to the decoded instruction being an authentication code setting instruction of said authentication code generating class of instruction, the processing operation comprises writing the generated authentication code to a destination register separate from a register providing the authentication target value for the authentication code setting instruction; and in response to the decoded instruction being an authentication code checking instruction of said authentication code generating class of instruction, the processing operation comprises checking whether the authentication code generated based on the authentication target value in response to the authentication code checking instruction corresponds to a reference authentication code obtained from a source register separate from a register providing the authentication target value for the authentication code checking instruction, and triggering an error response when the generated authentication code does not correspond to the reference authentication code; wherein: the first modifier value comprises a value dependent on a stack pointer obtained from a stack pointer register; and for at least one of the authentication code setting instruction and the authentication code checking instruction, the second modifier value comprises a value dependent on a program counter address obtained from a program counter register.

At least some examples provide a computer program for controlling a host data processing apparatus to provide an instruction execution environment for execution of target program code, the computer program comprising: instruction decoding program logic to decode an instruction of the target program code and control the host data processing apparatus to perform a processing operation in response to the decoded instruction; in which: in response to an authentication code generating class of instruction, the instruction decoding program logic is configured to control the host data processing apparatus to generate an authentication code associated with an authentication target value by applying a code generating function to the authentication target value, a key, a first modifier value and a second modifier value; the authentication code generating class of instruction including: an authentication code setting instruction, for which the instruction decoding program logic is configured to control the host data processing apparatus to write the generated authentication code to a simulated destination register separate from a simulated register providing the authentication target value for the authentication code setting instruction; and an authentication code checking instruction, for which the instruction decoding program logic is configured to control the host data processing apparatus to check whether the authentication code generated based on the authentication target value in response to the authentication code checking instruction corresponds to a reference authentication code obtained from a simulated source register separate from a simulated register providing the authentication target value for the authentication code checking instruction, and trigger an error response when the generated authentication code does not correspond to the reference authentication code; wherein: the first modifier value comprises a value dependent on a stack pointer obtained from a simulated stack pointer register; and for at least one of the authentication code setting instruction and the authentication code checking instruction, the second modifier value comprises a value dependent on a program counter address obtained from a simulated program counter register.

Further aspects, features and advantages of the present technique will be apparent from the following description of examples, which is to be read in conjunction with the accompanying drawings, in which:
Figure 1 illustrates an example of an apparatus;
Figure 2 illustrates an example of nested function calls;
Figure 3 illustrates processing operations performed in response to an authentication code setting instruction and an authentication code checking instruction of an authentication code generating class of instructions;
Figure 4 illustrates steps for performing a branch target check;
Figure 5 illustrates how an immediate value specified by the authentication code checking instruction offers a limited range for an adjustment to a program counter address;
Figure 6 illustrates an example of an authentication code generating class of instruction which applies a correction to generate one of a first modifier and a second modifier;
Figure 7 illustrates an example in which the authentication code checking instruction specifies a correction value to be applied to a stack pointer to generate a first modifier value;
Figure 8 illustrates an example in which the authentication code setting instruction specifies a correction value to be applied to a program counter address to generate a second modifier value;
Figure 9 illustrates an example in which the authentication code setting instruction specifies a second modifier value using a general purpose register; and
Figure 10 illustrates a simulation example.

An apparatus comprises instruction decoding circuitry to decode instructions; and processing circuitry to perform a processing operation in response to a decoded instruction decoded by the instruction decoding circuitry. For example, the instruction decoding circuitry and processing circuitry may support instructions defined according to a given instruction set architecture. In response to an authentication code generating class of instruction being decoded by the instruction decoding circuitry, the processing circuitry generates an authentication code associated with an authentication target value by applying a code generating function to the authentication target value, a key, a first modifier value and a second modifier value. The authentication code generating class of instruction includes an authentication code setting instruction, for which the processing circuitry is configured to write the generated authentication code to a destination register separate from a register providing the authentication target value for the authentication code setting instruction; and an authentication code checking instruction, for which the processing circuitry is configured to check whether the authentication code generated based on the authentication target value in response to the authentication code checking instruction corresponds to a reference authentication code obtained from a source register separate from a register providing the authentication target value for the authentication code checking instruction, and trigger an error response when the generated authentication code does not correspond to the reference authentication code. These instructions can be useful for protecting addresses or other values when stored out to memory (e.g. on a stack data structure) by software. 
The authentication code setting instruction can be included in the software code before the point at which the authentication target value (the value to be protected using the authentication code) is saved out to memory, and the authentication code checking instruction can be included in the software code after the point at which the authentication target value is loaded back from memory. The use of a separate register to store the generated authentication code for the authentication code setting instruction or the reference authentication code for the authentication code checking instruction, separate from the register used to provide the authentication target value itself, can be helpful for instruction set architectures where there are not enough spare bits in one register to accommodate an authentication code in the same register as the authentication target value. For example, for a 32-bit architecture, it is likely that address values to be protected as the authentication target value use up all 32 bits of one register and so a separate register is used to store the associated authentication code.
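
The behaviour of the two instructions described above can be sketched as a small software model. This is a minimal illustration under stated assumptions, not the claimed implementation: the code generating function is stood in for by a truncated HMAC-SHA256 (the description does not name a particular function), the register names `lr` and `r12`, the key and all addresses are hypothetical, and the program counter is held fixed between the two calls purely to keep the sketch short.

```python
import hashlib
import hmac

MASK32 = 0xFFFFFFFF  # assumed 32-bit architecture


class AuthError(Exception):
    """Stands in for the architectural error response."""


def code_gen(target, key, mod1, mod2):
    # Assumed stand-in for the code generating function: truncated HMAC-SHA256.
    msg = b"".join((v & MASK32).to_bytes(4, "little") for v in (target, mod1, mod2))
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:4], "little")


def auth_set(regs, key, rd, rt):
    # Setting instruction: write the code to rd, separate from target register rt.
    regs[rd] = code_gen(regs[rt], key, regs["sp"], regs["pc"])


def auth_check(regs, key, rs, rt):
    # Checking instruction: regenerate the code and compare with the reference in rs.
    if code_gen(regs[rt], key, regs["sp"], regs["pc"]) != regs[rs]:
        raise AuthError("authentication code mismatch")


KEY = b"hypothetical-key"
regs = {"sp": 0x20001000, "pc": 0x08000100, "lr": 0x08000455, "r12": 0}

auth_set(regs, KEY, rd="r12", rt="lr")    # before lr is saved out to memory
auth_check(regs, KEY, rs="r12", rt="lr")  # passes: target value unchanged

regs["lr"] = 0x08000999                   # simulate tampering while in memory
try:
    auth_check(regs, KEY, rs="r12", rt="lr")
    tamper_detected = False
except AuthError:
    tamper_detected = True
```

In this model the authentication code occupies its own 32-bit register, so the full 32 bits of the protected address remain available in the target register.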

The code generating function used to generate the authentication code may accept as inputs the authentication target value to be protected, a key, and first and second modifier values. Use of at least one modifier value in the code generating function can be helpful to reduce risk of a value protected with a given authentication code being used in a different scenario from the expected use of that value. For example, a function return address might be used for a branch other than the return branch for the function which caused the function return address to be set, or a data address pointer might be used in a different part of the program from the part expected to use that pointer. The modifier value can be based on a property associated with the current point of execution at which the authentication code is generated.

The first modifier value may comprise a value dependent on the stack pointer obtained from the stack pointer register. By using a stack-pointer-dependent value as a modifier in the code generating function, different authentication codes may be generated based on the same value of the authentication target value when the authentication code setting instruction is executed at different stack depths (e.g. corresponding to different nested positions within a nested set of function calls), which can be helpful for detecting cases where the authentication target value is used in a different manner to the one expected. For example, when protecting a function return address as the authentication target value, the stack pointer based modifier can be useful to tie that particular function return address more closely to a particular instance of the function call. However, it has been recognised that relying solely on a stack pointer to differentiate different use cases for a given authentication target value may not differentiate different functions called at the same stack depth. Hence, a second modifier is also provided which, for at least one of the authentication code setting instruction and the authentication code checking instruction, comprises a value dependent on a program counter address obtained from a program counter register. By considering a program-counter-dependent modifier, this can tie the authentication codes more precisely to a specific instance of a function, reducing risk that an attacker is able to circumvent the authentication check by substituting the authentication target value and authentication code actually intended to be used for a given instance of a function with a previously captured pair of authentication target value and authentication code captured on a different instance of a function, since at least one of the stack pointer and program counter is likely to be different between the different instances of the function. 
Hence, by providing support in an instruction set architecture for an authentication code generating class of instructions supporting use of both a stack-pointer-dependent value and a program-counter-dependent value as modifiers in the authentication code generating function, this supports software to be given greater protection against attack based on tampering with address values or other authentication target values stored in memory.
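
The effect of combining the two modifiers can be illustrated with a short sketch. It assumes a truncated HMAC-SHA256 as a stand-in for the code generating function, and the key, the stack pointer values and the instruction addresses are hypothetical. The same return address is protected in three contexts: a function at one stack depth, the same function at a deeper stack depth, and a different function at the original stack depth.

```python
import hashlib
import hmac


def code_gen(target, key, sp_mod, pc_mod):
    # Assumed stand-in for the code generating function: truncated HMAC-SHA256.
    msg = b"".join((v & 0xFFFFFFFF).to_bytes(4, "little") for v in (target, sp_mod, pc_mod))
    return hmac.new(key, msg, hashlib.sha256).digest()[:4]


KEY = b"hypothetical-key"
RET = 0x08000455  # the same return address is protected in every context

# (stack pointer, program counter) at the setting instruction
contexts = {
    "a_outer": (0x20001000, 0x08000100),
    "a_nested": (0x20000FE0, 0x08000100),  # same function, deeper stack
    "b_outer": (0x20001000, 0x08000200),   # different function, same stack depth
}

# Both modifiers in use: stack pointer and program counter.
both = {name: code_gen(RET, KEY, sp, pc) for name, (sp, pc) in contexts.items()}

# Stack pointer only (second modifier forced to zero for comparison).
sp_only = {name: code_gen(RET, KEY, sp, 0) for name, (sp, _pc) in contexts.items()}
```

With the stack pointer alone, `a_outer` and `b_outer` receive identical codes because their modifier inputs are identical; adding the program-counter-dependent modifier separates them.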

It is not essential for both the authentication code setting instruction and the authentication code checking instruction to use a second modifier value which depends on the program counter address. In practice, to ensure that the authentication code checking instruction generates the same authentication code value as the authentication code setting instruction when the authentication target value is used within the scope of the expected use case, software would need to ensure that the modifier values for the code generating function are the same for both the authentication code checking instruction and the authentication code setting instruction.

However, the responsibility to ensure that the modifiers are the same for the corresponding setting/checking instructions would lie with the software executed on the apparatus, not the hardware platform provided by the apparatus, which does not need to provide any circuit feature to guarantee that the same modifiers would be used for both instructions. Hence, the authentication code setting instruction may trigger the processing circuitry to carry out a first set of one or more operations implementing an authentication code setting function, and the authentication code checking instruction may trigger the processing circuitry to carry out a second set of one or more operations implementing an authentication code checking function. From the point of view of the hardware platform, each instruction triggers an independent set of operations, which take input values defined using registers or other parameters of the instruction, with no particular guarantee of how those values were generated by any earlier instructions. This means that, while it is useful for at least one of the authentication code setting instruction and the authentication code checking instruction to obtain as its second modifier value a value dependent on the program counter address, it is not necessary for both the authentication code setting function triggered by the authentication code setting instruction and the authentication code checking function triggered by the authentication code checking instruction to depend on a value obtained from the program counter register. In some examples, one of the authentication code setting function and authentication code checking function could obtain its second modifier value based on a value stored in a general purpose register, which is not guaranteed in the instruction set architecture to necessarily be dependent on the program counter address. 
In this case, the software developer or compiler would, to ensure the authentication code check can pass when the authentication target value is used correctly, need to ensure that other instructions are executed around the authentication code setting instruction or authentication code checking instruction to cause the value in the general purpose register to be set to a value that causes the second modifier to have the same value as the corresponding program-counter-dependent second modifier used by the other of the authentication code setting instruction or authentication code checking instruction. However, as noted above the particular sequence of instructions chosen to be executed in a particular software program by the software developer or compiler is not a feature of the hardware apparatus. Hence, it is not essential for the apparatus itself to comprise circuitry which requires both the authentication code setting/checking instructions to support a program-counter-dependent modifier.
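
The division of responsibility between hardware and software can be sketched as follows. In this hypothetical model the setting form reads its second modifier from the program counter, while the checking form reads it from a general purpose register and uses whatever value that register holds. The register names (`lr`, `r12`, `r3`), the key, the addresses and the truncated HMAC-SHA256 stand-in for the code generating function are all assumptions made for illustration.

```python
import hashlib
import hmac


def code_gen(target, key, mod1, mod2):
    # Assumed stand-in for the code generating function: truncated HMAC-SHA256.
    msg = b"".join((v & 0xFFFFFFFF).to_bytes(4, "little") for v in (target, mod1, mod2))
    return hmac.new(key, msg, hashlib.sha256).digest()[:4]


def auth_set(regs, key):
    # Setting form: second modifier read from the program counter register.
    regs["r12"] = code_gen(regs["lr"], key, regs["sp"], regs["pc"])


def auth_check_gpr(regs, key, rm):
    # Checking form: second modifier read from general purpose register rm.
    # The model uses whatever rm holds; False stands in for the error response.
    return code_gen(regs["lr"], key, regs["sp"], regs[rm]) == regs["r12"]


KEY = b"hypothetical-key"
regs = {"sp": 0x20001000, "pc": 0x08000100, "lr": 0x08000455, "r12": None, "r3": 0}

auth_set(regs, KEY)
pc_of_set = regs["pc"]

regs["pc"] = 0x08000180                       # execution reaches the checking instruction
unprepared = auth_check_gpr(regs, KEY, "r3")  # software did not prepare r3

regs["r3"] = pc_of_set                        # extra instruction(s) materialise the address
prepared = auth_check_gpr(regs, KEY, "r3")
```

The check passes only once surrounding software has placed the setting instruction's address in the register, which is the extra work the description attributes to the software developer or compiler.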

In some examples, the authentication code checking instruction may specify an immediate value, and one of the first modifier value and the second modifier value for the authentication code checking instruction depends on the immediate value applied as an adjustment to the stack pointer or the program counter address. Applying an adjustment based on an immediate value can be useful for allowing the authentication code check to be successful when the current value of the stack pointer or program counter address is expected to be different for the authentication code setting instruction and authentication code checking instruction respectively. Use of an immediate value (a value encoded directly in the instruction encoding, rather than being obtained from a register specified implicitly or explicitly by the instruction) can help conserve architectural registers for other purposes, which can be particularly useful in an architecture with constrained availability of architectural registers for specifying instruction operands.
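
As a rough sketch of such an adjustment, the model below derives the second modifier for the checking instruction by subtracting an immediate from the program counter address. The 8-bit unsigned immediate, its byte granularity and the direction of the adjustment are assumptions chosen for illustration; the description does not fix an encoding.

```python
IMM_BITS = 8                  # hypothetical immediate width
IMM_MAX = (1 << IMM_BITS) - 1


def check_second_modifier(pc_check, imm):
    # Second modifier for the checking instruction: PC adjusted back by the immediate.
    if not 0 <= imm <= IMM_MAX:
        raise ValueError(f"offset {imm:#x} not encodable in a {IMM_BITS}-bit immediate")
    return (pc_check - imm) & 0xFFFFFFFF


PC_SET = 0x08000100  # hypothetical address of the setting instruction

# Short function body: the immediate recovers the setting instruction's address.
near = check_second_modifier(PC_SET + 0x40, imm=0x40)

# Long function body: the required offset does not fit in the immediate.
try:
    check_second_modifier(PC_SET + 0x400, imm=0x400)
    far_encodable = True
except ValueError:
    far_encodable = False
```

The second case shows the limited range of an immediate: once the two instructions are further apart than the largest encodable offset, the immediate alone cannot align the modifiers.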

In some examples, one of the authentication code setting instruction and the authentication code checking instruction specifies a correction value for applying a correction to one of the stack pointer and the program counter address to generate a corresponding one of the first modifier and the second modifier. This correction could be in addition to, or instead of, the adjustment based on the immediate value as mentioned previously. It can be useful to support the ability to correct one of the stack pointer and the program counter address when generating the first modifier or the second modifier, to allow for use cases where the current value of the stack pointer or program counter address is expected to be different for the authentication code setting instruction and authentication code checking instruction respectively.

In particular, in some examples, the correction based on the correction value may enable the authentication code setting instruction and the authentication code checking instruction to be separated by an instruction address offset greater than a maximum instruction address offset representable using an immediate value of the authentication code checking instruction. While one might think that an immediate value would be considered the most appropriate mechanism by which the program counter offset between the addresses of the authentication code setting instruction and the authentication code checking instruction can be accounted for to allow the same modifier value to be used by the code generating functions of both types of instruction, an immediate value offers a limited range by which the program counter address can be corrected, and in some use cases with relatively long snippets of code executed within a function (which can be common when techniques such as loop unrolling or function inlining are used by compilers), that range may be insufficient to account for the address offset between the instruction addresses of the authentication code setting instruction and authentication code checking instruction. One might think that a simple solution could be to provide an instruction for which a general purpose register is used to define the program-counter-dependent modifier for the authentication code checking instruction, avoiding the need to apply any correction when taking the value from that general purpose register, since other instructions around the authentication code checking instruction could ensure that a value is placed in the general purpose register which is the same as the program counter address associated with the corresponding authentication code setting instruction. 
However, it has been recognised that this approach can cause a drop in processing performance because it may require more instructions to be executed to implement a given processing function, since compared to a case when an immediate value is used to define the second modifier in relation to the program counter address associated with the authentication code checking instruction, additional instructions may need to be included around the authentication code checking instruction to set the value in the general purpose register, ensure that the second modifier used for the authentication code checking instruction matches any program-counter-dependent second modifier value used for the authentication code setting instruction, and comply with any restrictions on register availability at the end of a function. This can increase code density (the number of instructions required to carry out a given program functionality).

Hence, it can be helpful to provide an instruction set architecture where one of the authentication code setting instruction and the authentication code checking instruction specifies a correction value for applying a correction to one of the stack pointer and the program counter address to generate a corresponding one of the first modifier and the second modifier. This approach may be seen as counter-intuitive because, compared to the options of merely applying an immediate to a program counter value or taking a modifier value directly from a general purpose register, it may seem unnecessarily complex to specify an additional correction value operand to be applied to a value obtained from a register to generate a corresponding modifier. However, such a correction can be helpful to improve program code density and improve processing performance in a case where both stack pointer and program counter dependent values are to be used as modifiers in the code generating function. The correction can enable the authentication code setting instruction and the authentication code checking instruction to be separated by an instruction address offset greater than a maximum instruction address offset representable using an immediate value of the authentication code checking instruction.

In some examples, the authentication code checking instruction specifies the correction value as a correction to be applied to the stack pointer to generate the first modifier value for the authentication code checking instruction. This approach may be seen as extremely counter-intuitive, as generally while one might expect the program counter address to vary between the corresponding instances of authentication code setting/checking instructions intended for setting and checking the authentication code for a given target value to be protected, one would normally expect the stack depth to be the same at entry and exit of a given function, so it would be surprising that it is useful to provide a correction to a stack pointer in the authentication code checking instruction. However, it is recognised that enabling the authentication code checking instruction to execute at a different stack depth to the corresponding authentication code setting instruction can be helpful to reduce program code density in architectures associated with a function calling convention which restricts which registers are allowed to be left corrupted at the end of a function call (where "corrupted" means that a register is set to a different value on exit from the function to the value specified by that register on entry to the function).
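
A minimal sketch of a stack pointer correction on the checking instruction is given below. It assumes a descending stack, a hypothetical 16-byte frame that is still allocated when the check executes, and a correction that is simply added to the stack pointer; none of these details is prescribed by the description.

```python
MASK32 = 0xFFFFFFFF


def first_modifier_set(sp):
    # Setting instruction: first modifier taken from the stack pointer as-is.
    return sp & MASK32


def first_modifier_check(sp, correction):
    # Checking instruction: stack pointer corrected before use as the first modifier.
    return (sp + correction) & MASK32


SP_AT_SET = 0x20001000   # hypothetical stack pointer at function entry
FRAME_BYTES = 16         # hypothetical frame still allocated at the check

sp_at_check = SP_AT_SET - FRAME_BYTES  # descending stack (assumption)

uncorrected = first_modifier_check(sp_at_check, 0)
corrected = first_modifier_check(sp_at_check, FRAME_BYTES)
```

Without the correction the two first modifier values differ and the check would fail for an untampered target value; with the correction the checking instruction reproduces the value seen by the setting instruction even though it executes at a different stack depth.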

In some examples, the authentication code setting instruction specifies the correction value as a correction to be applied to the program counter address to generate the second modifier value for the authentication code setting instruction. Again, this approach may be seen as counter-intuitive. Often a given function may have multiple exit points for the same function call entry point, so one would expect that it would be more efficient for an instruction set architecture to define the instructions so that any correction to the second modifier applied to account for the difference in program counter address between the authentication code setting/checking instructions would be applied at the authentication code checking instruction, since this would enable software to be developed in which authentication code checking instructions corresponding to different function call exit points specify different correction values aligning with the program counter address of the same authentication code setting instruction. However, it is recognised that to deal with the problem where an immediate value correcting the program counter address to generate the second modifier in the authentication code checking instruction may offer a limited range limiting the maximum functional distance between the corresponding instances of the authentication code setting/checking instructions, in an architecture where register usage constraints imposed by the limited number of registers available and any function calling conventions in use limit which registers can be used as general purpose registers by the authentication code checking instruction, it can be beneficial to program code density to provide an authentication code setting instruction which specifies the correction value as a correction to be applied to the program counter address to generate the second modifier value. 
Hence, surprisingly, such an authentication code setting instruction can justify its inclusion in the instruction set architecture (i.e. justify the instruction encoding space consumed by that instruction). By implementing the authentication code setting instruction to cause the processing circuitry, at the time of setting the authentication code, to do some of the work to align the values used as the second modifier value by the authentication code setting/checking instructions, the maximum distance between authentication code setting/checking instructions can be increased and program code density can be improved by reducing the need for as many instructions to be executed around the authentication code checking instruction. Improvements in code density can be helpful in reducing memory storage overheads and associated power consumption, as well as improving processing performance.
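
The alignment work done at the setting instruction can be sketched as follows. The model assumes a hypothetical 255-byte limit on the checking instruction's immediate, hypothetical instruction addresses, and a setting-side correction wide enough to cover the whole distance (for example because it is supplied from a general purpose register).

```python
MASK32 = 0xFFFFFFFF
CHECK_IMM_MAX = 255  # hypothetical range of the checking instruction's immediate


def second_modifier_set(pc_set, correction):
    # Setting instruction: program counter address plus the specified correction.
    return (pc_set + correction) & MASK32


def second_modifier_check(pc_check, imm=0):
    # Checking instruction: program counter address adjusted by a small immediate.
    if not 0 <= imm <= CHECK_IMM_MAX:
        raise ValueError("immediate out of range")
    return (pc_check - imm) & MASK32


PC_SET = 0x08000100    # hypothetical address of the setting instruction
PC_CHECK = 0x08002100  # hypothetical address of the checking instruction

distance = PC_CHECK - PC_SET  # larger than the checking immediate can express

mod_at_set = second_modifier_set(PC_SET, distance)
mod_at_check = second_modifier_check(PC_CHECK)
```

Here the setting instruction moves its modifier forward to the checking instruction's address, so the checking instruction needs neither a large immediate nor a spare general purpose register.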

In some examples, where the authentication code setting instruction specifies the correction value as a correction to be applied to the program counter address to generate the second modifier value for the authentication code setting instruction, the authentication code checking instruction may also specify an immediate value as a correction to be applied to the program counter address to generate the second modifier value for the authentication code checking instruction. Hence, in this case both the authentication code setting instruction and the authentication code checking instruction apply a correction to the program counter address to generate the respective second modifier values. This allows the second modifier value to take a value other than the address of either the authentication code setting instruction or the authentication code checking instruction, as the second modifier value could be any other address which can be referenced by the authentication code setting/checking instructions by applying their respective corrections relative to their respective values of the program counter address. As well as enabling an increased functional distance between the authentication code setting/checking instructions included in program code for protection of a given authentication target value, providing architectural support for both the setting/checking instructions applying a correction to generate the second modifier value can be helpful for enabling greater security in use cases where just in time translation is used. 
If only one of the setting/checking instructions applied a correction, and the other was restricted to using its program counter address value as the second modifier, there would be no flexibility for a just in time compiler to vary, between different instances of compiling a given function with the authentication code setting/checking instructions at particular program counter addresses, the address value used as the second modifier for those setting/checking instructions. However, by implementing the setting/checking instructions so that they both cause the processing circuitry to apply a correction to the program counter address based on correction values specified by the respective instructions, this can enable any address in a given range relative to the program counter addresses of the setting/checking instructions to be specified as the second modifier. Hence, a just in time compiler would have flexibility to change the address to be used as the second modifier value between one instance of compiling a given function and the next instance of compiling that same function. This makes it less likely that an attacker could save the authentication code computed on one instance of the function and use that authentication code to cause a successful authentication code check associated with a tampered authentication target value on another instance of the function. Hence, this approach can enable greater security for just in time compiled program code.
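
The just in time scenario can be sketched with a model in which both instructions apply a signed correction to their own program counter address so that they meet at a common anchor address chosen by the compiler. The anchor addresses, key, stack pointer and the truncated HMAC-SHA256 stand-in for the code generating function are assumptions made for illustration.

```python
import hashlib
import hmac

MASK32 = 0xFFFFFFFF


def code_gen(target, key, mod1, mod2):
    # Assumed stand-in for the code generating function: truncated HMAC-SHA256.
    msg = b"".join((v & MASK32).to_bytes(4, "little") for v in (target, mod1, mod2))
    return hmac.new(key, msg, hashlib.sha256).digest()[:4]


def second_modifier(pc, correction):
    # Both instructions correct their own program counter address.
    return (pc + correction) & MASK32


KEY = b"hypothetical-key"
SP = 0x20001000
RET = 0x08000455
PC_SET, PC_CHECK = 0x08000100, 0x08000200  # same addresses in both compilations


def codes_for(anchor):
    # Hypothetical JIT step: choose corrections so both instructions reach the anchor.
    set_corr, check_corr = anchor - PC_SET, anchor - PC_CHECK
    set_code = code_gen(RET, KEY, SP, second_modifier(PC_SET, set_corr))
    check_code = code_gen(RET, KEY, SP, second_modifier(PC_CHECK, check_corr))
    return set_code, check_code


first_set, first_check = codes_for(0x08000180)    # first compilation
second_set, second_check = codes_for(0x080001C0)  # recompilation, new anchor

replay_passes = first_set == second_check  # code captured earlier, replayed later
```

Each compilation is internally consistent, but a code captured under the first anchor no longer matches what the checking instruction regenerates under the second, even though the target value, stack pointer and instruction addresses are unchanged.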

In some examples, the correction value is specified as an immediate value. This can be helpful to avoid the need for a general purpose register to be consumed for the correction value, which in some use cases can risk reducing program code density due to additional instructions being included in program code to save/restore a value previously specified in that general purpose register for complying with function calling conventions in use.

However, in other examples it may be possible for the correction value to be specified as a value obtained from a general purpose register. For example, in cases when the authentication code setting instruction specifies the correction value, there may be less risk that program code would be made much larger due to use of a general purpose register for providing the correction (in comparison to cases where the authentication code checking instruction specifies the correction value). Also, in some cases it may still be acceptable for the authentication code checking instruction to specify the correction value using a general purpose register.

When one of the authentication code setting instruction and the authentication code checking instruction specifies the correction value using a general purpose register, this general purpose register could be a predetermined general purpose register which does not require explicit identification in the encoding of the instruction. For example, the instruction set architecture may prescribe that a certain general purpose register is used as the general purpose register providing the correction value.

Alternatively, the authentication code setting instruction or the authentication code checking instruction could specify a register field for identifying the general purpose register providing the correction value. This would allow compilers of software to vary which general purpose register is selected as the register providing the correction value.

Similarly, it will be appreciated that any of the following registers could be either a predetermined register not requiring explicit identification in the encoding of an instruction (fixed by design in the instruction set architecture), or could be a variably identified register identified based on a register field of the instruction encoding:

* the register providing the authentication target value for the authentication code setting instruction;
* the destination register to which the generated authentication code is written in response to the authentication code setting instruction;
* the register providing the authentication target value for the authentication code checking instruction; and
* the source register from which the reference authentication code is obtained in response to the authentication code checking instruction.

The stack pointer register providing the stack pointer for generating the first modifier value for the authentication code setting instruction or the authentication code checking instruction, and the program counter register providing the program counter address for generating the second modifier value for at least one of the authentication code setting instruction or the authentication code checking instruction may be implicitly defined registers which do not require explicit identification using register fields. For example, the stack pointer and/or program counter address may be implicit operands of the instruction deduced as being required from the opcode identifying the type of instruction being executed.

The correction applied to the stack pointer or the program counter address based on the correction value can be implemented in different ways. In some examples, the correction comprises an arithmetic correction to add the correction value to, or subtract the correction value from, said one of the stack pointer and the program counter address.

Alternatively, the correction may comprise a logical correction to generate a corrected value based on a logical combination of the correction value and said one of the stack pointer and the program counter address. For example, the correction value may specify a mask to be applied to bits of the stack pointer or program counter address, to cause one or more bits of the stack pointer or program counter address to be set to particular values to generate the corresponding modifier value. In some examples, the correction value and the stack pointer or program counter address may be combined logically using a given Boolean function (e.g. AND, NAND, OR, NOR, XOR, XNOR, NOT) or combination of such Boolean functions.
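
The arithmetic and logical alternatives described above could be modelled as follows. This is a minimal sketch only: the 32-bit wrap-around, the function name `apply_correction` and the mode strings are assumptions introduced for illustration and do not correspond to any defined instruction encoding.

```python
MASK32 = 0xFFFFFFFF  # assumes 32-bit stack pointer / program counter values


def apply_correction(base, correction, mode):
    """Return a modifier derived from base (SP or PC) and a correction value."""
    if mode == "add":   # arithmetic correction
        return (base + correction) & MASK32
    if mode == "sub":   # arithmetic correction
        return (base - correction) & MASK32
    if mode == "and":   # logical correction: mask selected bits to 0
        return base & correction & MASK32
    if mode == "or":    # logical correction: force selected bits to 1
        return (base | correction) & MASK32
    if mode == "xor":   # logical correction: toggle selected bits
        return (base ^ correction) & MASK32
    raise ValueError(f"unsupported correction mode: {mode}")
```

For instance, an "and" correction with the mask 0xFFFFF000 would clear the low 12 bits of the base value, so that every instruction within the same 4 KB-aligned region yields the same modifier.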

In some examples, in response to a non-sequential change of program flow to a branch target address being triggered when a branch target check is enabled, the processing circuitry is configured to trigger an error handling response when the instruction at the branch target address is not a member of a permitted class of branch target instructions. The authentication code setting instruction may be one of that permitted class of branch target instructions. Supporting a branch target check can be helpful to reduce likelihood of an attacker being able to tamper with a target address of a branch to cause branching into an arbitrary point in a program not intended to be a branch entry point. By restricting valid branch entry points to addresses where the instruction at the branch target address is one of a limited set of types of instructions allowed to act as branch target instructions, security can be improved. It can be particularly useful to allow the authentication code setting instruction to also act as one of the permitted class of branch target instructions, because often in implementations employing the authentication code generating instructions to protect function return addresses, the authentication code setting operation is the first operation performed after calling the function, and so having a single instruction perform the authentication code setting operation and also act as a valid "landing pad" for a branch target avoids the need to include two separate instructions for this purpose, thus improving program code density. However, in an instruction set architecture with restricted register availability at the end of a function call, the need to comply with function calling conventions can require additional saving/restoring of register contents to/from the stack which can, in some cases (e.g. where the authentication code setting/checking instructions are to be separated by a large distance in the program code), make it difficult to use the authentication code setting instruction as the very first instruction in the program code executed after calling the function, if the correction based on the correction value was not supported by one of the authentication code setting/checking instructions. This would negate the benefit of the option of using a combined instruction as both an authentication code setting instruction and a permitted branch target instruction. Hence, the option of applying a correction based on the correction value for generating one of the first/second modifiers can be particularly useful when a combined authentication code setting and branch target instruction is supported, because the correction helps to preserve the improvements in code density achievable using the combined authentication code setting and branch target instruction (see program code examples shown later).
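
The branch target check described above can be sketched in software as follows. The mnemonics `LANDING_PAD`, `AUTH_SET` and `PUSH`, and the representation of program code as a dictionary from address to mnemonic, are hypothetical and chosen only to illustrate the check.

```python
# Hypothetical mnemonics: the permitted class includes a dedicated landing
# pad instruction and the authentication code setting instruction.
PERMITTED_BRANCH_TARGETS = {"LANDING_PAD", "AUTH_SET"}


class BranchTargetFault(Exception):
    """Error handling response for a branch to a non-permitted instruction."""


def take_branch(program, target, check_enabled=True):
    """Return the new program counter, or raise if the target is not permitted."""
    mnemonic = program[target]
    if check_enabled and mnemonic not in PERMITTED_BRANCH_TARGETS:
        raise BranchTargetFault(f"{mnemonic} at {target:#x} is not a branch target")
    return target


# A function whose first instruction both sets the code and acts as landing pad.
program = {0x100: "AUTH_SET", 0x104: "PUSH", 0x108: "ADD"}
```

With this program, a branch to 0x100 is accepted without a separate landing pad instruction, whereas a branch into the middle of the function at 0x104 triggers the error handling response while the check is enabled.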

In some examples, in response to the authentication code checking instruction when the generated authentication code is detected as corresponding to the reference authentication code, the processing circuitry is configured to trigger a branch to an address specified by an operand of the authentication code checking instruction. Similar to the combined instruction mentioned above for supporting more efficient function entry, it can also be useful for multiple functionalities typically used at the end of a function call (e.g. verification of whether the authentication code associated with a function return address is safe to use, and branching to the function return address if verification is successful) to be combined into one instruction. However, again, in instruction set architectures typically used in conjunction with function calling conventions which impose restricted availability of registers that may remain uncorrupted at the end of the function, without the ability to apply the correction mentioned above it would be difficult to sustain a combined authentication code checking and return branch instruction, as an intervening instruction may be needed to restore the contents of a register used to provide an operand of the authentication code checking operation so that this register can remain uncorrupted at the point when the return branch occurs. However, the correction operation applied for generating one of the first and second modifiers as discussed above can be helpful to allow this intervening instruction to be avoided even if the authentication code setting instruction and authentication code checking instruction are separated by a relatively large distance in the program code, so that the option of a combined authentication code checking and return branch instruction can be retained and overall program code density can be improved.

A number of options are available for defining the authentication code setting instruction and the authentication code checking instruction according to the instruction set architecture supported by the instruction decoding circuitry and the processing circuitry. Any one or more of these options may be supported. It is possible that an instruction set architecture may support more than one of these options in combination (e.g. by supporting multiple variants for the authentication code setting instruction and/or authentication code checking instruction, distinguished by their opcode, a variant identifying value, and/or a modal control bit in a register which controls whether a given instance of the authentication code setting instruction and/or authentication code checking instruction behaves as one variant or another).

In some examples, for at least one variant of the authentication code setting instruction, the first modifier comprises the stack pointer and the second modifier comprises the program counter address; and for at least one variant of the authentication code checking instruction, the first modifier comprises the stack pointer and the second modifier comprises a result of correcting the program counter address based on an immediate value specified by the authentication code checking instruction. This option provides for improved security, compared to implementations using only one of the first and second modifiers, by reducing the likelihood that the same authentication code value is generated for different function instances. This option could be useful for function use cases where the function code is short enough that the address offset between the authentication code setting instruction and the authentication code checking instruction is small enough to be within the range able to be referenced using the immediate value of the authentication code checking instruction.

In some examples, for at least one variant of the authentication code setting instruction, the first modifier comprises the stack pointer and the second modifier comprises the program counter address; and for at least one variant of the authentication code checking instruction, the first modifier comprises a result of correcting the stack pointer based on a correction value specified by the authentication code checking instruction, and the second modifier comprises a value obtained from a general purpose register. This approach can help expand the range of address offsets between the addresses of the authentication code setting instruction and the authentication code checking instruction that can be handled to beyond the limit implied by an immediate value (since the general purpose register allows software to set the second modifier for the authentication code checking instruction to match the program-counter-dependent second modifier used for the authentication code setting instruction). The correction to the stack pointer to generate the first modifier for the authentication code checking instruction enables the authentication code setting instruction and the authentication code checking instruction to execute at points of program flow corresponding to different stack depths, which can help reduce the number of instructions compared to an implementation where the authentication code setting instruction and the authentication code checking instruction have to execute at the same stack depth to allow their first modifiers to match.
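
The effect of the stack pointer correction can be illustrated with a short sketch. It assumes a descending stack with 4-byte slots and uses made-up address values. The helper name `first_modifier` is hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumes a 32-bit stack pointer


def first_modifier(stack_pointer, correction=0):
    """First modifier: the stack pointer, optionally corrected."""
    return (stack_pointer + correction) & MASK32


sp_at_set = 0x2000_1000            # stack pointer when the code is set
modifier_at_set = first_modifier(sp_at_set)

# At the checking instruction, two 4-byte values are assumed to be still on
# the (descending) stack, so the stack pointer is 8 bytes lower.
sp_at_check = sp_at_set - 8
modifier_uncorrected = first_modifier(sp_at_check)
modifier_corrected = first_modifier(sp_at_check, correction=8)
```

Only the corrected modifier matches the one used when the code was set, so the checking instruction can execute before the remaining stack pops rather than after them.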

In some examples, for at least one variant of the authentication code setting instruction, the first modifier comprises the stack pointer and the second modifier comprises a result of correcting the program counter address based on a correction value specified by the authentication code setting instruction; and for at least one variant of the authentication code checking instruction, the first modifier comprises the stack pointer and the second modifier comprises a result of correcting the program counter address based on an immediate value specified by the authentication code checking instruction. This approach can offer further benefits for improved code density, by enabling a combined authentication code setting and permitted branch target instruction to be used even when the setting/checking instructions are separated by a relatively large address offset, as well as having a security benefit for just in time compilers as the fact that both the authentication code setting instruction and the authentication code checking instruction apply a correction to the program counter address to generate the second modifier value means an arbitrary intermediate address can be chosen as the second modifier, allowing different instances of compiling the same function to use different second modifiers, reducing risk of an attacker capturing and reusing an authentication code generated for an old instance of the same function to circumvent the authentication code check on a later instance.

The correction value for the authentication code setting instruction could be specified as either an immediate value or using a general purpose register.

In some examples, for at least one variant of the authentication code setting instruction, the first modifier comprises the stack pointer and the second modifier comprises a value specified in a general purpose register; and for at least one variant of the authentication code checking instruction, the first modifier comprises the stack pointer and the second modifier comprises a result of correcting the program counter address based on an immediate value specified by the authentication code checking instruction. This approach may simplify the operation for the authentication code setting instruction by avoiding the need to apply an arithmetic or logical operation for generating the second modifier, as the second modifier can be taken directly from a general purpose register. Nevertheless, this approach can still support improved program code density in cases where the distance between the authentication code setting instruction and the authentication code checking instruction is relatively large while a program counter dependent second modifier value is supported by the authentication code checking instruction.

When the authentication code check performed in response to the authentication code checking instruction fails (the generated authentication code does not correspond to the reference authentication code), the error response triggered by the processing circuitry could take various forms. For example, the error response could comprise signalling of a fault, e.g. raising an exception or interrupt which interrupts the processing being performed to switch to execution of an exception handler in a more privileged execution state. In other examples, the error response may not necessarily interrupt processing immediately, but could comprise an action which would later cause execution to fail. For example, the error response could be corrupting the value of the authentication target value to set it to a value which is expected to later cause a fault to be detected. For example, often the authentication target value may be an address (e.g. either an address used as a branch target address, which is expected to subsequently be used for fetching an instruction, or a data address pointer which is expected to subsequently be used to generate an address of a load/store operation). Therefore, one option is that, if the generated authentication code does not match the reference authentication code in response to the authentication code checking instruction, the authentication target value is set to an invalid address (e.g. an address value in an invalid address range not allowed to be used for real instruction or data addresses), so that any subsequent instruction fetch or load/store access based on the invalid address will cause a fault to be detected.
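
The two forms of error response described above might be modelled as follows. The policy names, the exception type and the particular invalid address value are assumptions for illustration. A real implementation would define its own invalid address range.

```python
INVALID_ADDRESS = 0xFFFF_FFFE  # hypothetical address in a never-valid range


class AuthenticationFault(Exception):
    """Models signalling a fault immediately on a failed check."""


def check_response(target, generated_code, reference_code, policy="fault"):
    """Return the value left in the target register after the check."""
    if generated_code == reference_code:
        return target                  # check passed, target is unchanged
    if policy == "fault":
        raise AuthenticationFault("authentication code mismatch")
    if policy == "poison":
        return INVALID_ADDRESS         # later fetch or load/store will fault
    raise ValueError(f"unknown policy: {policy}")
```

Under the "poison" policy the failure is deferred: execution continues until the corrupted address is actually used for an instruction fetch or a load/store access.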

The code generating function may be a cryptographic function applied to the authentication target value based on the key and the first and second modifier values. For example, the cryptographic function may be a cryptographic hash function such as one of the variants of the SHA or QARMA algorithms. In some cases, if the authentication target value or either of the first and second modifier values comprise fewer bits than the inputs required for these algorithms, they may be padded with an arbitrary set of bit values (e.g. all bits set to 0) to provide the required length. In some cases, the first and second modifier values may be concatenated or otherwise combined into a combined modifier value prior to input to the code generating function. This can allow a cryptographic hashing function to be applied which may be defined using only one modifier, but by setting that modifier based on both the stack-pointer-dependent first modifier value and the second modifier value (dependent on program counter for at least one of the setting/checking instructions) improved security can be provided for the authentication use case by tying the generated authentication code more strongly to a particular function instance. The key for the cryptographic function may be obtained from one or more system registers. In some examples, the key may be updatable by the processing circuitry in response to instructions executed in a particular operating state (e.g. an operating state executed with greater privilege). In some examples, the instruction set architecture may support the key being selected from among two or more alternative keys (e.g. defined in different control registers), selected depending on a parameter of the authentication code generating (setting/checking) instruction and/or on other architectural state such as a current operating state of the processing circuitry or key selecting information stored in a system register.
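
A minimal software sketch of such a code generating function is given below. It uses HMAC-SHA-256 from the Python standard library purely as a stand-in keyed function. The choice of HMAC, the 32-bit input widths, the zero padding to 64 bits, the concatenation order and the truncation to a 32-bit code are all assumptions for illustration, and do not describe the function any particular hardware implements.

```python
import hashlib
import hmac
import struct

MASK32 = 0xFFFFFFFF  # assumes 32-bit target and modifier values


def generate_code(target, key, first_modifier, second_modifier):
    """Return a 32-bit authentication code for the target value."""
    # Concatenate the two modifiers into one combined 64-bit modifier.
    combined = ((first_modifier & MASK32) << 32) | (second_modifier & MASK32)
    # Pack the target into a 64-bit field, i.e. zero-padded in the upper bits.
    message = struct.pack("<QQ", target & MASK32, combined)
    digest = hmac.new(key, message, hashlib.sha256).digest()
    # Truncate the digest to the assumed code width.
    return int.from_bytes(digest[:4], "little")


key = bytes(range(16))  # hypothetical key, e.g. as held in system registers
code = generate_code(0x1004, key, 0x2000_1000, 0x8000)
```

Changing either modifier, for example the program-counter-dependent value, yields an unrelated code with overwhelming probability, which is what ties the code to a particular function instance.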

The techniques discussed above may be implemented within an apparatus which has hardware circuitry provided for implementing the instruction decoding circuitry and processing circuitry discussed above. However, the same technique can also be implemented within a computer program which executes on a host data processing apparatus to provide an instruction execution environment for execution of target code. Such a computer program may control the host data processing apparatus to simulate the architectural environment which would be provided on a hardware apparatus which actually supports target code according to a certain instruction set architecture, even if the host data processing apparatus itself does not support that architecture. The computer program may have instruction decoding program logic which emulates functions of the instruction decoding circuitry and processing circuitry discussed above, e.g. by mapping the claimed authentication code generating (setting/checking) instructions of the target code onto corresponding sets of instructions according to the host instruction set architecture supported by the host data processing apparatus. Register accesses specified for the authentication code setting/checking instructions may be simulated based on mapping the simulated registers of the target instruction set architecture onto registers and/or memory provided by the host apparatus. Hence, references made to registers when discussing a hardware implemented embodiment may, when referring to a corresponding software simulation embodiment, be understood as referring to simulated registers mapped onto host storage by register emulating program logic of the simulation computer program.
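
By way of illustration only, a simulator of this kind might be structured as in the sketch below. The class name `Simulator`, the mnemonics `MOV` and `AUTH_SET`, the 16-entry register list, the 4-byte instruction size and the pluggable `code_fn` are all hypothetical. The sketch shows simulated registers mapped onto host storage and target instructions mapped onto host routines.

```python
SP, LR, PC = 13, 14, 15  # hypothetical register numbering for the target ISA
MASK32 = 0xFFFFFFFF


class Simulator:
    def __init__(self, code_fn):
        self.regs = [0] * 16    # host list backing the simulated registers
        self.code_fn = code_fn  # host routine standing in for code generation
        # Instruction decoding program logic: mnemonic -> host routine.
        self.handlers = {"MOV": self._mov, "AUTH_SET": self._auth_set}

    def step(self, mnemonic, *operands):
        """Execute one target instruction and advance the simulated PC."""
        self.handlers[mnemonic](*operands)
        self.regs[PC] = (self.regs[PC] + 4) & MASK32

    def _mov(self, rd, immediate):
        self.regs[rd] = immediate & MASK32

    def _auth_set(self, rd):
        # Target value from LR, first modifier from SP, second from PC.
        self.regs[rd] = self.code_fn(self.regs[LR], self.regs[SP], self.regs[PC])


# A toy, non-cryptographic code function is used here only to keep the
# example short; it is not suitable for real authentication.
sim = Simulator(lambda target, m1, m2: (target ^ m1 ^ m2) & MASK32)
sim.step("MOV", SP, 0x1000)
sim.step("MOV", LR, 0x204)
sim.step("AUTH_SET", 12)
```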

Such a simulation program can be useful, for example, when legacy code written for one instruction set architecture is being executed on a host processor which supports a different instruction set architecture. Also, the simulation can allow software development for a newer version of the instruction set architecture to start before processing hardware supporting that new architecture version is ready, as the execution of the software on the simulated execution environment can enable testing of the software in parallel with ongoing development of the hardware devices supporting the new architecture. The simulation program may be stored on a storage medium, which may be a non-transitory storage medium.

Specific examples are set out with reference to the drawings.

Figure 1 schematically illustrates an example of a data processing apparatus 2. The data processing apparatus has a processing pipeline 4 (an example of processing circuitry, which could for example form part of a CPU (Central Processing Unit)). The processing circuitry 4 is for executing instructions defined in an instruction set architecture (ISA) to carry out data processing operations represented by the instructions. The processing pipeline 4 includes a number of pipeline stages. In this example, the pipeline stages include a fetch stage 6 for fetching instructions from an instruction cache 8; a decode stage 10 (an example of instruction decoding circuitry) for decoding the fetched program instructions to generate micro-operations (decoded instructions) to be processed by remaining stages of the pipeline; an issue stage 12 for checking whether operands required for the micro-operations are available in a register file 14 and issuing micro-operations for execution once the required operands for a given micro-operation are available; an execute stage 16 for executing data processing operations corresponding to the micro-operations, by processing operands read from the register file 14 to generate result values; and a writeback stage 18 for writing the results of the processing back to the register file 14. It will be appreciated that this is merely one example of possible pipeline architecture, and other systems may have additional stages or a different configuration of stages. For example in an out-of-order processor a register renaming stage could be included for mapping architectural registers specified by program instructions or micro-operations to physical register specifiers identifying physical registers in the register file 14. In some examples, there may be a one-to-one relationship between program instructions defined in the ISA that are decoded by the decode stage 10 and the corresponding micro-operations processed by the execute stage. 
It is also possible for there to be a one-to-many or many-to-one relationship between program instructions and micro-operations, so that, for example, a single program instruction may be split into two or more micro-operations, or two or more program instructions may be fused to be processed as a single micro-operation.

The execute stage 16 includes a number of processing units, for executing different classes of processing operation. For example the execution units may include a scalar arithmetic/logic unit (ALU) 20 for performing arithmetic or logical operations on scalar operands read from the registers 14; a floating point unit 22 for performing operations on floating-point values; a branch unit 24 for evaluating the outcome of branch operations and adjusting the program counter which represents the current point of execution accordingly; and a load/store unit 26 for performing load/store operations to access data in a memory system 8, 30, 32, 34.

The execute stage 16 also includes a code generating unit 27 for generating an authentication code in response to the authentication code generating class of instructions. For example, the code generating unit 27 may implement a given cryptographic hash function such as SHA or QARMA.

In this example, the memory system includes a level one data cache 30, the level one instruction cache 8, a shared level two cache 32 and main system memory 34. It will be appreciated that this is just one example of a possible memory hierarchy and other arrangements of caches can be provided (and in some implementations, there may not be any such caches 30, 32 at all). The specific types of processing unit 20 to 27 shown in the execute stage 16 are just one example, and other implementations may have a different set of processing units or could include multiple instances of the same type of processing unit so that multiple micro-operations of the same type can be handled in parallel. It will be appreciated that Figure 1 is merely a simplified representation of some components of a possible processor pipeline implementation, and the processor may include many other elements not illustrated for conciseness.

In some examples, the ISA supported by the processing circuitry 4 may be a 32-bit architecture, for which scalar general purpose registers are 32-bit registers and addresses used to identify memory system locations are 32-bit addresses. The registers 14 may comprise a set of general purpose registers available for use for providing general purpose operands for instructions executed by the processing circuitry 4. The registers 14 may also comprise system registers for storing control state information.

For example, the ISA may support sixteen general purpose registers R0-R15, of which R13, R14 and R15 are also used as special purpose registers:

R13: SP (stack pointer register), for storing a stack pointer address pointing to a location on a stack data structure stored in the memory system.

R14: LR (link register), for storing a function return address for a function. The function return address is set when calling a function and represents the address to which program flow should return following completion of the function.

R15: PC (program counter register), for storing a program counter address representing an address of an instruction at a current point in program flow.

It will be appreciated that other registers could also be supported.

Figure 2 illustrates an example of program flow involving nested function calls. The flow begins at the instruction address "#add1" corresponding to a branch-with-link instruction BL used to make a function call to a given function, fn A. The BL instruction is a calling branch instruction, which when decoded causes the processing circuitry 4 to write a function return address (the address of the next sequential instruction after BL) to the link register (LR, an alias for R14 as noted above). In this example, instructions comprise 32 bits, so the address of the next instruction is #add1 + 4 bytes. Hence, the value of #add1 + 4 is written to the link register as a function return address. It will be appreciated that the interval between instruction addresses would depend on the number of bits used in the instructions. For example, in an instruction set with 64-bit instruction encodings, the return address could be #add1 + 8 bytes.

The program flow then continues through fn A before encountering another BL instruction at the instruction address #add2 to call another function, fn B. Hence, as the BL instruction at #add2 would overwrite the link register, prior to executing the BL at #add2, the software includes at least one store instruction to save the contents of the link register (LR) (i.e. the return address #add1+4 corresponding to fn A) to a software-maintained stack data structure in memory. In some cases, if other register contents associated with fn A are at risk of being overwritten by fn B, these may also be stored to the stack before calling fn B. The stack pointer in the SP register (R13) can be used to identify the addresses in memory to which the contents of the LR and any other registers are to be written. When data is pushed to the stack, the stack pointer is updated accordingly.

After saving any register contents that need to be preserved for fn A to the stack, the BL instruction at #add2 is executed which causes the address (#add2 + 4) of the next sequential instruction to be written to the link register as another function return address, and causes the program flow to branch to the target address of that BL instruction, corresponding to the start of function fn B. At the end of fn B, a return branch instruction is encountered. The return branch obtains its target address from the link register, so when taken directs the program flow back to #add2 + 4. In other words, the section of code in fn A is resumed from the point after the instruction that called fn B. Having returned to fn A, the software code within fn A includes at least one stack-pointer-controlled load instruction to pop the previously saved return address (and any other register contents) from the software-maintained stack in memory. When data is popped from the stack, the stack pointer is updated in the opposite direction to the direction in which the stack pointer is updated when pushing data to the stack (hence, the stack is managed as a last-in-first-out structure). The return address (#add1 + 4) popped from the stack is restored to the link register (and other register contents can also be restored based on information saved to the stack before calling fn B). At the end of fn A, another return branch instruction is encountered and this causes a return branch to a target address obtained from the LR, i.e. program flow now resumes from #add1 + 4, the instruction after the calling branch which originally called fn A.
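
The nested call sequence of Figure 2 can be traced with a small model. The sketch assumes 4-byte instructions, a descending stack, and made-up numeric values for #add1, #add2 and the function entry points. The class and method names are hypothetical.

```python
INSTRUCTION_BYTES = 4  # 32-bit instruction encodings, as in the example above


class Core:
    def __init__(self, stack_pointer):
        self.lr = 0
        self.sp = stack_pointer
        self.memory = {}  # address -> value, standing in for the stack memory

    def bl(self, pc, target):
        """Calling branch: write the return address to LR, return the new PC."""
        self.lr = pc + INSTRUCTION_BYTES
        return target

    def push_lr(self):
        self.sp -= 4                  # assumed descending stack
        self.memory[self.sp] = self.lr

    def pop_lr(self):
        self.lr = self.memory[self.sp]
        self.sp += 4                  # updated in the opposite direction

    def ret(self):
        """Return branch: the target address is taken from LR."""
        return self.lr


ADD1, ADD2, FN_A, FN_B = 0x100, 0x200, 0x1F0, 0x300  # illustrative addresses
core = Core(stack_pointer=0x2000_1000)
returns = []
pc = core.bl(ADD1, FN_A)     # call fn A: LR = #add1 + 4
core.push_lr()               # save fn A's return address before calling fn B
pc = core.bl(ADD2, FN_B)     # call fn B: LR = #add2 + 4
returns.append(core.ret())   # return from fn B into fn A
core.pop_lr()                # restore fn A's return address
returns.append(core.ret())   # return from fn A to its caller
```

The recorded return targets are #add2 + 4 followed by #add1 + 4, and the stack pointer ends at its starting value.
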
To ensure that "callee" function code provided by one developer can be called by "caller" program code provided by another developer, the ISA supported by the processing circuitry 4 may define, or be associated with, certain function calling conventions setting agreed rules on how the general purpose registers can be used by the caller code and callee code respectively.

This helps the developers of both the caller code and the callee code understand whether it is necessary for them to save or restore contents of certain registers to/from the stack before or after a function call or function return. For example, it may be agreed that general purpose registers R0-R3 are available for use as function arguments or function return values, for allowing the caller and callee code to pass values between each other. Registers R4-R11 may be callee-saved registers which are required to remain uncorrupted at the end of the function, so that upon executing a return branch to return to the caller code, the caller code can assume that these callee-saved registers will specify the same register contents as before the corresponding function call was made. Hence, if the callee code wishes to modify the contents of the callee-saved registers R4-R11 (relatively likely due to register pressure), it needs to include stack push instructions to save the caller-defined contents of these registers to the stack before overwriting the caller-defined contents, and stack pop instructions to restore the caller-defined contents of these registers from the stack before executing the return branch. On the other hand, register R12 may be a caller-saved register which is allowed to be corrupted during the function call, so that the caller code cannot guarantee that, on returning from the function, R12 still stores the same value that was specified in register R12 before calling the function. Hence, the callee code implementing the function does not need to include any stack save/restore instructions for protecting the contents of register R12, and instead if the caller code needs to rely on retaining the value of R12 from before the function call, the caller code would need to include a stack push instruction before the function calling branch BL for calling the function, and include a stack pop instruction after the corresponding function return. Registers R13-R15 are reserved for special purposes, as the SP, LR and PC respectively as mentioned above.

Hence, the function calling convention constrains the way in which registers can be used by the software executing on the apparatus 2. It will be appreciated that the particular split of registers into caller-saved registers and callee-saved registers may vary and so the example above with R4-R11 as callee-saved registers and register R12 as caller-saved register is just one possible implementation.

It will also be appreciated that the function calling convention may not necessarily be reflected in any particular circuitry of the apparatus 2, and may merely be an understood set of rules to be observed by software developers when developing software to execute on the apparatus 2. However, once an understood function calling convention is established, the function calling convention commonly in use for existing software code may be a constraint to be considered when designing future updates to the ISA supported by the processing circuitry 4, as it may be preferable that legacy code written according to the function calling convention continues to function even when new instructions are supported.

Figure 2 shows how nesting of function calls may require software to include instructions for saving/restoring register contents to/from a stack structure in the memory system. When information is stored in the memory system, it is more vulnerable to attack than when it is stored in the registers 14, because depending on the system implementation an attacker may be able to gain physical access to the memory system to tamper with the stored information and/or may be able to compromise program code involving memory usage errors or other security vulnerabilities to cause data stored out in the stack structure to be overwritten incorrectly. Such attacks may risk incorrect program function and loss of data security.

For example, if the return address pushed from the link register is modified while stored on the stack, then when it is later loaded back into the link register the corresponding return branch will branch to the wrong location in memory, causing execution of an incorrect instruction which could potentially be an instruction from "gadget" code provided by the attacker to cause malicious operations to be performed. This can be known as a "return oriented programming" (ROP) attack. Similar attacks may be a risk when a data address pointer is stored to the stack from a register, and is modified while stored in the memory system, so that after subsequently restoring that pointer to the registers, a load/store operation is performed to the wrong address in memory, potentially allowing access to a memory system location that should not have been accessed.

Therefore, it can be useful for the ISA to provide a countermeasure against such attacks. As shown in Figure 3, the instruction decoder 10 and processing circuitry 4, 16 may support a class of authentication code generating instructions which includes at least one variant of an authentication code setting instruction and at least one variant of an authentication code checking instruction. It is possible that other types of authentication code generating instructions could also be supported.

In general, each variant of authentication code generating instruction, when decoded by the instruction decoding circuitry 10, controls the processing circuitry 4 to generate an authentication code corresponding to an authentication target value obtained from a source register, based on a code generating function applied to the authentication target value, a cryptographic key, a first modifier value and a second modifier value.

If the authentication code generating instruction is an authentication code setting instruction (PAC instruction), the generated authentication code is written to a destination register which is a general purpose register separate from the register used as the source register providing the authentication target value.

If the authentication code generating instruction is an authentication code checking instruction (AUT instruction), the generated authentication code (generated from the authentication target value obtained from a first source register) is compared with a reference authentication code obtained from a second source register which is a different general purpose register to the register providing the authentication target value, and if the generated authentication code does not match the reference authentication code, then an error response is triggered. For example, the processing circuitry 4 triggers an exception or fault condition which may cause interruption of processing and branching to an exception handler to deal with the cause of the fault.
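By way of illustration only, the PAC and AUT behaviour described above can be modelled in software as follows. This is a minimal sketch under stated assumptions, not the hardware implementation: HMAC-SHA256 truncated to 32 bits stands in for the code generating function, and the names `generate_code`, `pac`, `aut` and `AuthenticationFault`, together with the key and register values, are hypothetical.

```python
import hashlib
import hmac
import struct


class AuthenticationFault(Exception):
    """Models the error response triggered by a failed AUT check."""


def generate_code(target: int, key: bytes, mod1: int, mod2: int) -> int:
    """Stand-in code generating function (assumption): truncated HMAC-SHA256."""
    message = struct.pack("<III", target, mod1, mod2)
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "little")


def pac(target: int, key: bytes, mod1: int, mod2: int) -> int:
    """PAC: return the code that would be written to the destination register."""
    return generate_code(target, key, mod1, mod2)


def aut(target: int, reference: int, key: bytes, mod1: int, mod2: int) -> None:
    """AUT: regenerate the code and fault if it differs from the reference."""
    if generate_code(target, key, mod1, mod2) != reference:
        raise AuthenticationFault("authentication code mismatch")


KEY = bytes(range(16))                     # hypothetical 128-bit key
lr, sp, pc = 0x0800_1235, 0x2000_FFF0, 0x0800_2000

r12 = pac(lr, KEY, sp, pc)                 # code written to destination register
aut(lr, r12, KEY, sp, pc)                  # untampered target value: no fault

try:
    aut(lr + 0x40, r12, KEY, sp, pc)       # tampered return address
    tamper_detected = False
except AuthenticationFault:
    tamper_detected = True
```

In this model the destination of `pac` and the second source of `aut` are ordinary Python values; in the described apparatus they would be general purpose registers distinct from the register holding the authentication target value.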

In some examples, a variant of the PAC/AUT instructions can be provided where the source register used to provide the authentication target value is a fixed register which is implicit as part of the ISA instruction definition, so does not require a register field to identify the source register providing the authentication target value. For example, for one variant, the authentication target value could implicitly be obtained from the link register (LR) so that the authentication target value protected against tampering using the authentication code is a function return address. It is also possible to provide at least one variant of the PAC/AUT instructions for which a register field of the instruction encoding specifies which source register provides the authentication target value. This could be used, for example, for cases where a data address pointer is to be protected using an authentication code.

Similarly, the destination register of the PAC instruction or the second source register providing the reference authentication code for the AUT instruction could be either an implicitly defined predetermined general purpose register not requiring a specific register field in the instruction, or could be variably selected by software using a register field encoded in the instruction. Given the function calling convention mentioned above, it can be useful if the register used for the destination register of the PAC instruction or the second source register of the AUT instruction is one of the caller-saved registers, e.g. register R12 in the example above, as this avoids any need to restore contents of R12 after executing the AUT instruction, which could be helpful to enable use of combined branch/AUT instructions as discussed later on. Hence, in some examples the destination register of the PAC instruction and second source register of the AUT instruction may be fixed to be register R12 by default.

The cryptographic key used for the code generating function is obtained from at least one system register which is updatable by instructions executed in at least one operating state of the processing circuitry (not all operating states may allow update of the cryptographic key, e.g. write access to the registers storing the key may be restricted to more privileged operating states). The cryptographic key may, for example, be a value with more bits than the number of bits in one general purpose register. For example, the authentication target value, first modifier value and second modifier value may be 32-bit values, but the key may be a longer value, e.g. 64 bits or 128 bits. In some cases, the cryptographic key may be selected from among two or more alternative keys based on a current operating state of the processor, information specified in the PAC/AUT instruction itself (e.g. a key selecting field or an opcode distinguishing variants of PAC/AUT functions which are to use different keys), and/or other register state stored in registers 14.

The first and second modifier values are used to tweak the cryptographic hash function used as the code generating function, to increase the likelihood that even if the authentication target value and key are the same for two different function call scenarios, different authentication codes are generated by the code generating function, so that the authentication code is specific to a particular instance of a function call. This greatly reduces the probability that an attacker can circumvent the authentication code check of the AUT instruction by saving an authentication code generated for one instance of a function call and reusing that previously saved authentication code when seeking authentication of a tampered authentication target value in another instance of a function call. The first modifier value for the PAC and AUT instructions depends on a current stack pointer value specified in the stack pointer register SP at the time the PAC/AUT instructions are executed. For at least one of the PAC and AUT instructions (not necessarily both, as shown in subsequent examples), the second modifier value depends on a current program counter address value from the PC register. Using both stack-pointer-dependent and program-counter-dependent tweaks to the code generating function can provide greater security because the authentication code is tied to a specific instance of a function at a given address (represented by the program counter address) when called at a given function call depth (represented by the current value of the stack pointer), which given a reasonably secure code generating function (such as SHA or QARMA) makes it statistically improbable that an attacker can substitute a previously saved authentication code for a current authentication code and use that to circumvent authentication code checks.
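The effect of the two modifiers on replay attempts can be illustrated with a short software model. This is a sketch only: truncated HMAC-SHA256 is assumed as a stand-in for the code generating function, and the key, addresses and the name `generate_code` are hypothetical.

```python
import hashlib
import hmac
import struct


def generate_code(target: int, key: bytes, mod1: int, mod2: int) -> int:
    """Stand-in code generating function (assumption): truncated HMAC-SHA256."""
    message = struct.pack("<III", target, mod1, mod2)
    return int.from_bytes(hmac.new(key, message, hashlib.sha256).digest()[:4], "little")


KEY = bytes(range(16))              # hypothetical key
RETURN_ADDRESS = 0x0800_1235        # same target value in every call below

# Code generated on entry to a function at PC 0x0800_2000 with SP 0x2000_FFF0.
saved_code = generate_code(RETURN_ADDRESS, KEY, 0x2000_FFF0, 0x0800_2000)

# Codes that an AUT check would generate in three later situations.
same_instance = generate_code(RETURN_ADDRESS, KEY, 0x2000_FFF0, 0x0800_2000)
deeper_call = generate_code(RETURN_ADDRESS, KEY, 0x2000_FFD0, 0x0800_2000)
other_function = generate_code(RETURN_ADDRESS, KEY, 0x2000_FFF0, 0x0800_3000)

# A replayed code is accepted only when both modifiers match the original instance.
replay_accepted = {
    "same instance": saved_code == same_instance,
    "different stack depth": saved_code == deeper_call,
    "different function address": saved_code == other_function,
}
```

With a code generating function of reasonable quality, only the "same instance" entry would be expected to be true, since the other two differ in the stack-pointer-dependent or program-counter-dependent modifier.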

The particular way in which the modifier values are used to tweak the cryptographic hash function can vary depending on the particular algorithm used. However, in general a cryptographic hash function may accept as an input at least one modifier value. Hence, the at least one modifier value of the cryptographic hash function may be set based on bits obtained from the first modifier value and second modifier value. For example, the first modifier value and second modifier value can be concatenated to form the modifier for the cryptographic hash function (and if necessary to expand the concatenated value to the required bit width, padded with additional bits equal to a given value such as 0 or 1 or any arbitrary sequence of bits). Alternatively, if the combined bit width of the first modifier value and second modifier value is greater than the modifier width supported by the cryptographic hash function used, some bits could be omitted or bits from the first modifier value and second modifier value could be combined using a logical operation such as a Boolean function.
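The options mentioned above for forming the modifier input can be sketched as follows. The function names, the 64-bit and 72-bit widths and the choice of XOR as the Boolean function are assumptions made purely for illustration; the actual combination depends on the algorithm used.

```python
MASK32 = 0xFFFF_FFFF


def concat_modifiers(mod1: int, mod2: int, width: int = 64, pad_bit: int = 0) -> int:
    """Concatenate two 32-bit modifiers and pad the upper bits up to `width`."""
    if width < 64:
        raise ValueError("width too small for plain concatenation")
    combined = ((mod1 & MASK32) << 32) | (mod2 & MASK32)
    pad_bits = width - 64
    padding = ((1 << pad_bits) - 1) if pad_bit else 0
    return (padding << 64) | combined


def fold_modifiers(mod1: int, mod2: int) -> int:
    """Combine with a Boolean function (XOR here) when only 32 bits are accepted."""
    return (mod1 ^ mod2) & MASK32


def truncate_modifiers(mod1: int, mod2: int, keep_bits: int = 16) -> int:
    """Omit upper bits of each modifier, then concatenate the retained bits."""
    mask = (1 << keep_bits) - 1
    return ((mod1 & mask) << keep_bits) | (mod2 & mask)


sp, pc = 0x2000_FFF0, 0x0800_2000
tweak_concat = concat_modifiers(sp, pc)                  # 0x2000FFF008002000
tweak_padded = concat_modifiers(sp, pc, width=72, pad_bit=1)
tweak_folded = fold_modifiers(sp, pc)                    # 0x2800DFF0
tweak_truncated = truncate_modifiers(sp, pc)             # 0xFFF02000
```

Omitting or folding bits reduces how strongly the code is bound to the full SP and PC values, so concatenation would generally be preferred where the modifier width allows it.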

In an expected use case, at least one instance of the PAC instruction would be included in program code at, or shortly after, entry into a callee function (e.g. at or shortly after the points marked fn A or fn B in Figure 2), and at least one instance of an AUT instruction would be included in software code at, or shortly before, the return branch at the end of a callee function (e.g. at or just before the return branch back to #add2 + 4 or #add1 + 4 in Figure 2). Having generated the authentication code in response to the PAC instruction, if necessary to free up the register to be used for other values, a subsequent stack push instruction may push the generated authentication code to the stack and a stack restore instruction may restore the authentication code to a register before executing the corresponding AUT instruction. Although this may make the authentication code itself vulnerable to attack, the cryptographic hash function may be designed to make it sufficiently improbable that the attacker can modify both the authentication target value (e.g. function return address or data address pointer) and the corresponding authentication code in a manner which enables the authentication code generated from the tampered authentication target value and the tampered authentication code used as reference authentication code for the AUT instruction to match in the authentication code check.

Figure 4 illustrates another security measure, for implementing a branch target check. Program code may be written expecting that branch instructions should branch only to certain designated function entry points. However, an attacker may try to compromise the branch target address of a function calling branch BL or other type of branch so that a non-sequential transition of program flow occurs which causes branching to a different address not intended by the software developer to be the target of that branch. The attacker could use this to try to circumvent certain protections implemented in software near the start of a function, by branching to an arbitrary point in the middle of the function. Hence, it may be desirable to restrict the addresses to which valid branches are allowed to be taken. The processing circuitry 4 may therefore support the option of enabling a branch target check to be performed following a non-sequential change of program flow. Such a non-sequential change of program flow could be caused by an explicit branch instruction (e.g. an instruction with an opcode indicating a branch type of instruction), but could also be triggered by other types of instructions (e.g. arithmetic instructions or load instructions) which are not specifically a branch instruction, but which could for one encoding of a register field specify the program counter (PC) register R15 as their destination register so that the update to the PC causes the next instruction to be fetched from an address that is not sequential with the address of the previously executed instruction (e.g. the next instruction address may be non-sequential if it is not offset from the address of the previous instruction by an amount, e.g. 4, corresponding to the size of one instruction). In some cases branch target checks may be considered to be permanently enabled. In other examples, branch target checks may not always be enabled.
For example, whether a branch target check is enabled for a given instance of a branch or other non-sequential change of program flow could depend on the type of instruction executed to cause the non-sequential change of program flow and/or on control state information stored in registers 14 (e.g. information defining a current operating state or mode).

Hence, as shown in Figure 4, if at step 50 a non-sequential change of program flow is detected to a given branch target address when the branch target check is enabled, at step 52 the processing circuitry 4 determines whether the instruction at the branch target address is a member of a permitted class of branch target instructions. The ISA may support at least one branch target instruction that is designated as being allowed to be the target of a non-sequential change of program flow when branch target checks are enabled. Other instructions may not be considered to be a member of the permitted class of branch target instructions. For example, in some cases a permitted branch target instruction can be an instruction which, other than marking the valid branch entry point, does not have any other architectural function, and behaves architecturally as a no-operation (NOP) instruction, so leaves all architectural state unchanged. The permitted class of branch target instructions could also include at least one type of branch target instruction that also has an architectural function to cause the processing circuitry to perform an associated processing operation. For example, a variant of the PAC instruction, indicated as a PACBTI instruction (pointer-authentication-code / branch target instruction), may serve as the PAC instruction mentioned earlier and may also be treated as a valid branch target when branch target checks are enabled. The ISA could also support at least one non-BTI variant of a PAC instruction which does not serve as one of the permitted class of branch target instructions. Hence, if the instruction at the branch target address is one of the permitted class of branch target instructions (e.g. PACBTI or another type of permitted branch target instruction), then at step 54 the branch target check is considered to have passed and there is no need to trigger any error handling response. Program flow may continue from the instruction at the branch target address. If the instruction at the branch target address is not one of the permitted class of branch target instructions, then the branch target check is considered to have failed and an error handling response is triggered (e.g. a fault is signalled), to prevent execution continuing beyond the point at which the failed branch target check is detected (since the failed branch target check is an indication that there may have been an attack).
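The branch target check of steps 50 to 54 can be modelled in software along the following lines. This is a sketch for illustration: the dictionary-based program image, the mnemonics used as instruction identifiers and the names `branch_target_check` and `BranchTargetFault` are hypothetical.

```python
class BranchTargetFault(Exception):
    """Models the error handling response for a failed branch target check."""


# Permitted class of branch target instructions in this model (assumption).
PERMITTED_BRANCH_TARGETS = {"bti", "pacbti"}


def branch_target_check(program: dict, target_address: int,
                        check_enabled: bool = True) -> None:
    """Check the instruction at the target of a non-sequential change of flow."""
    if not check_enabled:
        return                                   # check not enabled: nothing to do
    mnemonic = program[target_address]           # step 52: inspect the landing pad
    if mnemonic not in PERMITTED_BRANCH_TARGETS:
        raise BranchTargetFault(f"invalid branch target at {target_address:#x}")
    # step 54: check passed, program flow continues from target_address


# Hypothetical function image: address -> mnemonic.
program = {
    0x1000: "pacbti",   # valid function entry point
    0x1004: "push",
    0x1008: "add",
    0x100C: "pop",
    0x1010: "bxaut",
}

branch_target_check(program, 0x1000)             # lands on PACBTI: passes

try:
    branch_target_check(program, 0x1008)         # lands mid-function
    mid_function_branch_blocked = False
except BranchTargetFault:
    mid_function_branch_blocked = True
```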

The use of a combined PACBTI instruction which serves as both an authentication code generating instruction and a valid landing pad enabling the branch target check to be passed following a non-sequential change of program flow can be helpful for reducing the total number of instructions needed in program code which uses both the authentication code generating instructions and the branch target check.

Similarly, as shown in the dashed lines in Figure 3, a variant of the AUT instruction, denoted as an indirect-branching AUT instruction, or BXAUT instruction, can be supported which combines the authentication code checking operation described above for the AUT instruction with a branch to a branch target address specified using an operand of the BXAUT instruction. For example, the operand used to define the branch target address could be the same as the authentication target value provided for the AUT operation, so that the BXAUT operation both checks whether a return address provided as the authentication target value is consistent with the corresponding reference authentication code provided as a second operand and, conditional on the authentication code check being successful, triggers a branch to the return address provided as the authentication target value. Again, use of a combined BXAUT instruction can be helpful to improve code density by reducing the number of instructions needed to perform both the authentication code checking functionality and the branch.

Hence, in one example use case, the combined PACBTI and BXAUT instructions can be used in a program code sequence as follows:

Program code example 1

    label:
    1: pacbti r12, lr, sp, pc
    2: push {r0, lr}
    3: [...]
    4: pop {r0, lr}
    5: bxaut r12, lr, sp, <label>

where:

- instruction 1 is a PACBTI instruction specifying (implicitly or explicitly) register r12 as the destination register to be updated with the authentication code generated based on the authentication target value (function return address in link register lr), key, first modifier (equal to stack pointer sp) and second modifier (equal to program counter pc).

- instruction 2 pushes register contents (including the function return address in link register lr) to the stack, in case these registers are modified within the function body code.

- instruction(s) 3 represents the function body code, which can comprise any arbitrary instructions depending on the purpose of the function.

- instruction 4 restores the link register by popping the return address from the stack.

- instruction 5 is a BXAUT instruction which compares the reference authentication code obtained from the second source register r12 with a generated authentication code generated from the restored return address in link register lr (authentication target value in a first source register), the stack pointer as a first modifier value, and a second modifier value derived by applying an immediate offset (represented by <label> in the mnemonic above) to the current program counter value. If the generated authentication code matches the reference authentication code, a branch is triggered based on the function return address in the link register. If the generated authentication code does not match the reference authentication code, an error response is taken.
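The five-instruction sequence of Example 1 can be traced with a small software model, shown here as a sketch under stated assumptions rather than as the architected behaviour. Truncated HMAC-SHA256 stands in for the code generating function; the addresses, the 0x10 distance between the two instructions, the dictionary used as a stack and the names `run_example1` and `AuthenticationFault` are hypothetical.

```python
import hashlib
import hmac
import struct

MASK32 = 0xFFFF_FFFF


class AuthenticationFault(Exception):
    """Models the error response of a failed BXAUT check."""


def generate_code(target: int, key: bytes, mod1: int, mod2: int) -> int:
    """Stand-in code generating function (assumption): truncated HMAC-SHA256."""
    message = struct.pack("<III", target, mod1, mod2)
    return int.from_bytes(hmac.new(key, message, hashlib.sha256).digest()[:4], "little")


def run_example1(stack_tamper=None) -> int:
    """Trace instructions 1-5; return the address branched to by BXAUT."""
    key = bytes(range(16))
    sp, lr = 0x2000_FFF0, 0x0800_1235
    label = 0x0800_2000                      # address of the PACBTI instruction
    stack = {}

    r12 = generate_code(lr, key, sp, label)  # 1: pacbti r12, lr, sp, pc
    sp -= 8                                  # 2: push {r0, lr} (r0 not modelled)
    stack[sp + 4] = lr
    if stack_tamper is not None:             # 3: body; attacker overwrites stack
        stack[sp + 4] = stack_tamper
    lr = stack[sp + 4]                       # 4: pop {r0, lr}
    sp += 8

    pc_bxaut = label + 0x10                  # 5: bxaut r12, lr, sp, <label>
    imm = label - pc_bxaut                   # immediate offset back to the label
    mod2 = (pc_bxaut + imm) & MASK32
    if generate_code(lr, key, sp, mod2) != r12:
        raise AuthenticationFault("return address failed authentication")
    return lr                                # branch to verified return address


returned_to = run_example1()

try:
    run_example1(stack_tamper=0x0800_BEEF)
    attack_blocked = False
except AuthenticationFault:
    attack_blocked = True
```

The model relies on the stack pointer at instruction 5 being equal to its value at instruction 1, and on the immediate offset recovering the PACBTI address, which is why both modifiers match in the untampered case.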

Example 1 above is best for performance if the number of instructions at point 3 in the function code body is small enough that the immediate value used to define <label> in the BXAUT instruction has enough bits that the second modifier value of the BXAUT instruction can match the program counter value PC from the PACBTI instruction.

However, as shown in Figure 5, an immediate value #imm defined in the instruction encoding of the BXAUT instruction may have a limited range, as there may only be a limited number of bits available for encoding the immediate value in the instruction encoding. In practice, compiling techniques such as loop unrolling (explicitly defining sets of instructions multiple times for respective loop iterations of the same high level program code loop to reduce the overhead of loop control instructions) and function inlining (inserting instructions corresponding to a called function explicitly into the middle of a corresponding caller sequence so that the caller-callee-caller sequence is processed using sequentially processed instructions without branching, to eliminate the overhead of the function call/returns) mean that it is relatively common that the address offset between the PC addresses of the PACBTI instruction and BXAUT can exceed the range available for referencing using the immediate value of the BXAUT instruction.
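The range limitation can be expressed as a simple check that a compiler or assembler might perform. The sketch below assumes a signed immediate of a hypothetical width (8 bits is used only as an example; the actual encoding width is not specified here), and the function names are illustrative.

```python
def fits_signed_immediate(offset: int, bits: int) -> bool:
    """Return True if `offset` is representable as a signed `bits`-bit value."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return low <= offset <= high


def can_use_label_immediate(pc_pacbti: int, pc_bxaut: int, imm_bits: int) -> bool:
    """Check whether BXAUT can reach the PACBTI address with its immediate."""
    offset = pc_pacbti - pc_bxaut      # backwards offset, so normally negative
    return fits_signed_immediate(offset, imm_bits)


IMM_BITS = 8                           # hypothetical immediate width

short_function = can_use_label_immediate(0x1000, 0x1010, IMM_BITS)   # offset -16
long_function = can_use_label_immediate(0x1000, 0x1400, IMM_BITS)    # offset -1024
```

Under these assumed numbers, the short function body is within range while the longer one (for example after loop unrolling or inlining) is not, which is the situation the following paragraphs address.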

One might think that a solution to this problem would be to use a general purpose register in the BXAUT instruction to identify the second modifier value, so that prior to executing the BXAUT instruction some instructions can be included to set this general purpose register to the value corresponding to the program counter used as second modifier value for the corresponding PACBTI instruction. However, a problem with this approach is that the function calling convention used may restrict how many registers are available to be corrupted at the end of the function code. As noted above, in some cases only R0-R3 (which may be needed for function arguments or return values, so may not always be available) and R12 (which may already be needed for providing the reference authentication code for the BXAUT instruction) may be allowed to remain corrupted at the point when the function return branch is triggered, so this does not leave any spare registers allowed to be corrupted in cases where R0-R3 are already needed for function return values generated by the function code body to be available to the caller after the return branch.

Hence, without support for variants of the PACBTI/BXAUT instructions as discussed further below, this may preclude being able to use the combined PACBTI and BXAUT instructions and the software developer/compiler may have to resort to use of separate instructions for representing the branch entry point (BTI) and the authentication code setting operation (PAC), and separate instructions for representing the authentication code checking operation (AUT) and the subsequent return branch (BX). For example, this may lead to a program code implementation as follows:

Program code example 2

    label:
    1a: bti
    1b: str r2, [sp, #-4]!
    1c: pac r12, lr, sp, pc
    2: push {r0, lr}
    3: [...]
    4: pop {r0, lr}
    5a: adr r2, <label>
    5b: aut r12, lr, sp, r2
    5c: ldr r2, [sp], #4
    5d: bx lr

where:

- instructions 2-4 are the same as in Example 1.

- the previous instruction 1 (PACBTI) has been replaced with 3 instructions:

    o 1a: the BTI instruction representing the valid branch entry point;

    o 1b: a store instruction to preserve the contents of register R2 by saving them to the stack;

    o 1c: a separate PAC instruction implementing the PAC operation as in instruction 1 above, but without also acting as a valid branch target instruction.

- the previous instruction 5 (BXAUT) has been replaced with 4 instructions:

    o 5a: ADR instruction generates, as the second modifier value to be used for the subsequent AUT instruction, an address computed by applying the immediate offset <label> to the program counter address associated with the ADR instruction. While the ADR instruction is shown immediately preceding the AUT instruction, it can be moved earlier in the function code if necessary to ensure that the address offset between the PAC instruction 1c and the ADR instruction 5a is within the range capable of being specified by the immediate value <label> for the ADR instruction. This could require additional instructions for saving/restoring the generated address value if register R2 will need to be used for other purposes between the point at which the ADR instruction is included and the point at which register R2 is used to provide the modifier for the AUT instruction.

    o 5b: AUT instruction carrying out the authentication code checking operation as in the BXAUT instruction above, but without triggering a branch if the check is successful.

    o 5c: LDR instruction to restore the contents of register R2 from the stack (that were previously saved by the store instruction STR at 1b).

    o 5d: BX instruction to trigger the return branch to the return address in the link register LR, which has previously been verified using the AUT instruction at 5b.

Hence, in this case, use cases where the PAC/AUT instructions are separated by more than the maximum range supported by the immediate value <label>, and where there are no spare registers available to be corrupted at the end of the function, require at least an additional 5 instructions per function call, since the 2 instructions 1 and 5 in the previous example have been replaced with the 7 instructions 1a-1c and 5a-5d in this example (as noted above, depending on the position at which the ADR instruction 5a is inserted, additional instructions might be needed to push/pop register R2 to the stack to preserve the generated second modifier value for later use by the AUT instruction).

Figure 6 illustrates an encoding option which enables this problem to be addressed. Figure 6 illustrates functionality of an authentication code generating instruction, which could be either the authentication code setting instruction (PAC or PACBTI) or the authentication code checking instruction (AUT or BXAUT). As in Figure 3, a generated authentication code is generated by applying a code generating function to the authentication target value obtained from a source register, a key, a first modifier value and a second modifier value. Again, the first modifier value is dependent on the current stack pointer value in the SP register and the second modifier value for at least one of the PAC[BTI] and [BX]AUT instructions is dependent on the current program counter address in the PC register (here the notation PAC[BTI] refers to a PAC instruction which could optionally also act as a branch target instruction, and notation [BX]AUT refers to an AUT instruction which could optionally also act as an indirect branch instruction, i.e. PAC[BTI] is shorthand for PAC or PACBTI and [BX]AUT is shorthand for AUT or BXAUT).

However, in the example of Figure 6, for one of the PAC/AUT instructions, one of the first modifier value and second modifier value is corrected by applying a correction based on a correction value specified by that one of the PAC/AUT instructions. This correction could be instead of, or in addition to, an adjustment of the program counter based on an immediate value #imm specified by the [BX]AUT instruction discussed with respect to Figure 5, and in general is a correction applied to enable the first and second modifiers to be equal for the corresponding instances of the PAC[BTI] and [BX]AUT instructions even when the instructions are separated by a greater address offset than could be referenced by a single immediate value in the [BX]AUT instruction. The correction could be applied arithmetically (e.g. by adding the correction value to the one of the stack pointer (SP) and program counter (PC) being corrected or subtracting the correction value from the one of the SP and PC being corrected), or could be applied logically, e.g. by using the correction value as a bitmask indicating bits of the SP/PC to be masked out or by otherwise combining the correction value and the SP or PC according to a Boolean function or combination of one or more Boolean functions.
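The arithmetic and logical forms of correction mentioned above can be sketched as follows. The function names and the example values are hypothetical, and 32-bit wrap-around is assumed for the SP and PC values.

```python
MASK32 = 0xFFFF_FFFF


def correct_add(value: int, correction: int) -> int:
    """Arithmetic correction: add the correction value to the SP or PC."""
    return (value + correction) & MASK32


def correct_sub(value: int, correction: int) -> int:
    """Arithmetic correction: subtract the correction value from the SP or PC."""
    return (value - correction) & MASK32


def correct_mask(value: int, bitmask: int) -> int:
    """Logical correction: mask out the bits of the SP or PC set in the bitmask."""
    return value & ~bitmask & MASK32


# SP is 4 bytes lower at the check than at code generation: add 4 back.
sp_corrected = correct_add(0x2000_FFEC, 4)            # 0x2000FFF0

# PC at the check is 0x410 bytes past the code setting instruction.
pc_corrected_sub = correct_sub(0x0800_2410, 0x410)    # 0x08002000

# Alternatively, mask out the low 12 bits so both PCs map to the same value.
pc_corrected_mask = correct_mask(0x0800_2410, 0xFFF)  # 0x08002000
```

The masking form only yields equal modifiers when both instructions lie within the same aligned region (4 KB in this example), which is one reason an arithmetic correction may be preferred in some implementations.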

Figures 7 to 9 show several alternative variants of the authentication code setting instruction (PAC or PACBTI) and authentication code checking instruction (AUT or BXAUT) that can be supported to enable improved code density of implementations which face the problem of limited immediate range and restricted register availability as discussed above. In each of these examples, apart from the way in which the first and second modifier values are generated based on operands of the instructions, the functionality of the PAC[BTI] and [BX]AUT instructions is the same as discussed earlier.

Figure 7 shows a variant where the PAC[BTI] instruction uses the SP as the first modifier value and the PC as the second modifier value (without correction in both cases). The [BX]AUT instruction in this example specifies the correction value using an immediate value and applies the correction based on the correction value to the stack pointer to generate the first modifier value. The second modifier value for the [BX]AUT instruction is a value obtained from a general purpose register, which could be a fixed general purpose register (e.g. register R2 by ISA definition) or a variably defined register selected based on a register field of the [BX]AUT instruction. In the program code example 3 below, the general purpose register is R2. Hence, for the [BX]AUT instruction, the operation corresponding to the [BX]AUT instruction does not require the second modifier value to be program-counter-dependent (there is nothing in the hardware circuitry that will do anything to ensure for the [BX]AUT instruction that the value in the referenced general purpose register has previously been set based on a program counter value).

While, in the expected software use case, the developer or compiler will have preceded the [BX]AUT instruction with one or more other instructions that cause the value in the general purpose register to correspond with the PC value used as second modifier value by an earlier PAC[BTI] instruction, this is not enforced in hardware.

It may be extremely counter-intuitive that a correction to the stack pointer, while supporting a general purpose register used for the second modifier in the [BX]AUT instruction, would be helpful for addressing the problems of limited immediate range for representing the offset between the PAC[BTI]/[BX]AUT instructions and the lack of available registers around the AUT instruction.

However, as indicated in Example 3 below, correcting the SP for the AUT instruction enables the PAC[BTI] and [BX]AUT instructions to execute at different stack depths, eliminating the need for a stack push instruction (STR instruction 1b from Example 2) to execute before the PAC instruction, and hence enabling use of the combined PACBTI instruction to improve program code density:

Program code example 3

    label:
    1: pacbti r12, lr, sp, pc
    2: push {r0, lr}
    3: [...]
    4: pop {r0, lr}
    1b: str r2, [sp, #-4]!
    5a: adr r2, <label>
    5b: aut r12, lr, r2, [sp, #-4]
    5c: ldr r2, [sp], #4
    5d: bx lr

In this example, instructions 1, 2, 3, 4 are the same as in Example 1 and instructions 5a-5d are the same as in Example 2, except that the AUT instruction at 5b additionally applies the correction to the stack pointer. Instruction 1b, which was needed before the PAC instruction 1c for preserving the contents of register R2 in Example 2, can be moved down to be executed after the PAC instruction, because it is no longer necessary for the AUT instruction at 5b to execute at the same stack depth (with the same current value of the stack pointer SP) as the PACBTI instruction at 1. By moving instruction 1b after the PAC instruction at 1, the branch entry point and PAC instruction no longer need to be different instructions and so the combined PACBTI instruction can be used, improving program code density.
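The role of the stack pointer correction in Example 3 can be traced with the following sketch. It is an illustrative model only: truncated HMAC-SHA256 stands in for the code generating function, the addresses are hypothetical, and the correction is modelled simply as the number of bytes pushed since the PACBTI instruction being added back to SP. The sign convention of the immediate in the listing above is not modelled.

```python
import hashlib
import hmac
import struct

MASK32 = 0xFFFF_FFFF


def generate_code(target: int, key: bytes, mod1: int, mod2: int) -> int:
    """Stand-in code generating function (assumption): truncated HMAC-SHA256."""
    message = struct.pack("<III", target, mod1, mod2)
    return int.from_bytes(hmac.new(key, message, hashlib.sha256).digest()[:4], "little")


def run_example3(sp_correction: int) -> bool:
    """Trace Example 3; return True if the AUT check at 5b passes."""
    key = bytes(range(16))
    sp, lr, r2 = 0x2000_FFF0, 0x0800_1235, 0x1234_5678
    label = 0x0800_2000                        # address of the PACBTI instruction

    r12 = generate_code(lr, key, sp, label)    # 1: pacbti r12, lr, sp, pc
    # 2-4: push, body and pop leave SP and (untampered) lr as they were.

    saved_r2 = r2                              # 1b: str r2, [sp, #-4]!
    sp -= 4
    r2 = label                                 # 5a: adr r2, <label>

    # 5b: aut with corrected SP as first modifier and r2 as second modifier.
    corrected_sp = (sp + sp_correction) & MASK32
    check_passed = generate_code(lr, key, corrected_sp, r2) == r12

    r2 = saved_r2                              # 5c: ldr r2, [sp], #4
    sp += 4
    return check_passed


passes_with_correction = run_example3(sp_correction=4)
passes_without_correction = run_example3(sp_correction=0)
```

Without the correction, the first modifier seen by the AUT instruction differs by 4 from the one used by the PACBTI instruction, so the check would be expected to fail even though nothing has been tampered with.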

Hence, the ISA support for an AUT instruction which applies a correction to the stack pointer to generate the first modifier value helps improve code density relative to Example 2, since one instruction in total is eliminated (excluding the instructions for the function loop body at 3 which will be the same in both cases, the number of function call/return wrapper instructions is 9 in Example 2 and 8 in Example 3). While saving one instruction may not seem like much, the frequency of function call/return events is so high that this can amount to a significant saving in memory storage overhead, instruction fetch/decode events, and executed micro-operations across an overall software workload, and so this can help save power and improve processor performance.
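The effect of the stack pointer correction can be illustrated with a small software model. The following Python sketch is illustrative only: the truncated HMAC-SHA256 stands in for the unspecified code generating function, and the key, addresses, helper names (`auth_code`, `pacbti`, `aut`, `check_fails`) and the sign convention of the correction (chosen here so that adding it to the SP cancels the 4-byte push at instruction 1b) are all assumptions of the sketch rather than details taken from the examples above.

```python
import hashlib
import hmac

MASK32 = 0xFFFFFFFF


class AuthError(Exception):
    """Models the error response triggered when the check fails."""


def auth_code(target, key, mod1, mod2):
    # Stand-in code generating function (assumption): truncated HMAC-SHA256
    # over the target value and the two modifier values.
    msg = b"".join((v & MASK32).to_bytes(4, "little") for v in (target, mod1, mod2))
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:4], "little")


def pacbti(lr, key, sp, pc):
    # Setting instruction: first modifier = SP, second modifier = PC.
    return auth_code(lr, key, sp, pc)


def aut(ref_code, lr, key, sp, sp_correction, gpr_value):
    # Checking instruction (Figure 7 style): first modifier = corrected SP,
    # second modifier = value taken from a general purpose register.
    if auth_code(lr, key, sp + sp_correction, gpr_value) != ref_code:
        raise AuthError("authentication code mismatch")


# All values below are made up for illustration.
KEY = b"hypothetical-key"
LABEL = 0x08001000     # address of the PACBTI instruction
ENTRY_SP = 0x20008000  # stack pointer when PACBTI executes
LR = 0x08000421        # return address being protected

r12 = pacbti(LR, KEY, ENTRY_SP, LABEL)  # instruction 1
sp_at_aut = ENTRY_SP - 4                # instruction 1b has pushed r2
r2 = LABEL                              # instruction 5a: adr r2, <label>
aut(r12, LR, KEY, sp_at_aut, 4, r2)     # instruction 5b: check passes


def check_fails(**overrides):
    # Helper for experimenting: True if the check raises AuthError.
    args = dict(ref_code=r12, lr=LR, key=KEY, sp=sp_at_aut,
                sp_correction=4, gpr_value=r2)
    args.update(overrides)
    try:
        aut(**args)
    except AuthError:
        return True
    return False
```

In this model, `check_fails(sp_correction=0)` reports a failure, reflecting that without the correction the AUT instruction would have to execute at the same stack depth as the PACBTI instruction.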

Figure 8 illustrates another example of variants of the PAC[BTI] and [BX]AUT instructions that could be used, which again make use of a correction value to apply a correction to deal with the problem of limited immediate range and limited register availability as discussed above. In this example, a correction is applied in the generation of the second modifier value for both the PAC[BTI] and [BX]AUT instructions. For both PAC[BTI] and [BX]AUT instructions, the first modifier value is the current stack pointer from the SP register, without correction.

For the PAC[BTI] instruction, the second modifier value is program-counter-dependent, and is obtained by applying a correction to the program counter address in the PC register, based on a correction value specified either using an immediate value or using a general purpose register (again the general purpose register could be fixed or could be variably selected depending on a register field of the PAC[BTI] instruction).

For the [BX]AUT instruction, the second modifier value is program-counter-dependent, and is obtained by applying a correction to the current program counter address specified by the PC register for the point of program flow corresponding to the [BX]AUT instruction, based on a correction value specified as an immediate value.

Hence, in the example of Figure 8, both the PAC[BTI] instruction and the [BX]AUT instruction apply corrections to their respective values of the PC to obtain corresponding second modifier values.

As shown in program code example 4 below, this enables a further improvement in code density, enabling the number of instructions for the function call/return wrapper to return to the number shown in example 1, even when the PACBTI and BXAUT instructions are separated by too great a range to be referenced using the immediate of the BXAUT instruction alone. This works because, by designing the PACBTI and BXAUT in the ISA so that they both apply a correction to the PC, the value used as the second modifier value is an intermediate address part way through the function call, rather than being forced to be the value of the PC corresponding to the PACBTI instruction:

Program code example 4:

    1:  pacbti r12, lr, sp, pc, #<correction factor>
    2:  push {r0, lr}
    3a: [...]
    intermediate:
    3b: [...]
    4:  pop {r0, lr}
    5:  bxaut r12, lr, sp, <intermediate>

(Here, instructions 3a and 3b are the same as function body 3 in the earlier examples. There is no increase in the number of instructions in the function body, but the function body is split to indicate that the address referenced using the respective corrections applied to the PC by the PACBTI and BXAUT instructions is an intermediate address part way through the function code.)
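The range argument can be made concrete with a short model. The Python sketch below is illustrative only: the truncated HMAC-SHA256 stands in for the unspecified code generating function, and the immediate limit `IMM_MAX`, the addresses and the helper names are hypothetical values chosen for the illustration.

```python
import hashlib
import hmac

MASK32 = 0xFFFFFFFF


class AuthError(Exception):
    """Models the error response triggered when the check fails."""


def auth_code(target, key, mod1, mod2):
    # Stand-in code generating function (assumption): truncated HMAC-SHA256.
    msg = b"".join((v & MASK32).to_bytes(4, "little") for v in (target, mod1, mod2))
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:4], "little")


def pacbti(lr, key, sp, pc, correction):
    # Figure 8 style setting instruction: second modifier = PC + correction.
    return auth_code(lr, key, sp, pc + correction)


def bxaut(ref_code, lr, key, sp, pc, imm):
    # Figure 8 style checking instruction: second modifier = PC + immediate.
    if auth_code(lr, key, sp, pc + imm) != ref_code:
        raise AuthError("authentication code mismatch")
    return lr  # branch target when the check passes


# Hypothetical numbers: an immediate field able to encode offsets up to
# IMM_MAX bytes either side of the PC, and a function body of 0x3000 bytes.
IMM_MAX = 0x1FFC
KEY = b"hypothetical-key"
SP = 0x20008000
LR = 0x08000421
PAC_PC = 0x08001000
AUT_PC = PAC_PC + 0x3000
INTERMEDIATE = PAC_PC + 0x1800

pac_correction = INTERMEDIATE - PAC_PC  # +0x1800, encodable
aut_immediate = INTERMEDIATE - AUT_PC   # -0x1800, encodable
direct_offset = AUT_PC - PAC_PC         # 0x3000, not encodable on its own

r12 = pacbti(LR, KEY, SP, PAC_PC, pac_correction)
branch_target = bxaut(r12, LR, KEY, SP, AUT_PC, aut_immediate)
```

Both corrections fit within the assumed immediate range even though the direct offset between the two instructions does not, because each instruction only has to reach the shared intermediate address.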

As well as increasing the maximum range by which the PACBTI and BXAUT instructions can be separated, this approach to defining the instructions supported by the ISA also has a security benefit for use cases where the function code is compiled by just-in-time (JIT) compilation, so that the instructions defined according to the ISA are compiled on the fly based on high level program code at the time that the program is being executed, rather than being compiled statically in advance and then stored as assembly code for later execution. If JIT compilation is used, there is flexibility to compile the same high level function into different assembly code sequences for different instances of calling the same function. When the approach shown in Figure 8 is used, the use of corrections at both PACBTI and BXAUT instructions means that there is flexibility to adjust which address is referenced as the "intermediate" address from one instance of compiling the function to another (the compiler can change the <correction factor> of the PACBTI and immediate value of the BXAUT instruction to change which intermediate address is used as the second modifier). This means that, even when calling the same function at the same stack depth, different authentication code values can be generated for different instances of that function as the second modifiers are different for each instance, providing further protection against attackers attempting to reuse old authentication code values on a later function call.
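The JIT scenario can be sketched in the same way. The Python model below is illustrative only: the truncated HMAC-SHA256 stands in for the unspecified code generating function, `compile_wrapper` is a hypothetical helper representing the compiler's choice of corrections, and all addresses are made up. For simplicity the two compiled instances are placed at the same addresses so that only the choice of intermediate address differs.

```python
import hashlib
import hmac

MASK32 = 0xFFFFFFFF


def auth_code(target, key, mod1, mod2):
    # Stand-in code generating function (assumption): truncated HMAC-SHA256.
    msg = b"".join((v & MASK32).to_bytes(4, "little") for v in (target, mod1, mod2))
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:4], "little")


def compile_wrapper(pac_pc, aut_pc, intermediate):
    # Hypothetical JIT step: pick the PACBTI correction and BXAUT immediate
    # so that both instructions reference the same intermediate address.
    return intermediate - pac_pc, intermediate - aut_pc


KEY = b"hypothetical-key"
SP = 0x20008000   # same stack depth for both instances
LR = 0x08000421   # same return address for both instances
PAC_PC = 0x08001000
AUT_PC = 0x08001100

# Two compilations of the same function choose different intermediate addresses.
corr_a, imm_a = compile_wrapper(PAC_PC, AUT_PC, PAC_PC + 0x40)
corr_b, imm_b = compile_wrapper(PAC_PC, AUT_PC, PAC_PC + 0x80)

code_a = auth_code(LR, KEY, SP, PAC_PC + corr_a)  # PACBTI in instance A
code_b = auth_code(LR, KEY, SP, PAC_PC + corr_b)  # PACBTI in instance B

# Each instance's BXAUT regenerates the code from its own corrected PC.
check_a = auth_code(LR, KEY, SP, AUT_PC + imm_a)
check_b = auth_code(LR, KEY, SP, AUT_PC + imm_b)

# An old code captured from instance A does not satisfy instance B's check.
replay_accepted = (code_a == check_b)
```

With a short authentication code a chance collision between `code_a` and `code_b` remains possible in principle, so the model illustrates a reduction in the likelihood of successful reuse rather than a guarantee.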

Figure 9 shows another example, in which the [BX]AUT instruction is the same as in Figure 8, but rather than the PAC[BTI] instruction applying a correction to the PC to generate the second modifier value, the second modifier value for the PAC[BTI] instruction is specified in a general purpose register. As shown in example 5 below, while this may require a few additional instructions compared to example 4, it can still reduce the total number of instructions compared to example 2 while supporting increased address range between the PACBTI and BXAUT instructions:

Program code example 5:

    1a:  bti
    1b': adr r12, <correction factor>
    1c:  pac r12, lr, sp, r12
    2:   push {r0, lr}
    3a:  [...]
    intermediate:
    3b:  [...]
    4:   pop {r0, lr}
    5:   bxaut r12, lr, sp, <intermediate>

In this example, instructions 2-5 are the same as in Example 4, but the PACBTI instruction 1 in Example 4 is replaced with separate BTI and PAC instructions 1a and 1c as in Example 2, with an intervening address generation instruction ADR 1b'. The ADR instruction applies a correction factor to the PC to generate a value corresponding to the intermediate address referenced by the BXAUT instruction, and places that value in general purpose register r12 for use as the second modifier value for the PAC instruction. While this option may slightly reduce code density relative to example 4, it can simplify hardware implementation by simplifying the PAC operation, and may help support other use cases where the ability to support an arbitrary modifier value using a general purpose register could be helpful.

Hence, in summary, various examples are discussed above, with different options for defining the first and second modifier values for the PAC[BTI] and [BX]AUT instructions as follows:

| | 1st modifier for PAC[BTI] | 2nd modifier for PAC[BTI] | 1st modifier for [BX]AUT | 2nd modifier for [BX]AUT |
|---|---|---|---|---|
| Example 1 | SP | PC | SP | PC, corrected based on immediate |
| Figure 7, Example 3 | SP | PC | SP, corrected based on immediate | General purpose register |
| Figure 8, Example 4 | SP | PC, corrected based on immediate or general purpose register | SP | PC, corrected based on immediate |
| Figure 9, Example 5 | SP | General purpose register | SP | PC, corrected based on immediate |

Concepts described herein may be embodied in computer-readable code for fabrication of an apparatus that embodies the described concepts. For example, the computer-readable code can be used at one or more stages of a semiconductor design and fabrication process, including an electronic design automation (EDA) stage, to fabricate an integrated circuit comprising the apparatus embodying the concepts. The above computer-readable code may additionally or alternatively enable the definition, modelling, simulation, verification and/or testing of an apparatus embodying the concepts described herein.

For example, the computer-readable code for fabrication of an apparatus embodying the concepts described herein can be embodied in code defining a hardware description language (HDL) representation of the concepts. For example, the code may define a register-transfer-level (RTL) abstraction of one or more logic circuits for defining an apparatus embodying the concepts.

The code may define a HDL representation of the one or more logic circuits embodying the apparatus in Verilog, SystemVerilog, Chisel, or VHDL (Very High-Speed Integrated Circuit Hardware Description Language) as well as intermediate representations such as FIRRTL. Computer-readable code may provide definitions embodying the concept using system-level modelling languages such as SystemC and SystemVerilog or other behavioural representations of the concepts that can be interpreted by a computer to enable simulation, functional and/or formal verification, and testing of the concepts.

Additionally or alternatively, the computer-readable code may define a low-level description of integrated circuit components that embody concepts described herein, such as one or more netlists or integrated circuit layout definitions, including representations such as GDSII.

The one or more netlists or other computer-readable representation of integrated circuit components may be generated by applying one or more logic synthesis processes to an RTL representation to generate definitions for use in fabrication of an apparatus embodying the invention. Alternatively or additionally, the one or more logic synthesis processes can generate from the computer-readable code a bitstream to be loaded into a field programmable gate array (FPGA) to configure the FPGA to embody the described concepts. The FPGA may be deployed for the purposes of verification and test of the concepts prior to fabrication in an integrated circuit or the FPGA may be deployed in a product directly.

The computer-readable code may comprise a mix of code representations for fabrication of an apparatus, for example including a mix of one or more of an RTL representation, a netlist representation, or another computer-readable definition to be used in a semiconductor design and fabrication process to fabricate an apparatus embodying the invention. Alternatively or additionally, the concept may be defined in a combination of a computer-readable definition to be used in a semiconductor design and fabrication process to fabricate an apparatus and computer-readable code defining instructions which are to be executed by the defined apparatus once fabricated.

Such computer-readable code can be disposed in any known transitory computer-readable medium (such as wired or wireless transmission of code over a network) or non-transitory computer-readable medium such as semiconductor, magnetic disk, or optical disc. An integrated circuit fabricated using the computer-readable code may comprise components such as one or more of a central processing unit, graphics processing unit, neural processing unit, digital signal processor or other components that individually or collectively embody the concept.

Figure 10 illustrates a simulator implementation that may be used. Whilst the earlier described embodiments implement the present invention in terms of apparatus and methods for operating specific processing hardware supporting the techniques concerned, it is also possible to provide an instruction execution environment in accordance with the embodiments described herein which is implemented through the use of a computer program. Such computer programs are often referred to as simulators, insofar as they provide a software based implementation of a hardware architecture. Varieties of simulator computer programs include emulators, virtual machines, models, and binary translators, including dynamic binary translators. Typically, a simulator implementation may run on a host processor 730, optionally running a host operating system 720, supporting the simulator program 710. In some arrangements, there may be multiple layers of simulation between the hardware and the provided instruction execution environment, and/or multiple distinct instruction execution environments provided on the same host processor. Historically, powerful processors have been required to provide simulator implementations which execute at a reasonable speed, but such an approach may be justified in certain circumstances, such as when there is a desire to run code native to another processor for compatibility or re-use reasons. For example, the simulator implementation may provide an instruction execution environment with additional functionality which is not supported by the host processor hardware, or provide an instruction execution environment typically associated with a different hardware architecture. An overview of simulation is given in "Some Efficient Architecture Simulation Techniques", Robert Bedichek, Winter 1990 USENIX Conference, pages 53-63.

To the extent that embodiments have previously been described with reference to particular hardware constructs or features, in a simulated embodiment, equivalent functionality may be provided by suitable software constructs or features. For example, particular circuitry may be implemented in a simulated embodiment as computer program logic. Similarly, memory hardware, such as a register or cache, may be implemented in a simulated embodiment as a software data structure. In arrangements where one or more of the hardware elements referenced in the previously described embodiments are present on the host hardware (for example, host processor 730), some simulated embodiments may make use of the host hardware, where suitable.

The simulator program 710 may be stored on a computer-readable storage medium (which may be a non-transitory medium), and provides a program interface (instruction execution environment) to the target code 700 (which may include applications, operating systems and a hypervisor) which is the same as the interface of the hardware architecture being modelled by the simulator program 710. Thus, the program instructions of the target code 700 described above, including the authentication code generating class of instructions described earlier, may be executed from within the instruction execution environment using the simulator program 710, so that a host computer 730 which does not actually have the hardware features of the apparatus 2 discussed above can emulate these features.

The simulator program 710 may include instruction decoding program logic 712 for decoding instructions of the target code 700 to map them to corresponding sequences of instructions defined according to the native instruction set architecture supported by the host hardware 730. Hence, the instruction decoding program logic 712 may include logic for generating sequences of host instructions corresponding to the PAC/AUT instructions described earlier. Register simulating program logic 714 of the simulator program 710 may simulate access to registers by the target code 700, e.g. by maintaining a data structure in registers or memory of the host hardware 730 which emulates the set of architectural registers 14 expected to be provided in the ISA associated with the target code 700. For example, register references for operands and results of the PAC/AUT instructions mentioned earlier can be remapped by the register simulating program logic 714 to access requests which reference particular locations in host storage provided by host hardware 730.
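As a rough illustration of how such a simulator might be organised, the Python sketch below models the instruction decoding logic as a dispatch on the mnemonic and the register simulating logic as a dictionary held in host memory. It is illustrative only: the class and method names, the tuple-based instruction format, the fixed 4-byte instruction size and the truncated HMAC-SHA256 used as the code generating function are all assumptions of the sketch, and the program is executed in list order rather than being fetched via the simulated PC.

```python
import hashlib
import hmac

MASK32 = 0xFFFFFFFF


class AuthError(Exception):
    """Models the error response triggered when the check fails."""


def auth_code(target, key, mod1, mod2):
    # Stand-in code generating function (assumption): truncated HMAC-SHA256.
    msg = b"".join((v & MASK32).to_bytes(4, "little") for v in (target, mod1, mod2))
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:4], "little")


class Simulator:
    INSN_SIZE = 4  # assumed fixed instruction size

    def __init__(self, key, sp):
        self.key = key
        # Register simulating logic: architectural registers kept in a host dict.
        self.regs = {"r12": 0, "lr": 0, "sp": sp, "pc": 0}

    def run(self, program):
        for op, *operands in program:
            # Instruction decoding logic: map the mnemonic to a host routine.
            getattr(self, "op_" + op)(*operands)
            self.regs["pc"] += self.INSN_SIZE
        return self

    def op_mov(self, rd, value):
        self.regs[rd] = value

    def op_pac(self, rd, rt):
        # Setting instruction: modifiers are the simulated SP and PC.
        r = self.regs
        r[rd] = auth_code(r[rt], self.key, r["sp"], r["pc"])

    def op_aut(self, rs, rt, imm):
        # Checking instruction: second modifier is the simulated PC plus imm.
        r = self.regs
        if auth_code(r[rt], self.key, r["sp"], r["pc"] + imm) != r[rs]:
            raise AuthError("authentication code mismatch")


KEY = b"hypothetical-key"
SP = 0x20008000

# PAC executes at pc=4 and AUT at pc=8, so the immediate is -4.
GOOD = [("mov", "lr", 0x08000421), ("pac", "r12", "lr"), ("aut", "r12", "lr", -4)]
# Here the return address is modified between PAC (pc=4) and AUT (pc=12).
TAMPERED = [("mov", "lr", 0x08000421), ("pac", "r12", "lr"),
            ("mov", "lr", 0x08000999), ("aut", "r12", "lr", -8)]

final = Simulator(KEY, SP).run(GOOD)
```

Running `Simulator(KEY, SP).run(TAMPERED)` raises `AuthError` in this model, which corresponds to the simulated error response for a mismatching authentication code.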

In the present application, the words "configured to..." are used to mean that an element of an apparatus has a configuration able to carry out the defined operation. In this context, a "configuration" means an arrangement or manner of interconnection of hardware or software. For example, the apparatus may have dedicated hardware which provides the defined operation, or a processor or other processing device may be programmed to perform the function. "Configured to" does not imply that the apparatus element needs to be changed in any way in order to provide the defined operation.

In the present application, lists of features preceded with the phrase "at least one of" mean that any one or more of those features can be provided either individually or in combination. For example, "at least one of: [A], [B] and [C]" encompasses any of the following options: A alone (without B or C), B alone (without A or C), C alone (without A or B), A and B in combination (without C), A and C in combination (without B), B and C in combination (without A), or A, B and C in combination.

Although illustrative embodiments of the invention have been described in detail herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications can be effected therein by one skilled in the art without departing from the scope of the invention as defined by the appended claims.

Claims

1. An apparatus comprising: instruction decoding circuitry to decode instructions; and processing circuitry to perform a processing operation in response to a decoded instruction decoded by the instruction decoding circuitry; in which: in response to an authentication code generating class of instruction, the processing circuitry is configured to generate an authentication code associated with an authentication target value by applying a code generating function to the authentication target value, a key, a first modifier value and a second modifier value; the authentication code generating class of instruction including: an authentication code setting instruction, for which the processing circuitry is configured to write the generated authentication code to a destination register separate from a register providing the authentication target value for the authentication code setting instruction; and an authentication code checking instruction, for which the processing circuitry is configured to check whether the authentication code generated based on the authentication target value in response to the authentication code checking instruction corresponds to a reference authentication code obtained from a source register separate from a register providing the authentication target value for the authentication code checking instruction, and trigger an error response when the generated authentication code does not correspond to the reference authentication code; wherein: the first modifier value comprises a value dependent on a stack pointer obtained from a stack pointer register; and for at least one of the authentication code setting instruction and the authentication code checking instruction, the second modifier value comprises a value dependent on a program counter address obtained from a program counter register.
2. The apparatus according to claim 1, in which the authentication code checking instruction specifies an immediate value, and one of the first modifier value and the second modifier value for the authentication code setting instruction depends on the immediate value applied as an adjustment to the stack pointer or the program counter address.
3. The apparatus according to any of claims 1 and 2, in which one of the authentication code setting instruction and the authentication code checking instruction specifies a correction value for applying a correction to one of the stack pointer and the program counter address to generate a corresponding one of the first modifier and the second modifier.
4. The apparatus according to claim 3, in which the correction based on the correction value enables the authentication code setting instruction and the authentication code checking instruction to be separated by an instruction address offset greater than a maximum instruction address offset representable using an immediate value of the authentication code checking instruction.
5. The apparatus according to any of claims 3 and 4, in which the authentication code checking instruction specifies the correction value as a correction to be applied to the stack pointer to generate the first modifier value for the authentication code checking instruction.
6. The apparatus according to any of claims 3 to 5, in which the authentication code setting instruction specifies the correction value as a correction to be applied to the program counter address to generate the second modifier value for the authentication code setting instruction.
7. The apparatus according to claim 6, in which the authentication code checking instruction also specifies an immediate value as a correction to be applied to the program counter address to generate the second modifier value for the authentication code checking instruction.
8. The apparatus according to any of claims 3 to 7, in which the correction value is specified as an immediate value by said one of the authentication code setting instruction and the authentication code checking instruction.
9. The apparatus according to any of claims 3 to 7, in which the correction value is specified as a value obtained from a general purpose register.
10. The apparatus according to claim 9, in which said one of the authentication code setting instruction and the authentication code checking instruction specifies a register field identifying the general purpose register providing the correction value.
11. The apparatus according to any of claims 3 to 10, in which the correction comprises an arithmetic correction to add or subtract the correction value from said one of the stack pointer and the program counter address.
12. The apparatus according to any of claims 3 to 10, in which the correction comprises a logical correction to generate a corrected value based on a logical combination of the correction value and said one of the stack pointer and the program counter address.
13. The apparatus according to any preceding claim, in which: in response to a non-sequential change of program flow to a branch target address being triggered when a branch target check is enabled, the processing circuitry is configured to trigger an error handling response when the instruction at the branch target address is not a member of a permitted class of branch target instructions; and the authentication code setting instruction is one of said permitted class of branch target instructions.
14. The apparatus according to any preceding claim, in which: in response to the authentication code checking instruction when the generated authentication code is detected as corresponding to the reference authentication code, the processing circuitry is configured to trigger a branch to an address specified by an operand of the authentication code checking instruction.
15. The apparatus according to any preceding claim, in which: for at least one variant of the authentication code setting instruction, the first modifier comprises the stack pointer and the second modifier comprises the program counter address; and for at least one variant of the authentication code checking instruction, the first modifier comprises the stack pointer and the second modifier comprises a result of correcting the program counter address based on an immediate value specified by the authentication code checking instruction.
16. The apparatus according to any preceding claim, in which: for at least one variant of the authentication code setting instruction, the first modifier comprises the stack pointer and the second modifier comprises the program counter address; and for at least one variant of the authentication code checking instruction, the first modifier comprises a result of correcting the stack pointer based on a correction value specified by the authentication code checking instruction, and the second modifier comprises a value obtained from a general purpose register.
17. The apparatus according to any preceding claim, in which: for at least one variant of the authentication code setting instruction, the first modifier comprises the stack pointer and the second modifier comprises a result of correcting the program counter address based on a correction value specified by the authentication code setting instruction; and for at least one variant of the authentication code checking instruction, the first modifier comprises the stack pointer and the second modifier comprises a result of correcting the program counter address based on an immediate value specified by the authentication code checking instruction.
18. The apparatus according to claim 17, in which the correction value is specified as an immediate value by the authentication code setting instruction.
19. The apparatus according to claim 17, in which the correction value for the authentication code setting instruction is specified in a general purpose register.
20. The apparatus according to any preceding claim, in which: for at least one variant of the authentication code setting instruction, the first modifier comprises the stack pointer and the second modifier comprises a value specified in a general purpose register; and for at least one variant of the authentication code checking instruction, the first modifier comprises the stack pointer and the second modifier comprises a result of correcting the program counter address based on an immediate value specified by the authentication code checking instruction.
21. Computer-readable code for fabrication of the apparatus of any preceding claim.
22. A method comprising: performing a processing operation in response to a decoded instruction; in which: in response to the decoded instruction being an authentication code generating class of instruction, the processing operation comprises generating an authentication code associated with an authentication target value by applying a code generating function to the authentication target value, a key, a first modifier value and a second modifier value; in response to the decoded instruction being an authentication code setting instruction of said authentication code generating class of instruction, the processing operation comprises writing the generated authentication code to a destination register separate from a register providing the authentication target value for the authentication code setting instruction; and in response to the decoded instruction being an authentication code checking instruction of said authentication code generating class of instruction, the processing operation comprises checking whether the authentication code generated based on the authentication target value in response to the authentication code checking instruction corresponds to a reference authentication code obtained from a source register separate from a register providing the authentication target value for the authentication code checking instruction, and triggering an error response when the generated authentication code does not correspond to the reference authentication code; wherein: the first modifier value comprises a value dependent on a stack pointer obtained from a stack pointer register; and for at least one of the authentication code setting instruction and the authentication code checking instruction, the second modifier value comprises a value dependent on a program counter address obtained from a program counter register.
23. A computer program for controlling a host data processing apparatus to provide an instruction execution environment for execution of target program code, the computer program comprising: instruction decoding program logic to decode an instruction of the target program code and control the host data processing apparatus to perform a processing operation in response to the decoded instruction; in which: in response to an authentication code generating class of instruction, the instruction decoding program logic is configured to control the host data processing apparatus to generate an authentication code associated with an authentication target value by applying a code generating function to the authentication target value, a key, a first modifier value and a second modifier value; the authentication code generating class of instruction including: an authentication code setting instruction, for which the instruction decoding program logic is configured to control the host data processing apparatus to write the generated authentication code to a simulated destination register separate from a simulated register providing the authentication target value for the authentication code setting instruction; and an authentication code checking instruction, for which the instruction decoding program logic is configured to control the host data processing apparatus to check whether the authentication code generated based on the authentication target value in response to the authentication code checking instruction corresponds to a reference authentication code obtained from a simulated source register separate from a simulated register providing the authentication target value for the authentication code checking instruction, and trigger an error response when the generated authentication code does not correspond to the reference authentication code; wherein: the first modifier value comprises a value dependent on a stack pointer obtained from a simulated stack pointer register; and for at least one of the authentication code setting instruction and the authentication code checking instruction, the second modifier value comprises a value dependent on a program counter address obtained from a simulated program counter register.
24. A storage medium storing the computer-readable code of claim 22 or the computer program of claim 23.