From 45387a8faac9f7e52050d3a549ef5f1e23de5b9e Mon Sep 17 00:00:00 2001
From: Pascal Gouedo
Date: Mon, 27 Nov 2023 17:32:55 +0100
Subject: [PATCH 1/4] Adjusted FPU DIV/SQRT latency to new T-Head unit.

Signed-off-by: Pascal Gouedo
---
 docs/source/pipeline.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/pipeline.rst b/docs/source/pipeline.rst
index 516daa2e4..971a0a9e1 100644
--- a/docs/source/pipeline.rst
+++ b/docs/source/pipeline.rst
@@ -150,7 +150,7 @@ The cycle counts assume zero stall on the instruction-side interface and zero st
  | Comparison, Conversion | | If there are enough instructions between FPU one and |
  | or Classify | | the instruction using the result then cycle number is 1. |
  +------------------------+--------------------------------------+ "Enough instruction" number is either FPU_ADDMUL_LAT, |
- | Single Precision | 1..12 | FPU_OTHERS_LAT or 11. |
+ | Single Precision | 1..19 | FPU_OTHERS_LAT or 11. |
  | Floating-Point | | If there are no instruction in between then cycle number is |
  | Division and | | the maximum value for each category. |
  | Square-Root | | |

From b838f1648b7e1e0a8554d6c0ef03b46bbae33dbc Mon Sep 17 00:00:00 2001
From: Pascal Gouedo
Date: Mon, 27 Nov 2023 17:33:29 +0100
Subject: [PATCH 2/4] Added paragraphs and notes about hardware loops impact on application, exceptions handlers and debugger.

Signed-off-by: Pascal Gouedo
---
 docs/source/corev_hw_loop.rst | 43 +++++++++++++++++++++++++++++++++++
 docs/source/debug.rst | 10 ++++++++
 2 files changed, 53 insertions(+)

diff --git a/docs/source/corev_hw_loop.rst b/docs/source/corev_hw_loop.rst
index 7b83ec09f..7170baf50 100644
--- a/docs/source/corev_hw_loop.rst
+++ b/docs/source/corev_hw_loop.rst
@@ -138,3 +138,46 @@ it is executed 10x10 times. Whereas the outermost loop, from startO to (endO - 4
 executes 10 times the innermost loop and adds 2 to the register %[j].
 At the end of the loop, the register %[i] contains 300 and the register %[j] contains 20.

+.. _hwloop-exceptions_handlers:
+
+Hardware loops impact on application, exceptions handlers and debugger
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Application and ebreak/ecall exception handlers
+-----------------------------------------------
+
+When an ebreak or an ecall instruction is used in an application, special care should be given for those instruction handlers in case they are placed as the last instruction of an HWLoop.
+Those handlers should manage MEPC and lpcountX CSRs updates because an hw loop early-exit could happen if not done.
+
+At the end of the handlers after restoring the context/CSRs, a piece of smart code should be added (by order of piority):
+
+1. if MEPC is equal to "lpend0 - 4", then MPEC should be set to lpstart0 and lpcount0 should be decremented by 1 if strictly higher than 0,
+2. if MEPC is equal to "lpend1 - 4", then MPEC should be set to lpstart1 and lpcount1 should be decremented by 1 if strictly higher than 0,
+3. if (lpstart0 <= MEPC < lpend0 - 4) or (lpstart1 <= MEPC < lpend1 - 4), then MPEC should be incremented by 4,
+4. if instruction at MEPC location is either ecall or ebreak, MPEC should be incremented by 4,
+5. if instruction at MEPC location location is c.ebreak, MPEC should be incremented by 2.
+
+The 2 last cases are the standard ones when ebreak/ecall are not inside an HWLopp.
+
+Interrupt handlers
+------------------
+
+When an interrupt is happening on the last HWLoop instruction, its execution is cancelled, its address is saved in MEPC and its execution will be resumed when returning from interrupt handler.
+There is nothing special to be done in those interrupt handlers with respect to MEPC and lpcountX updates, they will be correctly managed by design when executing this last HWLoop instruction after interrupt handler execution.
+
+Illegal instruction exception handler
+-------------------------------------
+
+Depending on whether an application is going to resume after the illegal instruction exception handler, the same MEPC/HWLoop CSR management as for ebreak/ecall could be necessary.
+
+Debugger
+--------
+
+If ebreak is used to enter Debug Mode (:ref:`ebreak_scenario_2`) and is placed at the last instruction location of an HWLoop (not very likely to happen), the same management as above should be done, but on DPC rather than on MEPC.
+
+When an ebreak instruction is used as a Software Breakpoint by a debugger in Debug Mode and is placed at the last instruction location of an HWLoop in instruction memory, no special management is foreseen.
+When executing the Software Breakpoint/ebreak instruction, control is given back to the debugger, which will manage the different cases.
+For instance, in the Single-Step case, the original instruction is put back in instruction memory, a Single-Step command is executed on this last instruction (with the design updating PC and lpcountX to the correct values) and the Software Breakpoint/ebreak is put back in memory by the debugger.
+
+When an ecall instruction is used by a debugger to execute System Calls and is placed at the last instruction location of an HWLoop in instruction memory, the debugger ecall handler in the debug ROM should do the same as described above for the application case.
+
diff --git a/docs/source/debug.rst b/docs/source/debug.rst
index 654183912..b18da9699 100644
--- a/docs/source/debug.rst
+++ b/docs/source/debug.rst
@@ -162,6 +162,8 @@ The EBREAK instruction description is distributed across several RISC-V specific
 `RISC-V Priveleged Specification `_, `RISC-V ISA `_.
 The following is a summary of the behavior for three common scenarios.

+.. _ebreak_scenario_1:
+
 Scenario 1 : Enter Exception
 """"""""""""""""""""""""""""

@@ -173,10 +175,14 @@ Executing the EBREAK instruction when the core is **not** in Debug Mode and the
 To properly return from the exception, the ebreak handler will need to increment the MEPC to the next instruction.
 This requires querying the size of the ebreak instruction that was used to enter the exception (16 bit c.ebreak or 32 bit ebreak).

+As mentioned in :ref:`hwloop-exceptions_handlers`, some additional cases exist for the MEPC update when ebreak is the last instruction of a Hardware Loop.
+
 .. note::

   The CV32E40P does not support MTVAL CSR register which would have saved the value of the instruction for exceptions. This may be supported on a future core.

+.. _ebreak_scenario_2:
+
 Scenario 2 : Enter Debug Mode
 """""""""""""""""""""""""""""

@@ -187,11 +193,15 @@ Executing the EBREAK instruction when the core is **not** in Debug Mode and the

 Similar to the exception scenario above, the debugger will need to increment the DPC to the next instruction before returning from Debug Mode.

+There is no foreseen situation where it would be necessary to enter Debug Mode only on the last instruction of a Hardware Loop, but just in case, this is covered in :ref:`hwloop-exceptions_handlers` as well.
+
 .. note::

   The default value of DCSR.EBREAKM is 0 and the DCSR is only accessible in Debug Mode. To enter Debug Mode from EBREAK, the user will first need to enter Debug Mode through some other means, such as from the external ``debug_req_i``, and set DCSR.EBREAKM.

+.. _ebreak_scenario_3:
+
 Scenario 3 : Exit Program Buffer & Restart Debug Code
 """""""""""""""""""""""""""""""""""""""""""""""""""""

From ad56cba59a69bc3327dac2d9aa3aca019a85da62 Mon Sep 17 00:00:00 2001
From: Pascal Gouedo
Date: Tue, 28 Nov 2023 12:07:33 +0100
Subject: [PATCH 3/4] Clearer description of handlers management of MEPC/lpcountX.

Signed-off-by: Pascal Gouedo
---
 docs/source/corev_hw_loop.rst | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/source/corev_hw_loop.rst b/docs/source/corev_hw_loop.rst
index 7170baf50..81b6ad9c2 100644
--- a/docs/source/corev_hw_loop.rst
+++ b/docs/source/corev_hw_loop.rst
@@ -151,11 +151,11 @@ Those handlers should manage MEPC and lpcountX CSRs updates because an hw loop e

 At the end of the handlers after restoring the context/CSRs, a piece of smart code should be added (by order of piority):

-1. if MEPC is equal to "lpend0 - 4", then MPEC should be set to lpstart0 and lpcount0 should be decremented by 1 if strictly higher than 0,
-2. if MEPC is equal to "lpend1 - 4", then MPEC should be set to lpstart1 and lpcount1 should be decremented by 1 if strictly higher than 0,
-3. if (lpstart0 <= MEPC < lpend0 - 4) or (lpstart1 <= MEPC < lpend1 - 4), then MPEC should be incremented by 4,
-4. if instruction at MEPC location is either ecall or ebreak, MPEC should be incremented by 4,
-5. if instruction at MEPC location location is c.ebreak, MPEC should be incremented by 2.
+1. if MEPC = "lpend0 - 4" and lpcount0 >= 2 then MPEC should be set to lpstart0; if MEPC = "lpend0 - 4" and lpcount0 >= 1 then it should be decremented by 1.
+2. if MEPC = "lpend1 - 4" and lpcount1 >= 2 then MPEC should be set to lpstart1; if MEPC = "lpend1 - 4" and lpcount1 >= 1 then it should be decremented by 1.
+3. if (lpstart0 <= MEPC < lpend0 - 4) or (lpstart1 <= MEPC < lpend1 - 4) then MPEC should be incremented by 4.
+4. if instruction at MEPC location is either ecall or ebreak then MPEC should be incremented by 4.
+5. if instruction at MEPC location location is c.ebreak then MPEC should be incremented by 2.

 The 2 last cases are the standard ones when ebreak/ecall are not inside an HWLopp.

From a9d341fedc8b4a6b55b9a2a1e2a831161412ac9f Mon Sep 17 00:00:00 2001
From: Pascal Gouedo
Date: Tue, 28 Nov 2023 15:05:20 +0100
Subject: [PATCH 4/4] Even clearer description of handlers.

Signed-off-by: Pascal Gouedo
---
 docs/source/corev_hw_loop.rst | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/docs/source/corev_hw_loop.rst b/docs/source/corev_hw_loop.rst
index 81b6ad9c2..6d0cbf64d 100644
--- a/docs/source/corev_hw_loop.rst
+++ b/docs/source/corev_hw_loop.rst
@@ -90,12 +90,9 @@ is that it greatly simplifies compiler optimization (relative to basic blocks ma
 In order to use hardware loops, the compiler needs to setup the loops beforehand with cv.start/i, cv.end/i, cv.count/i or cv.setup/i instructions.
 The compiler will use HWLoop automatically whenever possible without the need of assembly.

-For debugging and context switches, the hardware loop registers are mapped into the CSR custom read-only address space.
+For debugging, interrupts and context switches, the hardware loop registers are mapped into the CSR custom read-only address space.
 To read them csrr instructions should be used and to write them register flavour of hardware loop instructions should be used.
 Using csrw instructions to write hardware loop registers will generate an illegal instruction exception.
-
-Since hardware loop feature could be used in interrupt routine/handler, the registers have
-to be saved (resp. restored) at the beginning (resp. end) of the interrupt routine together with the general purpose registers.
 The CSR HWLoop registers are described in the :ref:`cs-registers` section.

 Below an assembly code example of a nested HWLoop that computes a matrix addition.
@@ -149,13 +146,15 @@ Application and ebreak/ecall exception handlers
 When an ebreak or an ecall instruction is used in an application, special care should be given for those instruction handlers in case they are placed as the last instruction of an HWLoop.
 Those handlers should manage MEPC and lpcountX CSRs updates because an hw loop early-exit could happen if not done.

-At the end of the handlers after restoring the context/CSRs, a piece of smart code should be added (by order of piority):
+At the end of the handlers, after restoring the context/CSRs, a piece of smart code should be added with the following order of priority, from highest to lowest:

-1. if MEPC = "lpend0 - 4" and lpcount0 >= 2 then MPEC should be set to lpstart0; if MEPC = "lpend0 - 4" and lpcount0 >= 1 then it should be decremented by 1.
-2. if MEPC = "lpend1 - 4" and lpcount1 >= 2 then MPEC should be set to lpstart1; if MEPC = "lpend1 - 4" and lpcount1 >= 1 then it should be decremented by 1.
-3. if (lpstart0 <= MEPC < lpend0 - 4) or (lpstart1 <= MEPC < lpend1 - 4) then MPEC should be incremented by 4.
-4. if instruction at MEPC location is either ecall or ebreak then MPEC should be incremented by 4.
-5. if instruction at MEPC location location is c.ebreak then MPEC should be incremented by 2.
+1. if MEPC = lpend0 - 4 and lpcount0 > 1 then MEPC should be set to lpstart0 and lpcount0 should be decremented by 1,
+2. else if MEPC = lpend0 - 4 and lpcount0 = 1 then MEPC should be incremented by 4 and lpcount0 should be decremented by 1,
+3. else if MEPC = lpend1 - 4 and lpcount1 > 1 then MEPC should be set to lpstart1 and lpcount1 should be decremented by 1,
+4. else if MEPC = lpend1 - 4 and lpcount1 = 1 then MEPC should be incremented by 4 and lpcount1 should be decremented by 1,
+5. else if (lpstart0 <= MEPC < lpend0 - 4) or (lpstart1 <= MEPC < lpend1 - 4) then MEPC should be incremented by 4,
+6. else if instruction at MEPC location is either ecall or ebreak then MEPC should be incremented by 4,
+7. else if instruction at MEPC location is c.ebreak then MEPC should be incremented by 2.

 The 2 last cases are the standard ones when ebreak/ecall are not inside an HWLopp.

@@ -165,6 +164,8 @@ Interrupt handlers
 When an interrupt is happening on the last HWLoop instruction, its execution is cancelled, its address is saved in MEPC and its execution will be resumed when returning from interrupt handler.
 There is nothing special to be done in those interrupt handlers with respect to MEPC and lpcountX updates, they will be correctly managed by design when executing this last HWLoop instruction after interrupt handler execution.

+Moreover, since hardware loops could be used in an interrupt routine, the hardware loop registers have to be saved (resp. restored) at the beginning (resp. end) of the interrupt routine together with the general purpose registers.
+
 Illegal instruction exception handler
 -------------------------------------
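
Review note: the seven-rule priority list introduced in patch 4/4 can be modelled as a small executable sketch. This is only a Python model of the decision logic, not handler code for the core. The names `HwLoop` and `fixup_mepc` are hypothetical, and reading the CSRs and the instruction word at MEPC is assumed to happen elsewhere. The instruction encodings are the standard RISC-V ones.

```python
from dataclasses import dataclass

# Standard RISC-V encodings.
ECALL = 0x00000073
EBREAK = 0x00100073
C_EBREAK = 0x9002  # 16-bit compressed ebreak


@dataclass
class HwLoop:
    """Hypothetical snapshot of one lpstartX/lpendX/lpcountX CSR triple."""
    start: int
    end: int
    count: int


def fixup_mepc(mepc: int, loops: list, insn: int) -> int:
    """Return the MEPC to restore before mret, following the rule order.

    `loops` is [loop0, loop1]; loop counters are updated in place.
    `insn` is the raw instruction word found at MEPC (assumed input).
    """
    # Rules 1-4: the trapping instruction is the last one of a loop body.
    for lp in loops:  # loop 0 has priority over loop 1
        if mepc == lp.end - 4 and lp.count > 1:
            lp.count -= 1
            return lp.start          # jump back for the next iteration
        if mepc == lp.end - 4 and lp.count == 1:
            lp.count -= 1
            return mepc + 4          # last iteration: fall out of the loop
    # Rule 5: inside a loop body but not on its last instruction.
    if any(lp.start <= mepc < lp.end - 4 for lp in loops):
        return mepc + 4
    # Rules 6-7: standard cases outside any hardware loop.
    if insn in (ECALL, EBREAK):
        return mepc + 4
    if insn & 0xFFFF == C_EBREAK:
        return mepc + 2
    return mepc                      # no rule matched: leave MEPC unchanged
```

For example, with loop 0 spanning 0x100 to 0x120 and three iterations left, an ebreak at 0x11C yields a restored MEPC of 0x100 and leaves lpcount0 at 2. The same trap with one iteration left yields 0x120 and lpcount0 at 0.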