Automatic PR dev->master #891

Merged: 2 commits, Oct 10, 2023
12 changes: 6 additions & 6 deletions docs/source/instruction_set_extensions.rst
@@ -1349,9 +1349,9 @@ SIMD ALU operations
 +------------------------------------------------------------+------------------------------------------------------------------+
 | **Mnemonic** | **Description** |
 +============================================================+==================================================================+
-| **cv.add[.sc,.sci]{.h,.b} rD, rs1, [rs2, Imm6]** | rD[i] = (rs1[i] + op2[i]) & 0xFFFF |
+| **cv.add[.sc,.sci]{.h,.b} rD, rs1, [rs2, Imm6]** | rD[i] = (rs1[i] + op2[i]) & {0xFFFF, 0xFF} |
 +------------------------------------------------------------+------------------------------------------------------------------+
-| **cv.sub[.sc,.sci]{.h,.b} rD, rs1, [rs2, Imm6]** | rD[i] = (rs1[i] - op2[i]) & 0xFFFF |
+| **cv.sub[.sc,.sci]{.h,.b} rD, rs1, [rs2, Imm6]** | rD[i] = (rs1[i] - op2[i]) & {0xFFFF, 0xFF} |
 +------------------------------------------------------------+------------------------------------------------------------------+
 | **cv.avg[.sc,.sci]{.h,.b} rD, rs1, [rs2, Imm6]** | rD[i] = ((rs1[i] + op2[i]) & {0xFFFF, 0xFF}) >> 1 |
 | | |
@@ -2146,11 +2146,11 @@ No carry, overflow is generated. Instructions are rounded up as the mask & 0xFFF
 | | Note: Arithmetic shift right. |
 +---------------------------------------+---------------------------------------------------------------------------------------+

-SIMD Complex-numbers Encoding
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+SIMD Complex-number Encoding
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-.. table:: SIMD ALU encoding
-:name: SIMD ALU encoding
+.. table:: SIMD Complex-number encoding
+:name: SIMD Complex-number encoding
 :widths: 11 4 4 9 7 8 8 13 36
 :class: no-scrollbar-table
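
The corrected ``cv.add`` / ``cv.sub`` descriptions in the first hunk of this file say that each lane result is masked to the lane width: 0xFFFF for ``.h`` and 0xFF for ``.b``. A minimal Python sketch of that lane-wise behaviour follows. It is an illustration under stated assumptions, not the RTL: the helper names are hypothetical, registers are taken to be 32 bits wide with lane 0 in the least significant bits, and ``op2`` is assumed to be already expanded to a full vector (the ``.sc`` / ``.sci`` replication is not modelled).

```python
def simd_lanes(word, width):
    """Split a 32-bit word into lanes of `width` bits (lane 0 = least significant)."""
    mask = (1 << width) - 1
    return [(word >> (i * width)) & mask for i in range(32 // width)]


def simd_pack(lanes, width):
    """Pack lanes back into one 32-bit word, masking each lane to `width` bits."""
    mask = (1 << width) - 1
    word = 0
    for i, lane in enumerate(lanes):
        word |= (lane & mask) << (i * width)
    return word


def simd_add(rs1, op2, width):
    """rD[i] = (rs1[i] + op2[i]) & mask, with mask 0xFFFF (width=16) or 0xFF (width=8)."""
    mask = (1 << width) - 1
    pairs = zip(simd_lanes(rs1, width), simd_lanes(op2, width))
    return simd_pack([(a + b) & mask for a, b in pairs], width)


def simd_sub(rs1, op2, width):
    """rD[i] = (rs1[i] - op2[i]) & mask; the mask wraps negative lane results."""
    mask = (1 << width) - 1
    pairs = zip(simd_lanes(rs1, width), simd_lanes(op2, width))
    return simd_pack([(a - b) & mask for a, b in pairs], width)
```

With ``width=16`` a carry out of one halfword lane is discarded rather than propagated into the next lane, and with ``width=8`` the same holds per byte, which is the distinction the ``{0xFFFF, 0xFF}`` notation records.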

19 changes: 15 additions & 4 deletions docs/source/pipeline.rst
@@ -29,15 +29,25 @@ Pipeline Details
 CV32E40P has a 4-stage in-order completion pipeline, the 4 stages are:

 Instruction Fetch (IF)
-Fetches instructions from memory via an aligning prefetch buffer, capable of fetching 1 instruction per cycle if the instruction side memory system allows. This prefetech buffer is able to store 2 32-b data. The IF stage also pre-decodes RVC instructions into RV32I base instructions. See :ref:`instruction-fetch` for details.
+Fetches instructions from memory via an aligning prefetch buffer, capable of fetching 1 instruction per cycle if the instruction side memory system allows. This prefetch buffer is able to store 2 32-b data.
+The IF stage also pre-decodes RVC instructions into RV32I base instructions. See :ref:`instruction-fetch` for details.

 Instruction Decode (ID)
 Decodes fetched instruction and performs required register file reads. Jumps are taken from the ID stage.

 Execute (EX)
-Executes the instructions. The EX stage contains the ALU, Multiplier and Divider. Branches (with their condition met) are taken from the EX stage. Multi-cycle instructions will stall this stage until they are complete. The ALU, Multiplier and Divider instructions write back their result to the register file from the EX stage. The address generation part of the load-store-unit (LSU) is contained in EX as well.
+Executes the instructions. The EX stage contains the ALU, Multiplier and Divider. Branches (with their condition met) are taken from the EX stage. Multi-cycle instructions will stall this stage until they are complete.
+The ALU, Multiplier and Divider instructions write back their result to the register file from the EX stage. The address generation part of the load-store-unit (LSU) is contained in EX as well.

-The FPU writes back its result from EX stage as well when FPU_*_LAT is either 0 cycle or more than 1 cycle. It is reusing register file ALU/Mult/Div write port and it has the highest priority so it will stall EX stage if there is a conflict (when FPU_*_LAT > 1).
+The FPU writes back its result at EX stage as well through this ALU/Mult/Div register file write port when FPU_*_LAT is either 0 cycle or greater than 1 cycle.
+When FPU_*_LAT > 1, FPU write-back has the highest priority so it will stall EX stage if there is a conflict. There are few exceptions to this FPU priority over ALU/Mult/Div.
+
+They are:
+
+* There is a multi-cycle MULH in EX.
+* There is a Misaligned LOAD/STORE in EX.
+* There is a Post-Increment LOAD/STORE in EX.
+In those 3 exceptions, EX will not be stalled, FPU result (and flags) are memorized and will be written back in the register file (and FPU CSR) as soon as there is no conflict anymore.

 Writeback (WB)
 Writes the result of Load instructions back to the register file.
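
The FPU write-back priority rule added in this hunk (the FPU_*_LAT > 1 case) can be read as a small decision table. The Python sketch below is one possible behavioural reading of that text, not the RTL. The names ``ExKind`` and ``arbitrate_write_port`` and the returned strings are hypothetical, and ``ex_needs_port`` is an assumed abstraction of "there is a conflict" on the shared ALU/Mult/Div register file write port.

```python
from enum import Enum, auto


class ExKind(Enum):
    """Hypothetical classification of the instruction currently in EX."""
    OTHER = auto()                      # ordinary ALU/Mult/Div instruction
    MULTICYCLE_MULH = auto()
    MISALIGNED_LOAD_STORE = auto()
    POST_INCREMENT_LOAD_STORE = auto()


# The three cases the text lists as exceptions to FPU priority.
NO_STALL_KINDS = {
    ExKind.MULTICYCLE_MULH,
    ExKind.MISALIGNED_LOAD_STORE,
    ExKind.POST_INCREMENT_LOAD_STORE,
}


def arbitrate_write_port(fpu_result_valid, ex_needs_port, ex_kind=ExKind.OTHER):
    """Decide who uses the shared write port this cycle (FPU_*_LAT > 1 assumed)."""
    if not fpu_result_valid:
        # No FPU result pending: EX behaves as usual.
        return "ex_writes" if ex_needs_port else "idle"
    if not ex_needs_port:
        # No conflict: the FPU result is written back directly.
        return "fpu_writes"
    if ex_kind in NO_STALL_KINDS:
        # Exception cases: EX is not stalled, the FPU result (and flags)
        # are held and written back once the conflict is gone.
        return "ex_writes_fpu_result_held"
    # Default rule: FPU write-back has the highest priority and EX stalls.
    return "fpu_writes_ex_stalled"
```

For example, ``arbitrate_write_port(True, True, ExKind.MULTICYCLE_MULH)`` selects the held-result path, while the same conflict with ``ExKind.OTHER`` stalls EX.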
@@ -68,7 +78,8 @@ Those cycles penalty can be hidden if the compiler is able to add instructions b
 Single- and Multi-Cycle Instructions
 ------------------------------------

-:numref:`Cycle counts per instruction type` shows the cycle count per instruction type. Some instructions have a variable time, this is indicated as a range e.g. 1..32 means that the instruction takes a minimum of 1 cycle and a maximum of 32 cycles. The cycle counts assume zero stall on the instruction-side interface and zero stall on the data-side memory interface.
+:numref:`Cycle counts per instruction type` shows the cycle count per instruction type. Some instructions have a variable time, this is indicated as a range e.g. 1..32 means that the instruction takes a minimum of 1 cycle and a maximum of 32 cycles.
+The cycle counts assume zero stall on the instruction-side interface and zero stall on the data-side memory interface.

 .. _instructions_latency_table:
 .. table:: Cycle counts per instruction type
1 change: 1 addition & 0 deletions rtl/cv32e40p_ex_stage.sv
@@ -413,6 +413,7 @@ module cv32e40p_ex_stage
 assign apu_read_dep_for_jalr_o = 1'b0;
 assign apu_write_dep_o = 1'b0;
 assign fpu_fflags_o = '0;
+assign fpu_fflags_we_o = '0;
 end
 endgenerate

2 changes: 2 additions & 0 deletions scripts/lec/synopsys_formality/check_lec.tcl
@@ -15,6 +15,8 @@ set_dont_verify_point -type port i:WORK/cv32e40p_core/apu_flags_o*

 verify > ./reports/verify.rpt

+report_aborted_points > ./reports/aborted_points.rpt
+report_failing_points > ./reports/failing_points.rpt
 analyze_points -failing > ./reports/analyze.rpt

 exit