Interface Reference
This reference documents the hardware and software interfaces shared by all Dyber PQC IP cores. Every core follows the same register map layout, command protocol, and programming model, so a single driver framework can operate any core in the portfolio.
Overview
All Dyber IP cores present a unified programming interface regardless of the underlying algorithm or bus wrapper. The interface is organized into four regions: a control region for command dispatch and configuration, a status region for completion and error reporting, an info region for core identification, version, and capabilities, and a data region for input/output payloads.
Address Map (relative to base address)
0x000–0x03F Control Region — Command, config, parameters
0x040–0x07F Status Region — Status, interrupt, error
0x080–0x0FF Info Region — Core ID, version, capabilities
0x100–0x1FF Reserved
0x200–0xFFF Data Region — Input/output buffers (core-specific size)
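As a rough illustration of the address map, a driver could classify an offset by region before accessing it. The type and function names below are hypothetical and not part of the Dyber driver; only the region boundaries are taken from the map above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the boundaries come from the address map. */
typedef enum {
    REGION_CONTROL,   /* 0x000-0x03F */
    REGION_STATUS,    /* 0x040-0x07F */
    REGION_INFO,      /* 0x080-0x0FF */
    REGION_RESERVED,  /* 0x100-0x1FF */
    REGION_DATA,      /* 0x200-0xFFF */
    REGION_INVALID    /* outside the 4 KiB window */
} region_t;

static region_t region_of(uint32_t offset)
{
    if (offset < 0x040u)  return REGION_CONTROL;
    if (offset < 0x080u)  return REGION_STATUS;
    if (offset < 0x100u)  return REGION_INFO;
    if (offset < 0x200u)  return REGION_RESERVED;
    if (offset < 0x1000u) return REGION_DATA;
    return REGION_INVALID;
}

/* Byte offset of the word_index-th 32-bit word, counted from the start
 * of the data region. The caller must stay inside the data region. */
static uint32_t data_word_offset(uint32_t word_index)
{
    uint32_t offset = 0x200u + 4u * word_index;
    assert(region_of(offset) == REGION_DATA);
    return offset;
}
```

A check like this is mainly useful in debug builds, where it catches accesses that stray into the reserved range.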
Common Register Map
The following registers are present in every Dyber IP core at the same offsets. Core-specific extension registers (algorithm parameters, DMA configuration) begin at offset 0x100, the range shown as reserved in the common address map.
| Offset | Name | R/W | Description |
|---|---|---|---|
| 0x000 | CTRL | W | Command register. Write operation code to initiate. Bit[7:0] = operation, Bit[15:8] = parameter set. |
| 0x004 | CONFIG | R/W | Core configuration. Interrupt enable, DMA mode, clock gating policy. |
| 0x008 | PARAM | R/W | Algorithm parameter selection. Security level, NTT configuration, masking enable. |
| 0x00C | INPUT_LEN | R/W | Input data length in bytes. Set before writing data to the input buffer. |
| 0x040 | STATUS | R | Core status. Bit[0] = busy, Bit[1] = done, Bit[2] = error, Bit[3] = ready. |
| 0x044 | IRQ_STATUS | R/W1C | Interrupt status (write-1-to-clear). Bit[0] = operation complete, Bit[1] = error, Bit[2] = fault detect. |
| 0x048 | IRQ_ENABLE | R/W | Interrupt enable mask. Same bit mapping as IRQ_STATUS. |
| 0x04C | ERROR | R | Error code. Non-zero when STATUS.error is set. See Error Codes. |
| 0x050 | OUTPUT_LEN | R | Output data length in bytes. Valid after operation completes. |
| 0x080 | CORE_ID | R | Core identifier. Uniquely identifies the IP core type (e.g., 0x4D4B454D = "MKEM"). |
| 0x084 | VERSION | R | IP core version. Bit[31:16] = major, Bit[15:8] = minor, Bit[7:0] = patch. |
| 0x088 | CAPS | R | Capability flags. Bit[0] = DMA support, Bit[1] = masking present, Bit[2] = FI-detect present. |
| 0x08C | BUILD | R | Build timestamp and configuration hash for tracking specific RTL builds. |
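The info-region fields can be unpacked with a few shifts and masks. The sketch below uses hypothetical helper names and assumes that every CORE_ID packs four ASCII characters with the most significant byte first, which is inferred from the single "MKEM" example and may not hold for every core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t major;  /* VERSION[31:16] */
    uint8_t  minor;  /* VERSION[15:8]  */
    uint8_t  patch;  /* VERSION[7:0]   */
} core_version_t;

static core_version_t decode_version(uint32_t reg)
{
    core_version_t v;
    v.major = (uint16_t)(reg >> 16);
    v.minor = (uint8_t)((reg >> 8) & 0xFFu);
    v.patch = (uint8_t)(reg & 0xFFu);
    return v;
}

/* Assumption: CORE_ID holds four ASCII characters, most significant
 * byte first (0x4D4B454D -> "MKEM"). out must hold at least 5 bytes. */
static void decode_core_id(uint32_t reg, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (char)((reg >> (24 - 8 * i)) & 0xFFu);
    out[4] = '\0';
}

/* CAPS bit positions from the register table. */
#define CAPS_DMA       (1u << 0)
#define CAPS_MASKING   (1u << 1)
#define CAPS_FI_DETECT (1u << 2)
```

A driver would typically read CORE_ID and CAPS once at initialization and, for example, fall back to register I/O when `CAPS_DMA` is clear.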
Command Protocol
The command protocol follows a write-trigger model. The host CPU configures the operation, loads input data, and writes the command register to start execution.
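// Example: ML-KEM-768 encapsulation (register I/O mode)
// read()/write() are 32-bit register accessors; DATA_IN and DATA_OUT are the
// offsets of the input and output buffers in the data region.
// 1. Verify the core is idle
while (read(STATUS) & 0x1) { }              // Wait for busy=0
// 2. Select the parameter set (ML-KEM-768)
write(PARAM, 0x03);                         // Security level 3
// 3. Load the public key into the input buffer
write(INPUT_LEN, 1184);                     // ML-KEM-768 public key size in bytes
for (i = 0; i < 1184; i += 4)
    write(DATA_IN + i, pk_word[i / 4]);
// 4. Trigger encapsulation
write(CTRL, 0x02);                          // OP_ENCAPS = 0x02
// 5. Wait for completion (poll or interrupt); also stop if the core reports an error
while (!(read(STATUS) & 0x6)) { }           // Wait for done=1 or error=1
if (read(STATUS) & 0x4) {
    err = read(ERROR);                      // Operation failed; see Error Codes
} else {
    // 6. Read ciphertext + shared secret
    ct_len = read(OUTPUT_LEN);              // 1088 + 32 = 1120 bytes
    for (i = 0; i < ct_len; i += 4)
        output[i / 4] = read(DATA_OUT + i);
}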
Operation codes (CTRL register bits [7:0]):
| Code | Operation | Applicable Cores |
|---|---|---|
| 0x01 | KeyGen | MLKEM, MLDSA, SLH |
| 0x02 | Encapsulate | MLKEM, HKEM |
| 0x03 | Decapsulate | MLKEM, HKEM |
| 0x04 | Sign | MLDSA, SLH |
| 0x05 | Verify | MLDSA, SLH, SBOOT |
| 0x06 | Hash | SHA3-HASH |
| 0x07 | XOF Squeeze | SHAKE-XOF |
| 0x10 | NTT Forward | NTT-R2/R4/R8/R16/R32 |
| 0x11 | NTT Inverse | NTT-R2/R4/R8/R16/R32 |
| 0x20 | Self-Test (KAT) | All cores |
| 0x21 | Zeroize | All cores with key buffer, KMU |
| 0xFF | Soft Reset | All cores |
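The operation codes combine with the parameter set to form the value written to CTRL. The following sketch shows one way to build that word. Apart from `OP_ENCAPS`, which appears in the example above, the constant and function names are hypothetical, and leaving bits [31:16] at zero is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Operation codes from the table; names other than OP_ENCAPS are
 * illustrative only. */
enum {
    OP_KEYGEN      = 0x01,
    OP_ENCAPS      = 0x02,
    OP_DECAPS      = 0x03,
    OP_SIGN        = 0x04,
    OP_VERIFY      = 0x05,
    OP_HASH        = 0x06,
    OP_XOF_SQUEEZE = 0x07,
    OP_NTT_FWD     = 0x10,
    OP_NTT_INV     = 0x11,
    OP_SELF_TEST   = 0x20,
    OP_ZEROIZE     = 0x21,
    OP_SOFT_RESET  = 0xFF
};

/* CTRL layout: [7:0] operation, [15:8] parameter set.
 * Assumption: bits [31:16] are written as zero. */
static uint32_t ctrl_word(uint32_t op, uint32_t param_set)
{
    assert(op <= 0xFFu);
    assert(param_set <= 0xFFu);
    return op | (param_set << 8);
}

/* Extract the fields again, e.g. for logging. */
static uint32_t ctrl_op(uint32_t word)        { return word & 0xFFu; }
static uint32_t ctrl_param_set(uint32_t word) { return (word >> 8) & 0xFFu; }
```

With these helpers, `ctrl_word(OP_ENCAPS, 0)` yields the same `0x02` written in the encapsulation example.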
Status & Interrupts
Two completion notification methods are supported:
Polling: Read the STATUS register until the done bit is set. For deterministic-latency operations (all except ML-DSA Sign), the exact cycle count is known in advance, so the host can perform other work and check at the expected completion time.
Interrupt-driven: Enable the completion interrupt via IRQ_ENABLE, then wait for the hardware interrupt. The ISR reads IRQ_STATUS to determine the event type and clears the interrupt by writing 1 to the corresponding bit. Interrupt coalescing (raise interrupt after N completions) is available for DMA-mode operation.
Interrupt signals: Each IP core drives a single active-high level interrupt output (irq_o) that is asserted when any enabled interrupt condition is active. The interrupt remains asserted until the host clears all pending conditions in IRQ_STATUS.
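The interrupt-driven flow can be sketched as a handler that reads IRQ_STATUS and acknowledges the bits it saw. The `reg_io_t` hooks and the mock device are hypothetical scaffolding so the handler can run on a host. On real hardware the hooks would wrap memory-mapped accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REG_IRQ_STATUS 0x044u
#define IRQ_DONE  (1u << 0)   /* operation complete */
#define IRQ_ERROR (1u << 1)   /* error */
#define IRQ_FAULT (1u << 2)   /* fault detect */

/* Hypothetical register-access hooks (not the Dyber HAL). */
typedef struct {
    uint32_t (*read)(void *dev, uint32_t offset);
    void     (*write)(void *dev, uint32_t offset, uint32_t value);
    void     *dev;
} reg_io_t;

/* Returns the events that were pending and acknowledges exactly those. */
static uint32_t irq_handler(const reg_io_t *io)
{
    assert(io != NULL);
    uint32_t pending = io->read(io->dev, REG_IRQ_STATUS);
    /* Write-1-to-clear: write back only the observed bits so an event
     * raised after the read stays pending and keeps irq_o asserted. */
    io->write(io->dev, REG_IRQ_STATUS, pending);
    return pending;
}

/* Minimal software model of a W1C register, for host-side testing only. */
typedef struct { uint32_t irq_status; } mock_core_t;

static uint32_t mock_read(void *dev, uint32_t offset)
{
    mock_core_t *core = dev;
    return offset == REG_IRQ_STATUS ? core->irq_status : 0u;
}

static void mock_write(void *dev, uint32_t offset, uint32_t value)
{
    mock_core_t *core = dev;
    if (offset == REG_IRQ_STATUS)
        core->irq_status &= ~value;  /* 1 bits clear, 0 bits are untouched */
}
```

Because irq_o is level-sensitive, clearing only the observed bits means a late event re-enters the handler instead of being lost.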
Data Transfer Modes
| Mode | Interface | Best For |
|---|---|---|
| Register I/O | AXI4-Lite / APB | Low-throughput, simple integration. CPU writes/reads data word-by-word through the data region. |
| DMA Burst | AXI4 Full | High-throughput. System DMA controller transfers data between system memory and IP data buffers. |
| Scatter-Gather | AXI4 Full + descriptors | Batch operations. IP autonomously processes a chain of operations from a descriptor ring. |
| Streaming | AXI4-Stream | Pipeline integration. Data flows through IP continuously without store-and-forward. |
AXI4-Lite Interface
The AXI4-Lite interface is the primary control interface and is present on every IP core. It provides register-level access to all control, status, and data registers.
| Signal | Direction | Width | Description |
|---|---|---|---|
| s_axi_aclk | Input | 1 | AXI clock |
| s_axi_aresetn | Input | 1 | Active-low reset |
| s_axi_awaddr | Input | 12+ | Write address |
| s_axi_awvalid | Input | 1 | Write address valid |
| s_axi_awready | Output | 1 | Write address ready |
| s_axi_wdata | Input | 32 | Write data |
| s_axi_wstrb | Input | 4 | Write byte enables |
| s_axi_wvalid | Input | 1 | Write data valid |
| s_axi_wready | Output | 1 | Write data ready |
| s_axi_bresp | Output | 2 | Write response (OKAY/SLVERR) |
| s_axi_bvalid | Output | 1 | Write response valid |
| s_axi_bready | Input | 1 | Write response ready |
| s_axi_araddr | Input | 12+ | Read address |
| s_axi_arvalid | Input | 1 | Read address valid |
| s_axi_arready | Output | 1 | Read address ready |
| s_axi_rdata | Output | 32 | Read data |
| s_axi_rresp | Output | 2 | Read response (OKAY/SLVERR) |
| s_axi_rvalid | Output | 1 | Read data valid |
| s_axi_rready | Input | 1 | Read data ready |
| irq_o | Output | 1 | Active-high level interrupt |
AXI4-Stream Interface
The AXI4-Stream interface is available on algorithm accelerators and hash cores for high-throughput streaming operation. It uses separate input and output stream ports.
| Signal | Direction | Width | Description |
|---|---|---|---|
| s_axis_tdata | Input | 64/128/256 | Input data (configurable width) |
| s_axis_tkeep | Input | N/8 | Byte qualifiers (N = TDATA width in bits) |
| s_axis_tlast | Input | 1 | End of packet/message |
| s_axis_tvalid | Input | 1 | Input valid |
| s_axis_tready | Output | 1 | Input ready (backpressure) |
| m_axis_tdata | Output | 64/128/256 | Output data |
| m_axis_tkeep | Output | N/8 | Byte qualifiers (N = TDATA width in bits) |
| m_axis_tlast | Output | 1 | End of result |
| m_axis_tvalid | Output | 1 | Output valid |
| m_axis_tready | Input | 1 | Output ready (backpressure) |
The AXI4-Stream data width is configurable at synthesis time (64, 128, or 256 bits). Wider data paths increase throughput at the cost of additional routing resources.
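To show how the data width affects a transfer, the sketch below computes the number of beats and the TKEEP value on the final beat for a message of a given length. The names are hypothetical. It assumes valid bytes occupy the low-order lanes of the last beat, which is common AXI4-Stream practice but is not stated in this reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t   beats;       /* number of transfers; TLAST on the final one */
    uint32_t last_tkeep;  /* TKEEP on the final beat, one bit per byte */
} stream_plan_t;

/* width_bits must be 64, 128, or 256; len_bytes must be non-zero. */
static stream_plan_t plan_stream(unsigned width_bits, size_t len_bytes)
{
    assert(width_bits == 64 || width_bits == 128 || width_bits == 256);
    assert(len_bytes > 0);

    size_t bytes_per_beat = width_bits / 8;
    size_t tail = len_bytes % bytes_per_beat;  /* valid bytes in last beat */
    if (tail == 0)
        tail = bytes_per_beat;                 /* last beat is full */

    stream_plan_t plan;
    plan.beats = (len_bytes + bytes_per_beat - 1) / bytes_per_beat;
    /* tail is 1..32; build the mask in 64 bits so a shift by 32 is defined */
    plan.last_tkeep = (uint32_t)((UINT64_C(1) << tail) - 1);
    return plan;
}
```

For example, a 1184-byte ML-KEM-768 public key on a 128-bit stream takes 74 full beats, while the same key on a 256-bit stream takes 37.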
APB Interface
The AMBA APB interface targets low-power and area-constrained designs. It provides the same register map as AXI4-Lite with a reduced signal count and a simpler protocol (no burst, no pipelining), and suits IoT and microcontroller applications using the NTT-R2 or ML-KEM-512 configurations.
DMA Descriptor Format
For scatter-gather DMA operation, each descriptor describes one cryptographic operation:
Descriptor (32 bytes):
  Offset 0x00  [31:0]   Input buffer physical address (low 32 bits)
  Offset 0x04  [31:0]   Input buffer physical address (high 32 bits)
  Offset 0x08  [31:0]   Output buffer physical address (low 32 bits)
  Offset 0x0C  [31:0]   Output buffer physical address (high 32 bits)
  Offset 0x10  [15:0]   Input length (bytes)
               [23:16]  Operation code (same as CTRL register)
               [31:24]  Parameter set
  Offset 0x14  [31:0]   Next descriptor address (low 32 bits)
  Offset 0x18  [31:0]   Next descriptor address (high 32 bits)
  Offset 0x1C  [0]      Interrupt on completion
               [1]      Last descriptor in chain
               [31:2]   Reserved
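The descriptor layout maps naturally onto a struct of eight 32-bit words. The following is a minimal sketch with hypothetical field and function names. It assumes a little-endian host whose in-memory word layout matches what the core expects, and it leaves out cache maintenance and address translation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field names are hypothetical; the offsets follow the 32-byte format. */
typedef struct {
    uint32_t in_addr_lo;    /* 0x00 */
    uint32_t in_addr_hi;    /* 0x04 */
    uint32_t out_addr_lo;   /* 0x08 */
    uint32_t out_addr_hi;   /* 0x0C */
    uint32_t len_op_param;  /* 0x10: [15:0] len, [23:16] op, [31:24] param */
    uint32_t next_lo;       /* 0x14 */
    uint32_t next_hi;       /* 0x18 */
    uint32_t flags;         /* 0x1C: [0] IRQ on completion, [1] last */
} dma_desc_t;

_Static_assert(sizeof(dma_desc_t) == 32, "descriptor must be 32 bytes");

#define DESC_FLAG_IRQ  (1u << 0)
#define DESC_FLAG_LAST (1u << 1)

static void desc_fill(dma_desc_t *d, uint64_t in, uint64_t out,
                      uint32_t in_len, uint8_t op, uint8_t param_set,
                      uint64_t next, uint32_t flags)
{
    assert(d != NULL);
    assert(in_len <= 0xFFFFu);                           /* 16-bit length field */
    assert((flags & ~(DESC_FLAG_IRQ | DESC_FLAG_LAST)) == 0);  /* [31:2] reserved */

    d->in_addr_lo   = (uint32_t)in;
    d->in_addr_hi   = (uint32_t)(in >> 32);
    d->out_addr_lo  = (uint32_t)out;
    d->out_addr_hi  = (uint32_t)(out >> 32);
    d->len_op_param = in_len | ((uint32_t)op << 16) | ((uint32_t)param_set << 24);
    d->next_lo      = (uint32_t)next;
    d->next_hi      = (uint32_t)(next >> 32);
    d->flags        = flags;
}
```

Note that the 16-bit length field limits a single descriptor to 65535 input bytes.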
Error Codes
| Code | Name | Description |
|---|---|---|
| 0x00 | SUCCESS | Operation completed successfully |
| 0x01 | ERR_INVALID_OP | Unsupported operation code for this core |
| 0x02 | ERR_INVALID_PARAM | Invalid parameter set or security level |
| 0x03 | ERR_INPUT_LEN | Input data length does not match expected size |
| 0x04 | ERR_BUSY | Command issued while core is processing |
| 0x05 | ERR_VERIFY_FAIL | Signature verification failed (valid result, not an error) |
| 0x06 | ERR_SELF_TEST | Power-on self-test (KAT) failure |
| 0x07 | ERR_FAULT_DETECT | Fault injection detected; core locked |
| 0x08 | ERR_KEY_SLOT | Invalid key slot or key not present (KMU) |
| 0x09 | ERR_PERMISSION | Operation not permitted for this key slot (KMU) |
| 0x0A | ERR_ENTROPY | Entropy source health test failure (QRNG) |
| 0x0B | ERR_DMA | DMA transfer error (invalid address, bus error) |
| 0xFF | ERR_INTERNAL | Internal error; contact Dyber support |
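A driver will usually translate the ERROR register into something readable and separate genuine failures from a rejected signature. The sketch below is one possible approach with hypothetical function and type names. Whether STATUS.error is also set for ERR_VERIFY_FAIL is not stated here, so the code checks the error code first.

```c
#include <assert.h>
#include <stdint.h>

#define STATUS_DONE  (1u << 1)
#define STATUS_ERROR (1u << 2)

#define ERR_CODE_SUCCESS     0x00u
#define ERR_CODE_VERIFY_FAIL 0x05u

/* Maps an ERROR register value to the name used in the table. */
static const char *err_name(uint32_t code)
{
    switch (code) {
    case 0x00: return "SUCCESS";
    case 0x01: return "ERR_INVALID_OP";
    case 0x02: return "ERR_INVALID_PARAM";
    case 0x03: return "ERR_INPUT_LEN";
    case 0x04: return "ERR_BUSY";
    case 0x05: return "ERR_VERIFY_FAIL";
    case 0x06: return "ERR_SELF_TEST";
    case 0x07: return "ERR_FAULT_DETECT";
    case 0x08: return "ERR_KEY_SLOT";
    case 0x09: return "ERR_PERMISSION";
    case 0x0A: return "ERR_ENTROPY";
    case 0x0B: return "ERR_DMA";
    case 0xFF: return "ERR_INTERNAL";
    default:   return "UNKNOWN";  /* code not listed in the table */
    }
}

typedef enum { RESULT_OK, RESULT_VERIFY_REJECTED, RESULT_FAILED } result_t;

/* Call only after polling has observed done or error in STATUS. */
static result_t classify_completion(uint32_t status, uint32_t error)
{
    assert((status & (STATUS_DONE | STATUS_ERROR)) != 0);
    if (error == ERR_CODE_VERIFY_FAIL)
        return RESULT_VERIFY_REJECTED;  /* valid result: signature is bad */
    if ((status & STATUS_ERROR) != 0 || error != ERR_CODE_SUCCESS)
        return RESULT_FAILED;
    return RESULT_OK;
}
```

Keeping the rejected-signature case separate lets callers treat it as an ordinary negative answer rather than a hardware fault.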
Software Driver API
The reference C driver provides a high-level API that abstracts register-level operations:
/* Core initialization */
dyber_status_t dyber_init(dyber_ctx_t *ctx, uintptr_t base_addr);
dyber_status_t dyber_self_test(dyber_ctx_t *ctx);
/* ML-KEM operations */
dyber_status_t dyber_mlkem_keygen(dyber_ctx_t *ctx, uint8_t level,
    uint8_t *pk, uint8_t *sk);
dyber_status_t dyber_mlkem_encaps(dyber_ctx_t *ctx, uint8_t level,
    const uint8_t *pk,
    uint8_t *ct, uint8_t *ss);
dyber_status_t dyber_mlkem_decaps(dyber_ctx_t *ctx, uint8_t level,
    const uint8_t *sk, const uint8_t *ct,
    uint8_t *ss);
/* ML-DSA operations */
dyber_status_t dyber_mldsa_keygen(dyber_ctx_t *ctx, uint8_t level,
    uint8_t *pk, uint8_t *sk);
dyber_status_t dyber_mldsa_sign(dyber_ctx_t *ctx, uint8_t level,
    const uint8_t *sk, const uint8_t *msg,
    size_t msg_len, uint8_t *sig, size_t *sig_len);
dyber_status_t dyber_mldsa_verify(dyber_ctx_t *ctx, uint8_t level,
    const uint8_t *pk, const uint8_t *msg,
    size_t msg_len, const uint8_t *sig,
    size_t sig_len);
/* Key management (DYBER-KMU) */
dyber_status_t dyber_kmu_generate(dyber_ctx_t *ctx, uint16_t slot,
    dyber_key_type_t type, uint8_t level);
dyber_status_t dyber_kmu_zeroize(dyber_ctx_t *ctx, uint16_t slot);
dyber_status_t dyber_kmu_zeroize_all(dyber_ctx_t *ctx);
/* Cleanup */
void dyber_cleanup(dyber_ctx_t *ctx);
The driver API is architecture-independent: the same function signatures compile on x86-64, ARM64, RISC-V, or any platform with a C compiler and memory-mapped I/O. Platform-specific register access macros are isolated in a thin HAL (Hardware Abstraction Layer) that is provided for Linux, bare-metal, and FreeRTOS environments.
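To give a sense of what such a thin HAL layer might look like, here is a bare-metal style sketch of 32-bit register accessors. The function names are hypothetical and this is not the shipped Dyber HAL. A real port may also need memory barriers or, on Linux, a mapping step before the base address is usable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accessors: registers are 32 bits wide and word-aligned.
 * volatile keeps the compiler from caching or reordering the access
 * relative to other volatile accesses. */
static inline uint32_t hal_read32(uintptr_t base, uint32_t offset)
{
    assert((offset & 3u) == 0);
    return *(volatile uint32_t *)(base + offset);
}

static inline void hal_write32(uintptr_t base, uint32_t offset, uint32_t value)
{
    assert((offset & 3u) == 0);
    *(volatile uint32_t *)(base + offset) = value;
}
```

Because the accessors take the base address as a plain integer, the same code can be exercised on a host by pointing it at an ordinary array instead of device memory.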