Interface Reference

IP-IF-001 v2.0

This reference documents the hardware and software interfaces shared by all Dyber PQC IP cores. Every core follows a consistent register map layout, command protocol, and programming model — enabling a single driver framework to operate any core in the portfolio.

Overview

All Dyber IP cores present a unified programming interface regardless of the underlying algorithm or bus wrapper. The interface is organized into four regions: a control region for command dispatch and configuration, a status region for completion and error reporting, an info region for core identification, version, and capabilities, and a data region for input/output payloads.

Address Map (relative to base address)
0x000–0x03F   Control Region    — Command, config, parameters
0x040–0x07F   Status Region     — Status, interrupt, error
0x080–0x0FF   Info Region       — Core ID, version, capabilities
0x100–0x1FF   Reserved
0x200–0xFFF   Data Region       — Input/output buffers (core-specific size)
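
The address map above can be captured in a small lookup helper. The following is a minimal sketch, not part of the Dyber driver: the type `dyber_region_t` and the function `dyber_region_of` are hypothetical names chosen for illustration, and only the address ranges come from the map.

```c
#include <stdint.h>

/* Regions of the common address map, as offsets from the core base address.
 * The enum and function names are illustrative, not from the Dyber driver. */
typedef enum {
    DYBER_REGION_CONTROL,   /* 0x000-0x03F */
    DYBER_REGION_STATUS,    /* 0x040-0x07F */
    DYBER_REGION_INFO,      /* 0x080-0x0FF */
    DYBER_REGION_RESERVED,  /* 0x100-0x1FF */
    DYBER_REGION_DATA,      /* 0x200-0xFFF */
    DYBER_REGION_INVALID    /* outside the 4 KiB window */
} dyber_region_t;

/* Classify a byte offset (relative to the base address) into a region. */
static inline dyber_region_t dyber_region_of(uint32_t offset)
{
    if (offset <= 0x03Fu) return DYBER_REGION_CONTROL;
    if (offset <= 0x07Fu) return DYBER_REGION_STATUS;
    if (offset <= 0x0FFu) return DYBER_REGION_INFO;
    if (offset <= 0x1FFu) return DYBER_REGION_RESERVED;
    if (offset <= 0xFFFu) return DYBER_REGION_DATA;
    return DYBER_REGION_INVALID;
}
```

A helper like this is mostly useful in debug builds, for example to reject accidental accesses to the reserved range before they reach the bus.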

Common Register Map

The following registers are present in every Dyber IP core at consistent offsets. Core-specific extension registers (algorithm parameters, DMA configuration) begin at offset 0x100, in the range that the common address map lists as Reserved.

Offset  Name        R/W    Description
0x000   CTRL        W      Command register. Write operation code to initiate. Bit[7:0] = operation, Bit[15:8] = parameter set.
0x004   CONFIG      R/W    Core configuration. Interrupt enable, DMA mode, clock gating policy.
0x008   PARAM       R/W    Algorithm parameter selection. Security level, NTT configuration, masking enable.
0x00C   INPUT_LEN   R/W    Input data length in bytes. Set before writing data to the input buffer.
0x040   STATUS      R      Core status. Bit[0] = busy, Bit[1] = done, Bit[2] = error, Bit[3] = ready.
0x044   IRQ_STATUS  R/W1C  Interrupt status (write-1-to-clear). Bit[0] = operation complete, Bit[1] = error, Bit[2] = fault detect.
0x048   IRQ_ENABLE  R/W    Interrupt enable mask. Same bit mapping as IRQ_STATUS.
0x04C   ERROR       R      Error code. Non-zero when STATUS.error is set. See Error Codes.
0x050   OUTPUT_LEN  R      Output data length in bytes. Valid after operation completes.
0x080   CORE_ID     R      Core identifier. Uniquely identifies the IP core type (e.g., 0x4D4B454D = "MKEM").
0x084   VERSION     R      IP core version. Bit[31:16] = major, Bit[15:8] = minor, Bit[7:0] = patch.
0x088   CAPS        R      Capability flags. Bit[0] = DMA support, Bit[1] = masking present, Bit[2] = FI-detect present.
0x08C   BUILD       R      Build timestamp and configuration hash for tracking specific RTL builds.
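
To illustrate the bit layouts in the Info Region, the sketch below decodes raw VERSION, CORE_ID, and CAPS values. The struct and function names are hypothetical and not taken from the reference driver; the bit positions follow the table above, and the CORE_ID decoding assumes the four ASCII characters are stored most significant byte first, as in the "MKEM" example.

```c
#include <stdint.h>

/* Decoded VERSION register: Bit[31:16] major, Bit[15:8] minor, Bit[7:0] patch. */
typedef struct {
    uint16_t major;
    uint8_t  minor;
    uint8_t  patch;
} dyber_version_t;

static inline dyber_version_t dyber_decode_version(uint32_t reg)
{
    dyber_version_t v;
    v.major = (uint16_t)(reg >> 16);
    v.minor = (uint8_t)((reg >> 8) & 0xFFu);
    v.patch = (uint8_t)(reg & 0xFFu);
    return v;
}

/* Render CORE_ID as a 4-character string, most significant byte first.
 * 'out' must have room for 5 bytes (4 characters plus the terminator). */
static inline void dyber_core_id_str(uint32_t id, char out[5])
{
    out[0] = (char)((id >> 24) & 0xFFu);
    out[1] = (char)((id >> 16) & 0xFFu);
    out[2] = (char)((id >> 8) & 0xFFu);
    out[3] = (char)(id & 0xFFu);
    out[4] = '\0';
}

/* CAPS flags: Bit[0] DMA support, Bit[1] masking, Bit[2] FI-detect. */
static inline int dyber_caps_has_dma(uint32_t caps)       { return (caps & 0x1u) != 0; }
static inline int dyber_caps_has_masking(uint32_t caps)   { return (caps & 0x2u) != 0; }
static inline int dyber_caps_has_fi_detect(uint32_t caps) { return (caps & 0x4u) != 0; }
```

A driver probe routine could read these three registers once at initialization and refuse to bind if the core type or major version is not one it recognizes.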

Command Protocol

The command protocol follows a write-trigger model. The host CPU configures the operation, loads input data, and writes the command register to start execution.

// Example: ML-KEM-768 Encapsulation

// 1. Verify core is idle
while (read(STATUS) & 0x1) { }   // Wait for busy=0

// 2. Set parameter (ML-KEM-768)
write(PARAM, 0x03);               // Security level 3

// 3. Load public key into input buffer
write(INPUT_LEN, 1184);           // ML-KEM-768 public key size
for (i = 0; i < 1184; i += 4)
    write(DATA_IN + i, pk_word[i/4]);

// 4. Trigger encapsulation
write(CTRL, 0x02);                // OP_ENCAPS = 0x02

// 5. Wait for completion (poll or interrupt)
while (!(read(STATUS) & 0x6)) { } // Wait for done=1 or error=1
if (read(STATUS) & 0x4)
    return read(ERROR);           // Failed: non-zero code, see Error Codes

// 6. Read ciphertext + shared secret
ct_len = read(OUTPUT_LEN);        // 1088 + 32 = 1120 bytes
for (i = 0; i < ct_len; i += 4)
    output[i/4] = read(DATA_OUT + i);

Operation codes (CTRL register bits [7:0]):

Code  Operation       Applicable Cores
0x01  KeyGen          MLKEM, MLDSA, SLH
0x02  Encapsulate     MLKEM, HKEM
0x03  Decapsulate     MLKEM, HKEM
0x04  Sign            MLDSA, SLH
0x05  Verify          MLDSA, SLH, SBOOT
0x06  Hash            SHA3-HASH
0x07  XOF Squeeze     SHAKE-XOF
0x10  NTT Forward     NTT-R2/R4/R8/R16/R32
0x11  NTT Inverse     NTT-R2/R4/R8/R16/R32
0x20  Self-Test (KAT) All cores
0x21  Zeroize         All cores with key buffer, KMU
0xFF  Soft Reset      All cores
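
The operation codes map naturally onto a C enumeration, and the CTRL layout (Bit[7:0] = operation, Bit[15:8] = parameter set) onto a small packing helper. This is a sketch under assumptions: the identifiers are hypothetical, and the source does not state how the CTRL parameter-set field relates to the PARAM register, so the helper only packs the two fields.

```c
#include <stdint.h>

/* Operation codes for CTRL Bit[7:0]. Names are illustrative. */
typedef enum {
    DYBER_OP_KEYGEN      = 0x01,
    DYBER_OP_ENCAPS      = 0x02,
    DYBER_OP_DECAPS      = 0x03,
    DYBER_OP_SIGN        = 0x04,
    DYBER_OP_VERIFY      = 0x05,
    DYBER_OP_HASH        = 0x06,
    DYBER_OP_XOF_SQUEEZE = 0x07,
    DYBER_OP_NTT_FWD     = 0x10,
    DYBER_OP_NTT_INV     = 0x11,
    DYBER_OP_SELF_TEST   = 0x20,
    DYBER_OP_ZEROIZE     = 0x21,
    DYBER_OP_SOFT_RESET  = 0xFF
} dyber_op_t;

/* Pack a CTRL word: operation in Bit[7:0], parameter set in Bit[15:8]. */
static inline uint32_t dyber_ctrl_word(dyber_op_t op, uint8_t param_set)
{
    return ((uint32_t)param_set << 8) | ((uint32_t)op & 0xFFu);
}
```

With a parameter set of zero, the packed word equals the bare operation code, which matches the `write(CTRL, 0x02)` call in the encapsulation example.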

Status & Interrupts

Two completion notification methods are supported:

Polling: Read the STATUS register until the done bit is set. For deterministic-latency operations (all except ML-DSA Sign), the exact cycle count is known in advance, so the host can perform other work and check at the expected completion time.
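
A polling loop is usually bounded so that a stuck core cannot hang the host. The sketch below is one possible shape, assuming the register window is reachable through a pointer to 32-bit words; the function name, the return convention, and the 8-bit mask applied to ERROR are assumptions made for illustration.

```c
#include <stdint.h>

#define DYBER_REG_STATUS   0x040u
#define DYBER_REG_ERROR    0x04Cu
#define DYBER_STATUS_DONE  (1u << 1)
#define DYBER_STATUS_ERROR (1u << 2)

/* Poll STATUS up to max_polls times.
 * Returns 0 when done is set, the ERROR code (1..255) when error is set,
 * or -1 if neither bit was seen within max_polls reads.
 * 'regs' points at the core base address, viewed as 32-bit words. */
static inline int dyber_wait_done(volatile const uint32_t *regs, uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        uint32_t status = regs[DYBER_REG_STATUS / 4u];
        if (status & DYBER_STATUS_ERROR) {
            /* Assumption: error codes fit in the low 8 bits of ERROR. */
            return (int)(regs[DYBER_REG_ERROR / 4u] & 0xFFu);
        }
        if (status & DYBER_STATUS_DONE) {
            return 0;
        }
    }
    return -1;
}
```

For deterministic-latency operations, `max_polls` can be derived from the known cycle count plus a margin, so a timeout indicates a real fault and not a slow operation.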

Interrupt-driven: Enable the completion interrupt via IRQ_ENABLE, then wait for the hardware interrupt. The ISR reads IRQ_STATUS to determine the event type and clears the interrupt by writing 1 to the corresponding bit. Interrupt coalescing (raise interrupt after N completions) is available for DMA-mode operation.

Interrupt signals: Each IP core drives a single active-high level interrupt output (irq_o) that is asserted when any enabled interrupt condition is active. The interrupt remains asserted until the host clears all pending conditions in IRQ_STATUS.
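
The interrupt flow described above (read IRQ_STATUS, act on the event, write 1 to clear) might look like the following in a handler. This is a sketch with hypothetical names, not the reference driver's ISR; it services only the conditions enabled in IRQ_ENABLE, since those are the ones that drive irq_o.

```c
#include <stdint.h>

#define DYBER_REG_IRQ_STATUS 0x044u
#define DYBER_REG_IRQ_ENABLE 0x048u
#define DYBER_IRQ_DONE  (1u << 0)  /* operation complete */
#define DYBER_IRQ_ERROR (1u << 1)  /* error */
#define DYBER_IRQ_FAULT (1u << 2)  /* fault detect */

/* Acknowledge all pending, enabled interrupt conditions.
 * Returns the bitmask of conditions that were acknowledged so the caller
 * can dispatch on DYBER_IRQ_DONE / DYBER_IRQ_ERROR / DYBER_IRQ_FAULT. */
static inline uint32_t dyber_irq_service(volatile uint32_t *regs)
{
    uint32_t pending = regs[DYBER_REG_IRQ_STATUS / 4u]
                     & regs[DYBER_REG_IRQ_ENABLE / 4u];
    if (pending != 0u) {
        /* Write-1-to-clear: on hardware this clears exactly these bits.
         * Against plain memory (e.g. a test array) it simply stores them. */
        regs[DYBER_REG_IRQ_STATUS / 4u] = pending;
    }
    return pending;
}
```

Because irq_o is level-sensitive, a handler that returns without clearing every enabled pending bit will be re-entered immediately, which is why the sketch clears the full pending mask in one write.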

Data Transfer Modes

Mode            Interface                Best For
Register I/O    AXI4-Lite / APB          Low-throughput, simple integration. CPU writes/reads data word-by-word through the data region.
DMA Burst       AXI4 Full                High-throughput. System DMA controller transfers data between system memory and IP data buffers.
Scatter-Gather  AXI4 Full + descriptors  Batch operations. IP autonomously processes a chain of operations from a descriptor ring.
Streaming       AXI4-Stream              Pipeline integration. Data flows through IP continuously without store-and-forward.

AXI4-Lite Interface

The AXI4-Lite interface is the primary control interface and is present on every IP core. It provides register-level access to all control, status, and data registers.

Signal         Direction  Width  Description
s_axi_aclk     Input      1      AXI clock
s_axi_aresetn  Input      1      Active-low reset
s_axi_awaddr   Input      12+    Write address
s_axi_awvalid  Input      1      Write address valid
s_axi_awready  Output     1      Write address ready
s_axi_wdata    Input      32     Write data
s_axi_wstrb    Input      4      Write byte enables
s_axi_wvalid   Input      1      Write data valid
s_axi_wready   Output     1      Write data ready
s_axi_bresp    Output     2      Write response (OKAY/SLVERR)
s_axi_bvalid   Output     1      Write response valid
s_axi_bready   Input      1      Write response ready
s_axi_araddr   Input      12+    Read address
s_axi_arvalid  Input      1      Read address valid
s_axi_arready  Output     1      Read address ready
s_axi_rdata    Output     32     Read data
s_axi_rresp    Output     2      Read response (OKAY/SLVERR)
s_axi_rvalid   Output     1      Read data valid
s_axi_rready   Input      1      Read data ready
irq_o          Output     1      Active-high level interrupt

AXI4-Stream Interface

Available on algorithm accelerators and hash cores for high-throughput streaming operation. Separate input and output stream ports.

Signal         Direction  Width       Description
s_axis_tdata   Input      64/128/256  Input data (configurable width)
s_axis_tkeep   Input      N/8         Byte qualifiers
s_axis_tlast   Input      1           End of packet/message
s_axis_tvalid  Input      1           Input valid
s_axis_tready  Output     1           Input ready (backpressure)
m_axis_tdata   Output     64/128/256  Output data
m_axis_tkeep   Output     N/8         Byte qualifiers
m_axis_tlast   Output     1           End of result
m_axis_tvalid  Output     1           Output valid
m_axis_tready  Input      1           Output ready (backpressure)

The AXI4-Stream data width is configurable at synthesis time (64, 128, or 256 bits). Wider data paths increase throughput at the cost of additional routing resources.
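
To make the width trade-off concrete, the sketch below computes how many stream beats a message occupies at a given tdata width and what tkeep value the final beat carries. The function names are hypothetical, and the tkeep calculation assumes valid bytes are packed into the low-order byte lanes of the last beat, which the source does not specify.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of beats needed to move 'len' bytes over a stream whose tdata
 * is 'width_bits' wide (64, 128, or 256). */
static inline size_t axis_beat_count(size_t len, unsigned width_bits)
{
    size_t bytes_per_beat = width_bits / 8u;
    return (len + bytes_per_beat - 1u) / bytes_per_beat;
}

/* tkeep value for the final beat (the one with tlast asserted).
 * Assumes len > 0 and that valid bytes occupy the low-order lanes.
 * tkeep has width_bits/8 bits, so 32 bits covers the 256-bit case. */
static inline uint32_t axis_last_tkeep(size_t len, unsigned width_bits)
{
    unsigned bytes_per_beat = width_bits / 8u;
    unsigned tail = (unsigned)(len % bytes_per_beat);
    unsigned valid = (tail == 0u) ? bytes_per_beat : tail;
    /* 64-bit intermediate avoids an undefined 32-bit shift when valid == 32. */
    return (uint32_t)((UINT64_C(1) << valid) - 1u);
}
```

For example, a 1184-byte ML-KEM-768 public key takes 148 beats at 64 bits but only 37 beats at 256 bits, with every byte lane valid on the final beat in both cases.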

APB Interface

AMBA APB interface for low-power and area-constrained designs. Provides the same register map as AXI4-Lite with reduced signal count and simpler protocol (no burst, no pipelining). Suitable for IoT and microcontroller applications using the NTT-R2 or ML-KEM-512 configurations.

DMA Descriptor Format

For scatter-gather DMA operation, each descriptor describes one cryptographic operation:

Descriptor (32 bytes):
Offset 0x00  [31:0]  Input buffer physical address (low 32 bits)
Offset 0x04  [31:0]  Input buffer physical address (high 32 bits)
Offset 0x08  [31:0]  Output buffer physical address (low 32 bits)
Offset 0x0C  [31:0]  Output buffer physical address (high 32 bits)
Offset 0x10  [15:0]  Input length (bytes)
              [23:16] Operation code (same as CTRL register)
              [31:24] Parameter set
Offset 0x14  [31:0]  Next descriptor address (low 32 bits)
Offset 0x18  [31:0]  Next descriptor address (high 32 bits)
Offset 0x1C  [0]     Interrupt on completion
              [1]     Last descriptor in chain
              [31:2]  Reserved
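
The descriptor layout above can be mirrored by a C struct of eight 32-bit words. The sketch below is illustrative only: the type, macro, and function names are hypothetical, and it assumes the host and the IP agree on little-endian word layout, which the source does not state.

```c
#include <stdint.h>

/* One scatter-gather descriptor, one field per 32-bit word. */
typedef struct {
    uint32_t in_addr_lo;    /* 0x00: input buffer address, low 32 bits   */
    uint32_t in_addr_hi;    /* 0x04: input buffer address, high 32 bits  */
    uint32_t out_addr_lo;   /* 0x08: output buffer address, low 32 bits  */
    uint32_t out_addr_hi;   /* 0x0C: output buffer address, high 32 bits */
    uint32_t len_op_param;  /* 0x10: [15:0] length, [23:16] op, [31:24] parameter set */
    uint32_t next_lo;       /* 0x14: next descriptor address, low 32 bits  */
    uint32_t next_hi;       /* 0x18: next descriptor address, high 32 bits */
    uint32_t flags;         /* 0x1C: [0] IRQ on completion, [1] last in chain */
} dyber_desc_t;

_Static_assert(sizeof(dyber_desc_t) == 32, "descriptor must be 32 bytes");

#define DYBER_DESC_IRQ  (1u << 0)
#define DYBER_DESC_LAST (1u << 1)

/* Fill a descriptor from 64-bit physical addresses and operation fields. */
static inline void dyber_desc_fill(dyber_desc_t *d,
                                   uint64_t in_addr, uint64_t out_addr,
                                   uint16_t in_len, uint8_t op, uint8_t param_set,
                                   uint64_t next_addr, uint32_t flags)
{
    d->in_addr_lo   = (uint32_t)(in_addr & 0xFFFFFFFFu);
    d->in_addr_hi   = (uint32_t)(in_addr >> 32);
    d->out_addr_lo  = (uint32_t)(out_addr & 0xFFFFFFFFu);
    d->out_addr_hi  = (uint32_t)(out_addr >> 32);
    d->len_op_param = (uint32_t)in_len
                    | ((uint32_t)op << 16)
                    | ((uint32_t)param_set << 24);
    d->next_lo      = (uint32_t)(next_addr & 0xFFFFFFFFu);
    d->next_hi      = (uint32_t)(next_addr >> 32);
    /* Bits [31:2] are reserved, so only the two defined flags are kept. */
    d->flags        = flags & (DYBER_DESC_IRQ | DYBER_DESC_LAST);
}
```

Note that the 16-bit length field limits a single descriptor to 65535 input bytes; larger inputs would have to be handled by whatever mechanism the core-specific documentation defines.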

Error Codes

Code  Name               Description
0x00  SUCCESS            Operation completed successfully
0x01  ERR_INVALID_OP     Unsupported operation code for this core
0x02  ERR_INVALID_PARAM  Invalid parameter set or security level
0x03  ERR_INPUT_LEN      Input data length does not match expected size
0x04  ERR_BUSY           Command issued while core is processing
0x05  ERR_VERIFY_FAIL    Signature verification failed (valid result, not an error)
0x06  ERR_SELF_TEST      Power-on self-test (KAT) failure
0x07  ERR_FAULT_DETECT   Fault injection detected; core locked
0x08  ERR_KEY_SLOT       Invalid key slot or key not present (KMU)
0x09  ERR_PERMISSION     Operation not permitted for this key slot (KMU)
0x0A  ERR_ENTROPY        Entropy source health test failure (QRNG)
0x0B  ERR_DMA            DMA transfer error (invalid address, bus error)
0xFF  ERR_INTERNAL       Internal error; contact Dyber support
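
For logging and diagnostics, the error codes can be translated to their names with a simple lookup. The following is a sketch with hypothetical function names; the codes and names come from the table above, and the second helper reflects the table's note that ERR_VERIFY_FAIL is a valid result and not a fault.

```c
#include <stdint.h>

/* Map an ERROR register code to its symbolic name. */
static inline const char *dyber_error_name(uint8_t code)
{
    switch (code) {
    case 0x00: return "SUCCESS";
    case 0x01: return "ERR_INVALID_OP";
    case 0x02: return "ERR_INVALID_PARAM";
    case 0x03: return "ERR_INPUT_LEN";
    case 0x04: return "ERR_BUSY";
    case 0x05: return "ERR_VERIFY_FAIL";
    case 0x06: return "ERR_SELF_TEST";
    case 0x07: return "ERR_FAULT_DETECT";
    case 0x08: return "ERR_KEY_SLOT";
    case 0x09: return "ERR_PERMISSION";
    case 0x0A: return "ERR_ENTROPY";
    case 0x0B: return "ERR_DMA";
    case 0xFF: return "ERR_INTERNAL";
    default:   return "UNKNOWN";  /* code not listed in the table */
    }
}

/* Nonzero if the code indicates a malfunction or misuse.
 * SUCCESS and ERR_VERIFY_FAIL (a legitimate "signature invalid" result)
 * are not treated as failures. */
static inline int dyber_error_is_failure(uint8_t code)
{
    return code != 0x00 && code != 0x05;
}
```

Separating "signature did not verify" from genuine failures lets calling code treat a rejected signature as an ordinary outcome while still escalating conditions such as ERR_FAULT_DETECT.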

Software Driver API

The reference C driver provides a high-level API that abstracts register-level operations:

/* Core initialization */
dyber_status_t dyber_init(dyber_ctx_t *ctx, uintptr_t base_addr);
dyber_status_t dyber_self_test(dyber_ctx_t *ctx);

/* ML-KEM operations */
dyber_status_t dyber_mlkem_keygen(dyber_ctx_t *ctx, uint8_t level,
                                  uint8_t *pk, uint8_t *sk);
dyber_status_t dyber_mlkem_encaps(dyber_ctx_t *ctx, uint8_t level,
                                  const uint8_t *pk,
                                  uint8_t *ct, uint8_t *ss);
dyber_status_t dyber_mlkem_decaps(dyber_ctx_t *ctx, uint8_t level,
                                  const uint8_t *sk, const uint8_t *ct,
                                  uint8_t *ss);

/* ML-DSA operations */
dyber_status_t dyber_mldsa_keygen(dyber_ctx_t *ctx, uint8_t level,
                                  uint8_t *pk, uint8_t *sk);
dyber_status_t dyber_mldsa_sign(dyber_ctx_t *ctx, uint8_t level,
                                const uint8_t *sk, const uint8_t *msg,
                                size_t msg_len, uint8_t *sig, size_t *sig_len);
dyber_status_t dyber_mldsa_verify(dyber_ctx_t *ctx, uint8_t level,
                                  const uint8_t *pk, const uint8_t *msg,
                                  size_t msg_len, const uint8_t *sig,
                                  size_t sig_len);

/* Key management (DYBER-KMU) */
dyber_status_t dyber_kmu_generate(dyber_ctx_t *ctx, uint16_t slot,
                                  dyber_key_type_t type, uint8_t level);
dyber_status_t dyber_kmu_zeroize(dyber_ctx_t *ctx, uint16_t slot);
dyber_status_t dyber_kmu_zeroize_all(dyber_ctx_t *ctx);

/* Cleanup */
void dyber_cleanup(dyber_ctx_t *ctx);

The driver API is architecture-independent — the same function signatures compile on x86-64, ARM64, RISC-V, or any platform with a C compiler and memory-mapped I/O. Platform-specific register access macros are isolated in a thin HAL (Hardware Abstraction Layer) that is provided for Linux, bare-metal, and FreeRTOS environments.
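
The idea of isolating register access in a thin HAL can be sketched as follows. This is not the shipped HAL: the source describes register access macros, whereas this sketch swaps in a table of function pointers for illustration, and every name in it (`dyber_hal_t`, `dyber_hal_mmio`, and so on) is hypothetical.

```c
#include <stdint.h>

/* Minimal HAL: the driver core calls read32/write32 and never touches
 * the bus directly. 'dev' is backend-specific state. */
typedef struct {
    uint32_t (*read32)(void *dev, uint32_t offset);
    void     (*write32)(void *dev, uint32_t offset, uint32_t value);
    void      *dev;
} dyber_hal_t;

/* Bare-metal style backend: 'dev' is the base of a memory-mapped window. */
static uint32_t dyber_mmio_read32(void *dev, uint32_t offset)
{
    return *(volatile uint32_t *)((uintptr_t)dev + offset);
}

static void dyber_mmio_write32(void *dev, uint32_t offset, uint32_t value)
{
    *(volatile uint32_t *)((uintptr_t)dev + offset) = value;
}

/* Build a HAL instance for a memory-mapped core at 'base'. */
static inline dyber_hal_t dyber_hal_mmio(void *base)
{
    dyber_hal_t hal;
    hal.read32  = dyber_mmio_read32;
    hal.write32 = dyber_mmio_write32;
    hal.dev     = base;
    return hal;
}
```

An operating-system backend would supply different `read32`/`write32` implementations (for example, wrappers around the platform's own I/O accessors) while the code above the HAL stays unchanged. Pointing the memory-mapped backend at an ordinary array also gives a cheap way to unit-test register sequences on a host machine.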

Complete API documentation including all function signatures, parameter descriptions, return codes, and usage examples is provided as a Doxygen-generated reference in the IP evaluation package.