QCORE-C1 Developer Guide
Register-level programming guide for the QCORE-C1 chiplet covering the MMIO register map, DMA configuration, interrupt handling, and firmware development for bare-metal, RTOS, and Linux environments.
Register Map Overview
The QCORE-C1 exposes a 64 KB MMIO register space accessible through the QLI management sideband interface or the JTAG debug port. All registers are 32-bit aligned and use little-endian byte ordering.
| Offset Range | Block | Description |
|---|---|---|
| 0x0000–0x00FF | System Control | Chip ID, version, global control, status, reset |
| 0x0100–0x01FF | QLI Controller | Link control/status, credits, counters, power states |
| 0x0200–0x02FF | NTT Array | NTT control, configuration, lane status, debug |
| 0x0300–0x03FF | Keccak Core | Hash control, mode selection, state access |
| 0x0400–0x04FF | Kyber FSM | Operation dispatch, parameter selection, status |
| 0x0500–0x05FF | DMA Engine | Descriptor rings, source/dest addresses, control |
| 0x0600–0x06FF | Interrupt Controller | Enable, status, clear, coalescing timers |
| 0x0700–0x07FF | Security | Tamper status, zeroization trigger, key management |
| 0x0800–0x0FFF | Performance Counters | Cycle counters, operation counts, stall metrics |
| 0x1000–0xFFFF | Reserved / SRAM Window | Direct SRAM access for debug (JTAG only) |
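The block boundaries in the table can be captured in a small lookup table, which is handy for validating offsets in debug tooling. The following is a minimal sketch, not part of any shipped library: the type `qcore_block_t` and the function `qcore_find_block` are hypothetical names introduced here for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per block; boundaries copied from the register map table. */
typedef struct {
    uint16_t first;     /* first offset in the block */
    uint16_t last;      /* last offset in the block (inclusive) */
    const char *name;   /* block name as listed in the table */
} qcore_block_t;

static const qcore_block_t qcore_blocks[] = {
    { 0x0000, 0x00FF, "System Control" },
    { 0x0100, 0x01FF, "QLI Controller" },
    { 0x0200, 0x02FF, "NTT Array" },
    { 0x0300, 0x03FF, "Keccak Core" },
    { 0x0400, 0x04FF, "Kyber FSM" },
    { 0x0500, 0x05FF, "DMA Engine" },
    { 0x0600, 0x06FF, "Interrupt Controller" },
    { 0x0700, 0x07FF, "Security" },
    { 0x0800, 0x0FFF, "Performance Counters" },
    { 0x1000, 0xFFFF, "Reserved / SRAM Window" },
};

/* Returns the block containing `offset`, or NULL if the offset is not
 * 32-bit aligned or lies outside the 64 KB register space. */
static const qcore_block_t *qcore_find_block(uint32_t offset)
{
    if (offset > 0xFFFFu || (offset & 0x3u) != 0)
        return NULL;
    for (size_t i = 0; i < sizeof qcore_blocks / sizeof qcore_blocks[0]; i++) {
        if (offset >= qcore_blocks[i].first && offset <= qcore_blocks[i].last)
            return &qcore_blocks[i];
    }
    return NULL;
}
```

For example, `qcore_find_block(0x0410)` resolves to the Kyber FSM entry, while an unaligned offset such as `0x0411` is rejected.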
System Control Registers
| Offset | Name | R/W | Reset | Description |
|---|---|---|---|---|
| 0x000 | CHIP_ID | RO | 0x44594231 | Chip identification ("DYB1" in ASCII) |
| 0x004 | CHIP_VERSION | RO | 0x00000905 | RTL version (major.minor in BCD) |
| 0x008 | CHIP_STATUS | RO | 0x00000000 | Bit 0: NTT ready, Bit 1: QLI up, Bit 2: Keccak ready, Bit 3: SRAM initialized |
| 0x00C | CHIP_CTRL | RW | 0x00000000 | Bit 0: Soft reset, Bit 1: Clock gate override, Bits 7:4: Power mode |
| 0x010 | CHIP_FEATURES | RO | 0x0000001F | Feature flags: NTT[0], Keccak[1], CBD[2], QLI[3], DMA[4] |
| 0x014 | NTT_CONFIG | RO | 0x00080010 | Bits 23:16 = NTT lanes (8), Bits 7:0 = Radix (16) |
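The read-only identification and status registers are typically decoded once at start-up. The sketch below shows one way to do that on raw register values. All function and macro names are hypothetical. The version layout (major in bits 15:8, minor in bits 7:0, two BCD digits each) is an assumption inferred from the reset value 0x00000905 and is not confirmed by the table.

```c
#include <stdbool.h>
#include <stdint.h>

/* CHIP_STATUS (0x008) bits, from the table above. */
#define CHIP_STATUS_NTT_READY    (1u << 0)
#define CHIP_STATUS_QLI_UP       (1u << 1)
#define CHIP_STATUS_KECCAK_READY (1u << 2)
#define CHIP_STATUS_SRAM_INIT    (1u << 3)
#define CHIP_STATUS_ALL_READY    0x0Fu

/* Unpack CHIP_ID into a NUL-terminated 4-character tag,
 * most significant byte first (0x44594231 -> "DYB1"). */
static void qcore_chip_id_str(uint32_t chip_id, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((chip_id >> (24 - 8 * i)) & 0xFFu);
    out[4] = '\0';
}

/* Convert one byte holding two BCD digits to its decimal value. */
static unsigned bcd_byte(uint32_t b)
{
    return ((b >> 4) & 0xFu) * 10u + (b & 0xFu);
}

/* ASSUMPTION: major version in bits 15:8, minor version in bits 7:0. */
static unsigned qcore_version_major(uint32_t v) { return bcd_byte((v >> 8) & 0xFFu); }
static unsigned qcore_version_minor(uint32_t v) { return bcd_byte(v & 0xFFu); }

/* True once NTT, QLI, Keccak, and SRAM all report ready. */
static bool qcore_is_ready(uint32_t chip_status)
{
    return (chip_status & CHIP_STATUS_ALL_READY) == CHIP_STATUS_ALL_READY;
}

/* Return a CHIP_CTRL value with the power mode field (bits 7:4)
 * replaced and all other bits preserved. */
static uint32_t qcore_set_power_mode(uint32_t chip_ctrl, uint32_t mode)
{
    return (chip_ctrl & ~0xF0u) | ((mode & 0xFu) << 4);
}
```

A start-up routine would usually compare CHIP_ID against 0x44594231 first and refuse to continue on a mismatch, since every later register access depends on talking to the right device.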
Operation Dispatch
ML-KEM operations are submitted through the Kyber FSM registers using a command-response model: write the operation parameters, write the command register to trigger the operation, then poll the status register (or wait for an interrupt) until it completes.
```c
// ML-KEM-768 Key Generation: register-level example.
// write_reg()/read_reg() are the platform's 32-bit MMIO accessors;
// seed[] holds the 32-byte seed as eight 32-bit words.

// 1. Select parameter set
write_reg(0x0400, 0x00000003);        // KYBER_PARAM = ML-KEM-768 (k=3)

// 2. Set output buffer addresses (SRAM-relative)
write_reg(0x0408, 0x00002000);        // KYBER_PK_ADDR = public key output
write_reg(0x040C, 0x00004000);        // KYBER_SK_ADDR = secret key output

// 3. Provide seed (32 bytes via KYBER_SEED registers 0x0420–0x043F)
for (int i = 0; i < 8; i++)
    write_reg(0x0420 + i * 4, seed[i]);

// 4. Trigger KeyGen
write_reg(0x0404, 0x00000001);        // KYBER_CMD = KEYGEN

// 5. Wait for completion
while (!(read_reg(0x0410) & 0x01))    // KYBER_STATUS.done
    ;                                 // Or use interrupt

// 6. Read result status
uint32_t status = read_reg(0x0410);
// Bit 0: done, Bit 1: error, Bits 7:4: error code
```
| Command | Value | Description |
|---|---|---|
| KEYGEN | 0x01 | Generate ML-KEM key pair |
| ENCAPS | 0x02 | Encapsulate (requires public key + randomness) |
| DECAPS | 0x03 | Decapsulate (requires secret key + ciphertext) |
| NTT_FWD | 0x10 | Raw forward NTT on polynomial (debug) |
| NTT_INV | 0x11 | Raw inverse NTT on polynomial (debug) |
| HASH | 0x20 | SHA-3/SHAKE hash operation (debug) |
| ZEROIZE | 0xFF | Zeroize all key material in SRAM |
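The command values above and the KYBER_STATUS bit layout from the key generation example (bit 0 done, bit 1 error, bits 7:4 error code) can be wrapped in a few typed helpers. This is a sketch only: the enum and function names are hypothetical, and treating the error bit as taking precedence over the done bit is an assumption, since the guide does not say whether both can be set together.

```c
#include <stdbool.h>
#include <stdint.h>

/* KYBER_CMD values, from the command table. */
typedef enum {
    KYBER_CMD_KEYGEN  = 0x01,
    KYBER_CMD_ENCAPS  = 0x02,
    KYBER_CMD_DECAPS  = 0x03,
    KYBER_CMD_NTT_FWD = 0x10,
    KYBER_CMD_NTT_INV = 0x11,
    KYBER_CMD_HASH    = 0x20,
    KYBER_CMD_ZEROIZE = 0xFF
} kyber_cmd_t;

/* KYBER_STATUS (0x0410) bits. */
#define KYBER_STATUS_DONE   (1u << 0)
#define KYBER_STATUS_ERROR  (1u << 1)

typedef enum { KYBER_PENDING, KYBER_OK, KYBER_FAILED } kyber_result_t;

/* Classify a raw KYBER_STATUS value. On failure, the 4-bit error code
 * from bits 7:4 is stored in *error_code.
 * ASSUMPTION: the error bit wins if done and error are both set. */
static kyber_result_t kyber_classify(uint32_t status, uint8_t *error_code)
{
    if (status & KYBER_STATUS_ERROR) {
        *error_code = (uint8_t)((status >> 4) & 0xFu);
        return KYBER_FAILED;
    }
    return (status & KYBER_STATUS_DONE) ? KYBER_OK : KYBER_PENDING;
}

/* True for the commands the table marks as debug-only. */
static bool kyber_cmd_is_debug(kyber_cmd_t cmd)
{
    return cmd == KYBER_CMD_NTT_FWD ||
           cmd == KYBER_CMD_NTT_INV ||
           cmd == KYBER_CMD_HASH;
}
```

A polling loop can call `kyber_classify` on each status read and stop as soon as the result is no longer `KYBER_PENDING`, which avoids spinning forever if the hardware reports an error without setting the done bit.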
DMA Engine
The DMA engine transfers data between the QLI interface and internal SRAM without CPU intervention. It supports scatter-gather descriptor rings for efficient bulk key exchange operations.
| Offset | Name | R/W | Description |
|---|---|---|---|
| 0x500 | DMA_CTRL | RW | Bit 0: Enable, Bit 1: Direction (0=RX, 1=TX), Bit 4: Scatter-gather |
| 0x504 | DMA_STATUS | RO | Bit 0: Idle, Bit 1: Active, Bit 2: Error, Bits 15:8: Pending descriptors |
| 0x508 | DMA_SRC_ADDR | RW | Source address (QLI address or SRAM offset) |
| 0x50C | DMA_DST_ADDR | RW | Destination address |
| 0x510 | DMA_LENGTH | RW | Transfer length in bytes (max 4096) |
| 0x514 | DMA_DESC_BASE | RW | Scatter-gather descriptor ring base (SRAM offset) |
| 0x518 | DMA_DESC_COUNT | RW | Number of descriptors in ring (max 32) |
| 0x51C | DMA_DESC_HEAD | RW | Producer index (host writes) |
| 0x520 | DMA_DESC_TAIL | RO | Consumer index (hardware updates) |
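The bit fields and limits in the DMA register table translate into a handful of small helpers. The sketch below is illustrative and uses hypothetical names throughout. Two points are assumptions not stated in the table: that a zero-length transfer is invalid, and that the producer index wraps to zero after reaching DMA_DESC_COUNT - 1. The descriptor layout itself is not documented here, so it is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* DMA_CTRL (0x500) bits and documented limits. */
#define DMA_CTRL_ENABLE   (1u << 0)
#define DMA_CTRL_DIR_TX   (1u << 1)   /* 0 = RX, 1 = TX */
#define DMA_CTRL_SG       (1u << 4)
#define DMA_MAX_LENGTH    4096u
#define DMA_MAX_DESC      32u

/* Build a DMA_CTRL value with the engine enabled. */
static uint32_t dma_ctrl_value(bool tx, bool scatter_gather)
{
    uint32_t v = DMA_CTRL_ENABLE;
    if (tx)
        v |= DMA_CTRL_DIR_TX;
    if (scatter_gather)
        v |= DMA_CTRL_SG;
    return v;
}

/* Check a DMA_LENGTH value against the 4096-byte maximum.
 * ASSUMPTION: a length of zero is rejected as well. */
static bool dma_length_valid(uint32_t len)
{
    return len != 0 && len <= DMA_MAX_LENGTH;
}

/* Extract the pending descriptor count from DMA_STATUS bits 15:8. */
static uint32_t dma_status_pending(uint32_t status)
{
    return (status >> 8) & 0xFFu;
}

/* Advance a ring index by one slot.
 * ASSUMPTION: indices wrap to 0 after count - 1; count must be 1..32. */
static uint32_t dma_ring_next(uint32_t index, uint32_t count)
{
    return (index + 1u) % count;
}
```

For a scatter-gather transmit, for instance, `dma_ctrl_value(true, true)` yields 0x13, which would be written to DMA_CTRL after the descriptor ring base and count have been programmed.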
Interrupt Controller
| Bit | Name | Description |
|---|---|---|
| 0 | KYBER_DONE | ML-KEM operation completed |
| 1 | KYBER_ERROR | ML-KEM operation error |
| 2 | DMA_DONE | DMA transfer completed |
| 3 | DMA_ERROR | DMA transfer error |
| 4 | QLI_LINK_DOWN | QLI link lost |
| 5 | QLI_CRC_ERROR | QLI CRC error detected |
| 6 | SRAM_ECC_ERROR | SRAM ECC correction or detection |
| 7 | TAMPER_DETECT | Security tamper event |
| Offset | Name | R/W | Description |
|---|---|---|---|
| 0x600 | INT_STATUS | RO | Active interrupt flags |
| 0x604 | INT_ENABLE | RW | Interrupt enable mask |
| 0x608 | INT_CLEAR | W1C | Write 1 to clear interrupt |
| 0x60C | INT_COALESCE | RW | Bits 15:0 = timer (μs), Bits 23:16 = count threshold |
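A typical service routine reads INT_STATUS, keeps only the sources that are enabled, and acknowledges them through the write-1-to-clear register. The sketch below illustrates that flow under stated assumptions: `intc` is a pointer to the interrupt controller block (offset 0x600) as mapped by the platform, and the function names are hypothetical. Whether INT_STATUS already reflects the enable mask is not specified, so the code masks explicitly.

```c
#include <stdint.h>

/* Interrupt source bits, from the interrupt table. */
#define INT_KYBER_DONE      (1u << 0)
#define INT_KYBER_ERROR     (1u << 1)
#define INT_DMA_DONE        (1u << 2)
#define INT_DMA_ERROR       (1u << 3)
#define INT_QLI_LINK_DOWN   (1u << 4)
#define INT_QLI_CRC_ERROR   (1u << 5)
#define INT_SRAM_ECC_ERROR  (1u << 6)
#define INT_TAMPER_DETECT   (1u << 7)

/* Word indices within the interrupt controller block (base 0x600). */
enum { INT_STATUS_IDX = 0, INT_ENABLE_IDX = 1, INT_CLEAR_IDX = 2, INT_COALESCE_IDX = 3 };

/* Read the active, enabled interrupt sources and acknowledge exactly
 * those via INT_CLEAR. Returns the acknowledged mask so the caller can
 * dispatch on individual bits. */
static uint32_t qcore_irq_service(volatile uint32_t *intc)
{
    uint32_t pending = intc[INT_STATUS_IDX] & intc[INT_ENABLE_IDX];
    if (pending != 0)
        intc[INT_CLEAR_IDX] = pending;   /* W1C: clear only what we handle */
    return pending;
}

/* Pack an INT_COALESCE value: timer in microseconds in bits 15:0,
 * count threshold in bits 23:16. */
static uint32_t qcore_coalesce_value(uint16_t timer_us, uint8_t count)
{
    return (uint32_t)timer_us | ((uint32_t)count << 16);
}
```

Clearing only the bits that were read avoids losing an interrupt that fires between the status read and the acknowledge write. The caller would then test the returned mask against `INT_KYBER_DONE`, `INT_DMA_DONE`, and so on.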
Firmware Development
Bare-Metal
For bare-metal environments, the libqcore C library provides direct register access through memory-mapped I/O. It includes a complete ML-KEM API, DMA management, and interrupt handling, has no external dependencies, and compiles with any C99 compiler.
```c
// Bare-metal ML-KEM-768 example (C)
#include <assert.h>
#include <stdint.h>
#include <string.h>

#include <libqcore/qcore.h>
#include <libqcore/mlkem.h>

int main(void) {
    qcore_t *dev = qcore_init(QCORE_BASE_ADDR);

    // Generate keypair
    mlkem_keypair_t kp;
    mlkem_keygen(dev, MLKEM_768, &kp);

    // Encapsulate
    mlkem_encaps_result_t enc;
    mlkem_encaps(dev, MLKEM_768, kp.pk, &enc);

    // Decapsulate
    uint8_t ss[32];
    mlkem_decaps(dev, MLKEM_768, enc.ct, kp.sk, ss);

    // Verify that both sides derived the same shared secret
    assert(memcmp(enc.ss, ss, 32) == 0);

    qcore_zeroize(dev);  // Wipe key material
    return 0;
}
```
Linux Kernel Driver
The Linux kernel module (qcore_c1.ko) exposes the QCORE-C1 as a character device (/dev/qcore0) with ioctl commands for ML-KEM operations. The driver supports multi-process access with per-context key isolation and integrates with the Linux Crypto API (AF_ALG) for transparent application-layer use.
RTOS Integration
The libqcore library is RTOS-agnostic. For FreeRTOS, Zephyr, or ThreadX environments, register the QCORE-C1 interrupt handler and use the provided semaphore-based completion API for non-blocking operation dispatch.