Paper title
Tb/s Polar Successive Cancellation Decoder 16nm ASIC Implementation
Paper authors
Paper abstract
This work presents an efficient ASIC implementation of a successive cancellation (SC) decoder for polar codes. SC is a low-complexity, depth-first search decoding algorithm, well suited to beyond-5G applications that require extremely high throughput and low power. The ASIC implementation of SC in this work exploits a number of techniques, including pipelining and unrolling, to achieve Tb/s data throughput without compromising power and area metrics. To reduce implementation complexity, an adaptive log-likelihood ratio (LLR) quantization scheme is used. This scheme optimizes the bit precision of the internal LLRs within the range of 1-5 bits by considering the irregular polarization and entropy of the LLR distribution in the SC decoder. The performance cost of this scheme is less than 0.2 dB when the code block length is 1024 bits and the payload is 854 bits. Furthermore, some computations in SC occupy a large area with a high degree of parallelization, while others take more time steps. To optimize these computations and reduce both memory and latency, a register reduction/balancing (R-RB) method is used. The final decoder architecture is called optimized polar SC (OPSC). Post-place-and-route results in 16nm FinFET ASIC technology show that the OPSC decoder achieves 1.2 Tb/s coded throughput on 0.79 mm$^2$ area with 0.95 pJ/bit energy efficiency.
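For readers unfamiliar with SC decoding, the depth-first recursion mentioned in the abstract can be illustrated with a minimal software sketch. This is not the OPSC architecture: it is a floating-point, recursive reference model with no unrolling, pipelining, or adaptive quantization. The min-sum approximation in `f`, the natural (non-bit-reversed) bit ordering, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math

def f(a, b):
    """Check-node update (min-sum approximation, assumed here)."""
    return math.copysign(1.0, a) * math.copysign(1.0, b) * min(abs(a), abs(b))

def g(a, b, u):
    """Variable-node update given the already-decided partial-sum bit u."""
    return b + (1 - 2 * u) * a

def polar_encode(u):
    """Encode u with the recursive polar transform, natural bit order."""
    n = len(u)
    if n == 1:
        return list(u)
    half = n // 2
    left = polar_encode(u[:half])
    right = polar_encode(u[half:])
    return [l ^ r for l, r in zip(left, right)] + right

def sc_decode(llr, frozen):
    """Depth-first SC decoding.

    llr    : channel LLRs (positive means bit 0 is more likely)
    frozen : booleans, True where the input bit is frozen to 0
    Returns (u_hat, x_hat): decided input bits and their re-encoded
    partial sums, which the parent node needs for its g update.
    """
    n = len(llr)
    if n == 1:
        u = 0 if (frozen[0] or llr[0] >= 0) else 1
        return [u], [u]
    half = n // 2
    a, b = llr[:half], llr[half:]
    # Visit the left child first, using only f-type updates.
    u_left, x_left = sc_decode([f(p, q) for p, q in zip(a, b)], frozen[:half])
    # The right child depends on the left child's decisions.
    u_right, x_right = sc_decode(
        [g(p, q, s) for p, q, s in zip(a, b, x_left)], frozen[half:]
    )
    x_hat = [l ^ r for l, r in zip(x_left, x_right)] + x_right
    return u_left + u_right, x_hat

# Toy usage: length-8 code with a hypothetical frozen set, noiseless LLRs.
frozen_demo = [True, True, True, False, True, False, False, False]
u_demo = [0, 0, 0, 1, 0, 1, 1, 0]
llr_demo = [4.0 if bit == 0 else -4.0 for bit in polar_encode(u_demo)]
u_hat_demo, _ = sc_decode(llr_demo, frozen_demo)
```

The serial dependency visible in `sc_decode`, where the right child cannot start until the left child has returned its partial sums, is what limits SC throughput in software. Unrolling and pipelining in hardware are ways to keep many codewords in flight across these stages at once.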