Paper Title
CV 3315 Is All You Need: Semantic Segmentation Competition
Paper Authors
Paper Abstract
This competition focuses on urban-scene segmentation based on the vehicle camera view. The highly class-imbalanced urban-scene image dataset challenges existing solutions and motivates further study. Deep convolutional neural network-based semantic segmentation methods, such as encoder-decoder architectures and multi-scale, pyramid-based approaches, have become flexible solutions applicable to real-world applications. In this competition, we mainly review the literature and conduct experiments on transformer-driven methods, especially SegFormer, to achieve an optimal trade-off between performance and efficiency. For example, SegFormer-B0 achieved 74.6% mIoU at the lowest computational cost, 15.6 GFLOPs, while the largest model, SegFormer-B5, achieved 80.2% mIoU. Based on multiple factors, including analysis of individual failure cases, per-class performance, training pressure, and efficiency estimation, the final candidate model for the competition is SegFormer-B2, with 50.6 GFLOPs and 78.5% mIoU evaluated on the test set. Check out our code implementation at https://vmv.re/cv3315.
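The abstract reports results as mIoU (mean intersection over union). As a minimal sketch of how that metric is commonly computed, and not the authors' evaluation code, the following pure-Python example builds a confusion matrix from flattened label maps and averages the per-class IoU. The function names and the ignore label value of 255 are assumptions made for illustration; the source does not specify them.

```python
from typing import List, Optional, Sequence


def confusion_matrix(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,  # assumed ignore label, not taken from the source
) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not contribute to the score
        matrix[t][p] += 1
    return matrix


def per_class_iou(matrix: List[List[int]]) -> List[Optional[float]]:
    """IoU = TP / (TP + FP + FN) per class; None if the class never appears."""
    n = len(matrix)
    ious: List[Optional[float]] = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        ious.append(tp / denom if denom > 0 else None)
    return ious


def mean_iou(ious: Sequence[Optional[float]]) -> float:
    """Average IoU over the classes that are present in the data."""
    present = [v for v in ious if v is not None]
    return sum(present) / len(present) if present else 0.0
```

For example, with predictions `[0, 0, 1, 1, 2, 2]`, ground truth `[0, 1, 1, 1, 2, 255]`, and three classes, the per-class IoU values are 1/2, 2/3, and 1, so the mean is 13/18, or about 72.2%. Because every class counts equally in the average, a rare class that is segmented poorly lowers mIoU as much as a frequent one, which is why the metric suits class-imbalanced datasets.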