Paper Title
GenText: Unsupervised Artistic Text Generation via Decoupled Font and Texture Manipulation
Paper Authors
Paper Abstract
Automatic artistic text generation is an emerging topic that has received increasing attention due to its wide range of applications. Artistic text can be decomposed into three components: content, font, and texture. Existing artistic text generation models usually focus on manipulating only one of these components, which is a sub-optimal solution for controllable, general artistic text generation. To remedy this issue, we propose a novel approach, namely GenText, which achieves general artistic text style transfer by separately migrating the font and texture styles from different source images to the target image in an unsupervised manner. Specifically, our work incorporates three stages, stylization, destylization, and font transfer, into a unified framework with a single shared encoder network and two separate style generator networks: one for font transfer, the other for stylization and destylization. The destylization stage first extracts the font style of the font reference image; the font transfer stage then generates the target content in the desired font style. Finally, the stylization stage renders the resulting font image with the texture style of the reference image. Moreover, considering the difficulty of acquiring paired artistic text images, our model is designed for the unsupervised setting, where all stages can be effectively optimized from unpaired data. Qualitative and quantitative experiments on artistic text benchmarks demonstrate the superior performance of the proposed model. The code and models will be made publicly available.
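The three-stage pipeline described in the abstract (destylization → font transfer → stylization, with one shared encoder and two generator heads) can be sketched schematically. This is a toy numpy illustration of the staged composition only, not the paper's actual networks; all dimensions, function names, and the linear-map "networks" are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # toy feature dimension (hypothetical; real model uses conv feature maps)

# Single shared encoder used by every stage (the single-encoder design).
W_enc = rng.standard_normal((D, D)) * 0.1
def encode(img):
    return np.tanh(img @ W_enc)

# Two separate generator heads: one for font transfer, one for (de)stylization.
W_font = rng.standard_normal((2 * D, D)) * 0.1
W_tex = rng.standard_normal((2 * D, D)) * 0.1

def font_generator(content_feat, font_feat):
    # Combines content features with a font-style feature.
    return np.tanh(np.concatenate([content_feat, font_feat], axis=-1) @ W_font)

def texture_generator(glyph_feat, texture_feat):
    # Paints a texture style onto a plain glyph feature.
    return np.tanh(np.concatenate([glyph_feat, texture_feat], axis=-1) @ W_tex)

def gentext_pipeline(content_img, font_ref_img, texture_ref_img):
    # Stage 1 (destylization): extract the font style of the font reference.
    font_style = encode(font_ref_img)
    # Stage 2 (font transfer): render the target content in that font.
    glyph = font_generator(encode(content_img), font_style)
    # Stage 3 (stylization): apply the texture reference's style to the glyph.
    return texture_generator(glyph, encode(texture_ref_img))

out = gentext_pipeline(rng.standard_normal(D),
                       rng.standard_normal(D),
                       rng.standard_normal(D))
print(out.shape)  # → (16,)
```

The point of the sketch is the wiring: font and texture come from two different reference images, so either style can be swapped independently of the other, which is what makes the transfer "decoupled".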