Paper title
Universal linguistic inductive biases via meta-learning
Paper authors
Paper abstract
How do learners acquire languages from the limited data available to them? This process must involve some inductive biases - factors that affect how a learner generalizes - but it is unclear which inductive biases can explain observed patterns in language acquisition. To facilitate computational modeling aimed at addressing this question, we introduce a framework for giving particular linguistic inductive biases to a neural network model; such a model can then be used to empirically explore the effects of those inductive biases. This framework disentangles universal inductive biases, which are encoded in the initial values of a neural network's parameters, from non-universal factors, which the neural network must learn from data in a given language. The initial state that encodes the inductive biases is found with meta-learning, a technique through which a model discovers how to acquire new languages more easily via exposure to many possible languages. By controlling the properties of the languages that are used during meta-learning, we can control the inductive biases that meta-learning imparts. We demonstrate this framework with a case study based on syllable structure. First, we specify the inductive biases that we intend to give our model, and then we translate those inductive biases into a space of languages from which a model can meta-learn. Finally, using existing analysis techniques, we verify that our approach has imparted the linguistic inductive biases that it was intended to impart.
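The abstract's core idea is that a shared initialization, found by meta-learning across many possible languages, can encode a universal inductive bias, while per-language knowledge is acquired by ordinary learning from that starting point. The toy sketch below illustrates this mechanism with a first-order MAML-style (Reptile) update on a one-parameter model; the task family, model, and hyperparameters are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """Sample a 'language': a scalar linear map y = a * x, with a ~ N(2, 0.1).

    The shared structure across tasks (a is clustered near 2) plays the role
    of a universal property; the exact value of a is language-specific.
    """
    a = rng.normal(2.0, 0.1)
    x = rng.uniform(-1.0, 1.0, size=20)
    return x, a * x

def inner_adapt(w, x, y, lr=0.1, steps=5):
    """Acquire one task: a few gradient steps on mean squared error."""
    for _ in range(steps):
        grad = 2.0 * np.mean((w * x - y) * x)
        w = w - lr * grad
    return w

# Meta-training: repeatedly nudge the shared initialization toward each
# task's adapted solution (the Reptile first-order approximation to MAML).
w_init = 0.0
meta_lr = 0.1
for _ in range(500):
    x, y = sample_task()
    w_task = inner_adapt(w_init, x, y)
    w_init += meta_lr * (w_task - w_init)

# The learned initialization settles near the task family's mean (a ≈ 2),
# so a new task can be acquired quickly from very little data.
print(w_init)
```

In the paper's terms, `w_init` is the initial state encoding the inductive bias, and `inner_adapt` is learning a particular language; controlling the distribution in `sample_task` controls which bias meta-learning imparts.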