论文标题

最长的常见子序列:子序列概率的表格与闭合形式方程计算

Longest Common Subsequence: Tabular vs. Closed-Form Equation Computation of Subsequence Probability

论文作者

Abdi, Alireza, Hooshmand, Mohsen

论文摘要

最长的常见子序列问题(LCS)涉及在给定的一组字符组中找到最长的子序列。 LCS问题是一个NP硬性问题,它使其成为努力使用启发式方法找到更好解决方案的目标。最著名的启发式函数的基线是一种表格随机的,概率的方法。该方法近似于每次迭代中LC的长度。光束搜索和基于概率的启发式方法的结合导致了解决LCS问题的算法中的大量建议和成就。在这项工作中,我们首次引入了概率表计算的封闭式方程。此外,我们提出了其他相应的封闭形式方程式,并证明了所有方程。封闭形式方程为分析和进一步近似打开了新的方式。使用定理和光束搜索,我们提出了一种分析方法,用于估计其余子序列的LCS长度。此外,我们根据变异系数提出了另一种启发式功能。结果表明,我们所提出的方法的表现优于LCS问题的最新方法。

The Longest Common Subsequence Problem (LCS) deals with finding the longest subsequence among a given set of strings. The LCS problem is an NP-hard problem which makes it a target for lots of effort to find a better solution with heuristics methods. The baseline for most famous heuristics functions is a tabular random, probabilistic approach. This approach approximates the length of the LCS in each iteration. The combination of beam search and tabular probabilistic-based heuristics has led to a large number of proposals and achievements in algorithms for solving the LCS problem. In this work, we introduce a closed-form equation of the probabilistic table calculation for the first time. Moreover, we present other corresponding forms of the closed-form equation and prove all of them. The closed-form equation opens new ways for analysis and further approximations. Using the theorems and beam search, we propose an analytic method for estimating the length of the LCS of the remaining subsequence. Furthermore, we present another heuristic function based on the Coefficient of Variation. The results show that our proposed methods outperform the state-of-the-art methods on the LCS problem.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源