Paper Title

Duplication with transposition distance to the root for $q$-ary strings

Authors

Nikita Polyanskii, Ilya Vorobyev

Abstract

We study the duplication with transposition distance between strings of length $n$ over a $q$-ary alphabet and their roots. In other words, we investigate the number of duplication operations of the form $x = (abcd) \to y = (abcbd)$, where $x$ and $y$ are strings and $a$, $b$, $c$ and $d$ are their substrings, needed to get a $q$-ary string of length $n$ starting from the set of strings without duplications. For exact duplication, we prove that the maximal distance between a string of length at most $n$ and its root has the asymptotic order $n/\log n$. For approximate duplication, where a $\beta$-fraction of symbols may be duplicated incorrectly, we show that the maximal distance has a sharp transition from the order $n/\log n$ to $\log n$ at $\beta=(q-1)/q$. The motivation for this problem comes from genomics, where such duplications represent a special kind of mutation and the distance between a given biological sequence and its root is the smallest number of transposition mutations required to generate the sequence.
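
To make the operation $x = (abcd) \to y = (abcbd)$ concrete, here is a minimal brute-force sketch in Python for the exact case. It is an illustration only, not the authors' method: the function names are hypothetical, and it assumes that $b$ must be non-empty while $a$, $c$ and $d$ may be empty (so tandem duplication is a special case). The search is exponential and only suitable for very short strings.

```python
from collections import deque


def duplicate(s, i, j, k):
    """Apply one exact duplication with transposition.

    With a = s[:i], b = s[i:j], c = s[j:k], d = s[k:], this maps abcd to abcbd.
    Assumption: b is non-empty; a, c and d may be empty.
    """
    if not (0 <= i < j <= k <= len(s)):
        raise ValueError("need 0 <= i < j <= k <= len(s)")
    return s[:k] + s[i:j] + s[k:]


def predecessors(y):
    """Return every x such that a single duplication turns x into y."""
    result = set()
    n = len(y)
    for start in range(1, n):                # start of the inserted copy of b
        for end in range(start + 1, n + 1):  # end (exclusive) of that copy
            b = y[start:end]
            # The original b must lie entirely before the inserted copy.
            if y.find(b, 0, start) != -1:
                result.add(y[:start] + y[end:])
    return result


def distance_to_root(y):
    """Smallest number of duplications that produce y from a duplication-free string.

    Breadth-first search backwards over predecessors (brute force).
    """
    seen = {y}
    queue = deque([(y, 0)])
    while queue:
        s, dist = queue.popleft()
        preds = predecessors(s)
        if not preds:  # s contains no duplication, so it is a root
            return dist
        for x in preds:
            if x not in seen:
                seen.add(x)
                queue.append((x, dist + 1))
    raise RuntimeError("unreachable: every string reduces to a root")
```

For example, `duplicate("abcd", 1, 2, 3)` copies `b` past `c` and yields `"abcbd"`, and `distance_to_root("abcbd")` undoes that single step. Under the stated assumptions, a duplication-free string is one in which no symbol repeats.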
