How to remove duplicate characters strictly using regexp in Tcl
  • Time: 2012-05-22 05:44:39
  • Tags:
  • regex
  • tcl

How can I remove duplicate characters from a string strictly using regexp in Tcl? For example, I have the string aabbcddeffghh and I need only the characters "abcdefgh". I tried lsort -unique, and I am able to get the unique characters:

join [lsort -unique [split $mystring {}]]

But I need to do this using only the regexp command.

Best answer

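You cannot remove all non-consecutive duplicate characters from a string using only Tcl's regsub command. It does not support backreferences inside lookahead sequences, which means that any removal scheme will run into problems with overlapping match regions.

The simplest fix is to wrap the call in a while loop (with an empty body), using the fact that regsub returns the number of substitutions performed when it is given a variable to store the result in (the last argument below):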

set str "mississippi mud pie"
while {[regsub -all {(.)(.*)\1+} $str {\1\2} str]} {}
puts $str;          # Prints "misp ude"
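
As a minimal sketch, the loop above can be wrapped in a small helper so it can be reused on other strings. The proc name dedupe is hypothetical and not part of the original answer; the pattern and the empty-body while loop are the ones shown above.

```tcl
# Hypothetical helper: repeat the substitution until regsub reports
# zero replacements, i.e. no character occurs twice any more.
proc dedupe {str} {
    # (.)   captures one character
    # (.*)  captures everything up to a later run of that same character
    # \1+   is that later run, which the replacement {\1\2} drops
    while {[regsub -all {(.)(.*)\1+} $str {\1\2} str]} {}
    return $str
}

puts [dedupe "aabbcddeffghh"]        ;# abcdefgh
puts [dedupe "mississippi mud pie"]  ;# misp ude
```

Each pass keeps the first occurrence of a character and removes a later run of it, so the result preserves the order in which characters first appear.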
Other answers

Try this:

regsub -linestop -lineanchor -all {([a-z])\1+} $subject {\1} result

regsub -linestop -nocase -lineanchor -all {([a-z])\1+} $subject {\1} result

Explanation

{
(           # Match the regular expression below and capture its match into backreference number 1
   [a-z]       # Match a single character in the range between “a” and “z”
)
\1         # Match the same text as most recently matched by capturing group number 1
   +           # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
}
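
A short sketch of what this pattern does and does not handle: it collapses runs of the same letter, which is enough for the string in the question, but it leaves non-adjacent repeats alone. The -linestop and -lineanchor switches are left out here because the pattern uses neither ".", "^" nor "$"; the variable names are only for illustration.

```tcl
# Adjacent duplicates only: every run of one letter shrinks to a single letter.
set subject "aabbcddeffghh"
regsub -all {([a-z])\1+} $subject {\1} result
puts $result   ;# abcdefgh

# Non-adjacent repeats survive, because \1 must follow the group directly.
set subject2 "mississippi"
regsub -all {([a-z])\1+} $subject2 {\1} result2
puts $result2  ;# misisipi
```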
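regsub -all {(.)(?=.*\1)} $subject {} result

It uses a lookahead to check whether there are more instances of the character. If there are, it removes the character.

You always keep the last occurrence of each character. Lookbehind is not possible in Tcl without additional libraries.

More about lookaround: Regex Tutorial - Lookahead and Lookbehind Zero-Width Assertions (http://www.regular-expressions.info/lookaround.html)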


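Edit: hmm... this seems to be a bug with backreferences inside lookahead in Tcl 8.5. One form matches, but the lookahead form does not; it complains about "Invalid backreference number". Without backreferences in lookahead, I cannot see any solution.

It may just be the version I tested (ideone.com/pFS0_). I could not find any other online Tcl interpreter to test with.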
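
To check how a given interpreter behaves, the lookahead pattern can be run under catch, as in the sketch below. It assumes, as the edit above reports for Tcl 8.5, that the regular expression fails to compile; the Tcl re_syntax documentation also states that lookahead constraints may not contain back references, so this looks like a documented restriction rather than a bug. The exact error text may vary between versions, so it is printed rather than assumed.

```tcl
set subject "aabbcddeffghh"

# catch returns a non-zero code if regsub raises an error,
# for example when the pattern cannot be compiled.
set rc [catch {regsub -all {(.)(?=.*\1)} $subject {} result} msg]

if {$rc} {
    puts "regsub failed: $msg"
} else {
    puts "result: $result"
}
```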




