regex behaving unexpected
  • Date: 2012-05-27 19:27:15
  • Tags: python, regex

The script:

import re

matches = ['hello', 'hey', 'hi', 'hiya']

def check_match(string):
    # Prints one line per pattern, so every input produces four lines.
    for item in matches:
        if re.search(item, string):
            print('Match found: ' + string)
        else:
            print('Match not found: ' + string)

check_match('hey')
check_match('hello there')
check_match('this should not match')
check_match('oh, hiya')

Output:

Match not found: hey
Match found: hey
Match not found: hey
Match not found: hey
Match found: hello there
Match not found: hello there
Match not found: hello there
Match not found: hello there
Match not found: this should not match
Match not found: this should not match
Match found: this should not match
Match not found: this should not match
Match not found: oh, hiya
Match not found: oh, hiya
Match found: oh, hiya
Match found: oh, hiya

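There are several things I don't understand. For starters, each string is searched four times in this output; some return two found matches, some return three. I'm not sure what in my code causes this. Could someone try to see what's wrong?

The expected output is this: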

Match found: hey
Match found: hello there
Match not found: this should not match
Match found: oh, hiya
Answers

It isn't behaving incorrectly; you have misunderstood what re.search(...) does.

See the comments after each line of output:

Match not found: hey                    # because 'hello' is not in 'hey'
Match found: hey                        # because 'hey' is in 'hey'
Match not found: hey                    # because 'hi' is not in 'hey'
Match not found: hey                    # because 'hiya' is not in 'hey'

Match found: hello there                # because 'hello' is in 'hello there'
Match not found: hello there            # because 'hey' is not in 'hello there'
Match not found: hello there            # because 'hi' is not in 'hello there'
Match not found: hello there            # because 'hiya' is not in 'hello there'

Match not found: this should not match  # because 'hello' is not in 'this should not match'
Match not found: this should not match  # because 'hey' is not in 'this should not match'
Match found: this should not match      # because 'hi' is in 'this should not match'
Match not found: this should not match  # because 'hiya' is not in 'this should not match'

Match not found: oh, hiya               # because 'hello' is not in 'oh, hiya'
Match not found: oh, hiya               # because 'hey' is not in 'oh, hiya'
Match found: oh, hiya                   # because 'hi' is in 'oh, hiya'
Match found: oh, hiya                   # because 'hiya' is in 'oh, hiya'
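
To see where those unexpected matches come from, it can help to inspect the match object rather than only testing it for truth. The following is a minimal sketch; the helper name where is made up for illustration and is not part of the original script. It reports the position at which re.search found the pattern.

```python
import re

def where(pattern, text):
    """Return (start, end, matched_text) for the first match, or None."""
    m = re.search(pattern, text)
    if m is None:
        return None
    return (m.start(), m.end(), m.group(0))

# re.search scans the whole string, so 'hi' is found inside the word 'this'.
print(where('hi', 'this should not match'))  # (1, 3, 'hi')
print(where('hi', 'oh, hiya'))               # (4, 6, 'hi')
print(where('hello', 'hey'))                 # None
```

The first result shows that the match sits inside "this", which is why a string that "should not match" is reported as found.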

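If you don't want the pattern hi to match when the input is oh, hiya, you should surround your pattern with word boundaries:

\bhi\b

This makes it match only occurrences of hi that are not surrounded by other letters (well hiya there does not match the pattern \bhi\b, but well hi there does).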
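
A short sketch of the word-boundary behaviour described above. The function name has_word_hi is invented for this example. Note the raw string: in an ordinary Python string literal, '\b' is a backspace character, not a regex word boundary.

```python
import re

# Raw string so that \b reaches the regex engine as "word boundary".
pattern = re.compile(r'\bhi\b')

def has_word_hi(text):
    """True if 'hi' occurs in text as a whole word."""
    return pattern.search(text) is not None

print(has_word_hi('well hi there'))          # True
print(has_word_hi('well hiya there'))        # False
print(has_word_hi('this should not match'))  # False
print(has_word_hi('oh, hiya'))               # False
```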

Try this. It is more concise, and it shows multiple matches:

import re

matches = ['hello', 'hey', 'hi', 'hiya']

def check_match(string):
    # Collect every pattern that occurs in the string as a whole word.
    results = [item for item in matches if re.search(r'\b%s\b' % item, string)]
    print('Found %s' % results if results else 'No match found')

check_match('hey')
check_match('hello there')
check_match('this should not match')
check_match('oh, hiya')
check_match('xxxxx xxx')
check_match('hello and hey')

This gives:

Found ['hey']
Found ['hello']
No match found
Found ['hiya']
No match found
Found ['hello', 'hey']

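You get 4 searches and 4 lines of output for each string because you are looping through an array, and for every element of the array you search and print something.

The loop checks the string against each of the matches and prints whether it was found for each one. What you really want is to see whether any of the matches is found, and then print a single "found" or "not found". I don't actually know Python, so the syntax may be off.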

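def check_match(string):
    # Remember whether any pattern matched, then print exactly once.
    found = False
    for item in matches:
        if re.search(item, string):
            found = True
            break  # one match is enough
    if found:
        print('Match found: ' + string)
    else:
        print('Match not found: ' + string)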
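
The same idea can be sketched in Python with the built-in any(). Two things here are my assumptions rather than part of the answer above: the patterns are wrapped in \b word boundaries so that hi no longer matches inside "this", and re.escape is applied in case a pattern ever contains regex metacharacters. The function also returns its line instead of printing it, purely so the result is easy to check.

```python
import re

matches = ['hello', 'hey', 'hi', 'hiya']

def check_match(string):
    """Return a single result line for the given string."""
    # any() stops at the first pattern that matches as a whole word.
    found = any(
        re.search(r'\b%s\b' % re.escape(item), string) for item in matches
    )
    return ('Match found: ' if found else 'Match not found: ') + string

for s in ['hey', 'hello there', 'this should not match', 'oh, hiya']:
    print(check_match(s))
```

With these assumptions, each input yields one line, and "this should not match" is reported as not found.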
