Python regex does not match

I am trying to extract a file name using regex. The file names are in the list files, and the pattern to be matched is songTitle.

files = listdir(curdir)
        print("Pattern : %s" % songTitle)
        for songs in files:
            print(songs)
            re_found = re.match(re.escape(songTitle) + r'.*.mp3$', songs)
            if re_found:
                FileName = re_found.group()
                print(FileName)
                break

In this example, files contains:

['.DS_Store', '__init__.py', 'command_line.py', "Skrillex & Diplo - 'Mind' feat. Kai (Official Video)-fDrTbLXHKu8.mp3"]

songTitle (pattern to be matched): Skrillex & Diplo - 'Mind' feat. Kai (Official Video)

Output :

Pattern : Jack Ü - Take Ü There feat. Kiesza [OFFICIAL VIDEO]
.DS_Store
__init__.py
command_line.py
Jack Ü - Take Ü There feat. Kiesza [OFFICIAL VIDEO]-C9slkeFXogU.mp3
Skrillex & Diplo - 'Mind' feat. Kai (Official Video)-fDrTbLXHKu8.mp3

EDIT:

I ran some tests and realised that the problem is caused by non-ASCII characters, such as the ‘Ü’ in this case.
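
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the .DS_Store entry suggests a macOS volume, and some file systems return names in decomposed Unicode form, where ‘Ü’ is stored as ‘U’ followed by a combining diaeresis. If songTitle holds the precomposed character, the two strings print identically but do not compare equal, so re.match fails. A minimal sketch that normalizes both sides before matching (the helper name find_song is hypothetical, and the decomposed file name is simulated):

```python
import re
import unicodedata


def find_song(song_title, files):
    """Return the first .mp3 name in files that starts with song_title, or None.

    Both strings are normalized to NFC first so that precomposed and
    decomposed spellings of the same character compare equal.
    """
    title = unicodedata.normalize("NFC", song_title)
    pattern = re.compile(re.escape(title) + r".*\.mp3$")
    for name in files:
        if pattern.match(unicodedata.normalize("NFC", name)):
            return name  # return the name exactly as it was listed
    return None


# Simulate a decomposed (NFD) file name, as a file system might report it.
title = "Jack \u00dc - Take \u00dc There feat. Kiesza [OFFICIAL VIDEO]"
files = [".DS_Store", unicodedata.normalize("NFD", title + "-C9slkeFXogU.mp3")]

print(find_song(title, files) is not None)
```

If this is indeed the cause, the normalized comparison finds the file while a plain re.match on the raw strings does not. The function returns the name as listed, so it can still be used to open the file.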

The regular expression actually looks fine, but the problem is in your indentation and in the if statement. Try this:

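import re
from os import curdir, listdir

# songTitle is assumed to be defined earlier, as in the question.
files = listdir(curdir)
print(files)
print("Pattern : %s" % songTitle)
for songs in files:
    re_found = re.match(re.escape(songTitle) + r'.*.mp3$', songs)
    if re_found:
        FileName = re_found.group()
        print(FileName)
        break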

Also, when writing regular expression literals, you should generally put an ‘r’ prefix before the string literal (making it a raw string); otherwise you’ll need to escape the backslashes.
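
To illustrate the raw-string point, here is a short sketch. The sample file name is taken from the question; the variable names are for illustration only. It also shows why escaping the dot before mp3 is a little stricter, since an unescaped . matches any character:

```python
import re

name = "Skrillex & Diplo - 'Mind' feat. Kai (Official Video)-fDrTbLXHKu8.mp3"

# Without the r prefix, a backslash meant for the regex engine must be doubled.
plain = re.compile(".*\\.mp3$")
# With the r prefix, the backslash reaches the regex engine unchanged.
raw = re.compile(r".*\.mp3$")

print(plain.pattern == raw.pattern)  # both literals spell the same pattern
print(bool(raw.match(name)))

# An unescaped dot matches any character, so this pattern accepts "song-mp3".
print(bool(re.match(r".*.mp3$", "song-mp3")))
# With the dot escaped, only a literal ".mp3" suffix is accepted.
print(bool(re.match(r".*\.mp3$", "song-mp3")))
```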
