Python regular expressions - matching rules

Date: 2019-12-07  Tags: python, regular expressions, matching rules

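A regular expression is a special sequence of characters that lets you conveniently check whether a string matches a given pattern.
Python has included the re module since version 1.5; it provides Perl-style regular expression patterns.
The re module gives Python full regular expression functionality.
The compile function builds a regular expression object from a pattern string and optional flags. That object has a set of methods for regular expression matching and substitution.
The re module also provides module-level functions that do exactly what these methods do; they take a pattern string as their first argument.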
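
As a minimal sketch of the two styles described above, the snippet below compiles a pattern once and calls methods on the resulting object, then does the same work through the module-level functions. The pattern, the variable names and the sample string are made up for illustration.

```python
import re

# Hypothetical pattern and text, chosen only for illustration.
digits = re.compile(r'\d+')
text = 'python=9999,c=8888'

# Methods on the compiled pattern object.
print(digits.findall(text))       # ['9999', '8888']
print(digits.sub('N', text))      # python=N,c=N

# Module-level functions: the pattern string is the first argument.
print(re.findall(r'\d+', text))   # ['9999', '8888']
print(re.sub(r'\d+', 'N', text))  # python=N,c=N
```

Compiling is mainly worthwhile when the same pattern is reused many times; for one-off calls the module-level functions read just as well.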

re.match tries to match a pattern at the start of the string; if the pattern does not match at the start, match() returns None.

re.search scans the whole string and returns the first successful match.
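
The difference between the two functions can be sketched on a single string. This is an illustrative example, not part of the original walkthrough; it also shows one way to guard against None before calling .group().

```python
import re

text = 'cn.wuyanzu'

at_start = re.match(r'wuyanzu', text)    # the pattern is not at position 0
anywhere = re.search(r'wuyanzu', text)   # the pattern occurs later in the string

print(at_start)            # None
print(anywhere.group())    # wuyanzu
print(anywhere.span())     # (3, 10)

# Calling .group() on None raises AttributeError, so check the result first.
for m in (at_start, anywhere):
    print(m.group() if m else 'no match')
```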

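Regular expression patterns:
A pattern string uses a special syntax to represent a regular expression:
Letters and digits stand for themselves. A letter or digit in a pattern matches the same character in the string.
Most letters and digits take on a different meaning when preceded by a backslash.
Many punctuation characters have a special meaning and match themselves only when escaped.
The backslash itself must be escaped with a backslash.
Because regular expressions usually contain backslashes, it is best to write them as raw strings. A pattern element such as r'\t' (equivalent to '\\t') matches the corresponding special character.
The examples below exercise the special elements of the pattern syntax. If you supply optional flags along with a pattern, the meaning of some pattern elements changes.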
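
A short sketch of the escaping rules and of how flags change matching. The sample strings are invented for illustration; re.IGNORECASE and re.DOTALL are two of the standard flags in the re module.

```python
import re

# In an ordinary string literal '\t' is one tab character, while the raw
# string r'\t' is a backslash followed by 't', the same as '\\t'.
tab_len, raw_len = len('\t'), len(r'\t')
print(tab_len, raw_len, r'\t' == '\\t')   # 1 2 True

# Given to re, either form matches a tab character.
print(bool(re.match(r'\t', '\tx')), bool(re.match('\t', '\tx')))  # True True

# Matching a literal backslash takes two backslashes in the pattern.
backslash = re.search(r'\\', r'a\b').group()
print(len(backslash))   # 1

# Flags change what pattern elements mean.
print(re.match(r'wuyanzu', 'WuYanZu'))                         # None
print(re.match(r'wuyanzu', 'WuYanZu', re.IGNORECASE).group())  # WuYanZu
print(re.match(r'a.b', 'a\nb'))                                # None, '.' skips '\n'
print(re.match(r'a.b', 'a\nb', re.DOTALL) is not None)         # True
```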

import re
result=re.search(r'wuyanzu','cn.wuyanzu') # search scans the whole string and returns the first match
a=result.group()
print(a)
result=re.match(r'wuyanzu','wuyanzu.cn') # match succeeds only if the pattern matches at the start of the string
b=result.group()
print(b)
result=re.match(r'.','M') # '.' matches any single character except a newline
c=result.group()
print(c)
result=re.match(r't.o','too') # '.' matches any single character, so 't.o' matches 'too'
d=result.group()
print(d)
result=re.match(r'[x]','xs') # '[]' matches any one of the characters listed inside the brackets
e=result.group()
print(e)
result=re.match(r'\d','92bkbijk') # '\d' matches a digit, i.e. 0-9
f=result.group()
print(f)
result=re.match(r'\D','M6151nmbhj') # '\D' matches a non-digit
g=result.group()
print(g)
result=re.match(r'\s',' M') # '\s' matches whitespace, e.g. a space or a tab
h=result.group()
print(h)
result=re.match(r'\S','11 M') # '\S' matches a non-whitespace character
i=result.group()
print(i)
result=re.match(r'\w','xM') # '\w' matches a word character, e.g. a-z, A-Z, 0-9, _
j=result.group()
print(j)
result=re.match(r'\W',' M') # '\W' matches a non-word character
k=result.group()
print(k)
result=re.match(r'[A-Z][a-z]*','M') # '*' matches the preceding element zero or more times, i.e. it is optional; \d* is equivalent to \d{0,}
l=result.group()
print(l)
result=re.match(r'[A-Z][a-z]*','MnnM') # '*' again: the match stops at the second uppercase letter, giving 'Mnn'
m=result.group()
print(m)
result=re.match(r'[A-Z]+[a-z]','MnnM') # '+' matches the preceding element one or more times; \d+ is equivalent to \d{1,}
m1=result.group()
print(m1)
result=re.match(r'[0-9]?[1-9]','33') # '?' matches the preceding element zero or one time; \d? is equivalent to \d{0,1}
n=result.group()
print(n)
result=re.match(r'[0-9]?[^1-9]','00') # '^' inside '[]' negates the set: [^1-9] matches any character other than 1-9
o=result.group()
print(o)
result=re.match(r'[1-9]?[0-9]','09') # '?' again: [1-9]? matches nothing here, so only '0' is matched
p=result.group()
print(p)
result=re.match(r'[a-zA-Z0-9_]{8,20}','12345sdwdfac') # '{m,n}' matches the preceding element m to n times
q=result.group()
print(q)
result=re.match(r'[a-zA-Z0-9_]{6}','12345sdwdfac') # '{m}' matches the preceding element exactly m times
r=result.group()
print(r)
result=re.match(r'[\w]{4,20}@163\.com','272263915@163.com') # '{m,n}' again; the dot is escaped so it matches only a literal '.'
s=result.group()
print(s)
result=re.findall(r'\d+','python=9999,c=8888,c++=12345') # findall returns every run of digits as a list
t=result
print(t)
result=re.sub(r'\d+','998','c=8888') # sub replaces each run of digits with '998'
u=result
print(u)
result=re.split(r'\:','info:xiaoming')   # split on ':'; a pattern such as ':| ' would split on ':' or a space
v=result
print(v)
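xxx=re.match(r'^4.*[369]$','46516516') # '^' anchors the start and '$' the end: starts with 4, ends with 3, 6 or 9
sss=xxx.group()
print(sss)
www=re.match(r'\.\*','.*').group() # escaped '.' and '*' match the literal characters
print(www)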
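mylist=['apple','banana','pen','orange']
for i in mylist:
    match_obj=re.match(r'apple|pen',i) # '|' matches either the left or the right alternative
    if match_obj:
        print(match_obj.group())
    else:
        print('no match')

mail=['aaa@163.com','bbb@126.com','ccc@qq.com']
for i in mail:
    match_obj=re.match(r'\w{3,20}@(163|126)\.com$',i) # '()' groups the alternatives and captures what matched
    if match_obj:
        print(match_obj.group())
        print(match_obj.group(1)) # text captured by the first group: '163' or '126'

    else:
        print('no match')

tel=re.match(r'(0[1-9][0-9]{1,2})-?(\d{6,8})','010-123456').group() # area code, optional '-', then 6 to 8 digits
print(tel)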
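
The telephone pattern above has two capture groups, but the example prints only the whole match. As a small sketch, the individual groups can be read back like this (the variable name m is chosen here for illustration):

```python
import re

m = re.match(r'(0[1-9][0-9]{1,2})-?(\d{6,8})', '010-123456')

print(m.group())    # 010-123456, the whole match (same as m.group(0))
print(m.group(1))   # 010, the area code captured by the first group
print(m.group(2))   # 123456, the number captured by the second group
print(m.groups())   # ('010', '123456'), all captured groups as a tuple
```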