Python Regex


Here are some code snippets for common regex tasks in Python.

When using regex in Python, you can either pass the pattern string directly to a function in the re module, or compile the pattern into a pattern object first and call methods on that object. I'll use the compiled approach on this page.
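As a rough sketch of the two styles, the snippet below runs the same search once through the module-level re.search function and once through a compiled pattern. The sample text and the pattern are made up for illustration only.

```python
import re

# Hypothetical sample text and pattern, used only for illustration.
text = 'Download gltf20-reference-guide.pdf today'
pattern = r'[\w-]+\.pdf'

# Style 1: pass the pattern string to the module-level function.
direct = re.search(pattern, text)

# Style 2: compile once, then call methods on the compiled object.
compiled = re.compile(pattern)
reused = compiled.search(text)

print(direct.group(0))  # gltf20-reference-guide.pdf
print(reused.group(0))  # gltf20-reference-guide.pdf
```

Both calls give the same result. Compiling mostly pays off when the same pattern is reused many times, and it gives the pattern a name.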

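Regex patterns are usually written as raw string literals. To write a raw string, add the letter r before the opening quote. Python then leaves backslashes in the string untouched instead of treating them as escape sequences, so the regex engine sees them exactly as typed. The compile example below uses one.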
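To see what the r prefix actually changes, here is a small sketch comparing a normal literal with a raw one. The variable names and the sample string are made up for illustration.

```python
import re

# In a normal literal every backslash has to be doubled;
# in a raw literal the pattern is written as-is.
normal = '\\d+\\.pdf'
raw = r'\d+\.pdf'
same = normal == raw
print(same)  # True

# '\n' is a single newline character, r'\n' is a backslash followed by n.
lengths = (len('\n'), len(r'\n'))
print(lengths)  # (1, 2)

# The raw pattern works as expected with the re module.
found = re.search(raw, 'see 42.pdf').group(0)
print(found)  # 42.pdf
```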

Compile a regular expression:

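import re
# Group 1: the .pdf file name in the value attribute. Group 2: the text after the tag.
# The characters <, > and " have no special meaning in a regex, so they need no backslash.
__PDF = re.compile(r'<.*value="(.*?\.pdf)">(.*?)<')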

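This regular expression finds links to .pdf files in the HTML of this page: Reference Guides - The Khronos Group Inc. Group 1 captures the file name from the value attribute, and group 2 captures the text that follows the tag.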

Single match

match = __PDF.match('<option value="gltf20-reference-guide.pdf">glTF 2.0</option>')
match.group(0) # '<option value="gltf20-reference-guide.pdf">glTF 2.0<'
match.group(1) # 'gltf20-reference-guide.pdf'
match.group(2) # 'glTF 2.0'

Note that group 0 is the entire match.

If there is no match, match() returns None instead of a match object, so check the result before calling group().
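A minimal sketch of guarding against None is shown below. It also shows that match() only matches at the very start of the string, whereas search() scans the whole string. The names PDF_LINK and pdf_name are hypothetical stand-ins, and the index.html line is invented test input.

```python
import re

# Same pattern as above, under a hypothetical name so the snippet is self-contained.
PDF_LINK = re.compile(r'\<.*(?:value)=\"(.*?\.pdf)\"\>(.*?)\<')

def pdf_name(line):
    """Return the .pdf file name found at the start of line, or None."""
    m = PDF_LINK.match(line)
    if m is None:
        return None
    return m.group(1)

line = '<option value="gltf20-reference-guide.pdf">glTF 2.0</option>'
print(pdf_name(line))                                        # gltf20-reference-guide.pdf
print(pdf_name('<option value="index.html">Home</option>'))  # None

# match() anchors at the start of the string, so leading spaces make it fail.
indented = '  ' + line
print(PDF_LINK.match(indented))            # None
print(PDF_LINK.search(indented).group(1))  # gltf20-reference-guide.pdf
```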

Iterate multiple matches

# content is a string holding the HTML source of the page
for match in __PDF.finditer(content):
    print('Found a pdf link: {}'.format(match.group(1)))

Output:

Found a pdf link: collada_reference_card_1_4.pdf
Found a pdf link: gltf20-reference-guide.pdf
Found a pdf link: egl-1-4-quick-reference-card.pdf
Found a pdf link: opencl30-reference-guide.pdf
Found a pdf link: opencl22-reference-guide.pdf
Found a pdf link: opencl21-reference-guide.pdf
Found a pdf link: opencl20-quick-reference-card.pdf
Found a pdf link: opencl-1-2-quick-reference-card.pdf
...
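If you want to try the loop without downloading the page, here is a self-contained sketch. The content string is a made-up stand-in for the real HTML (the titles and the index.html entry are invented), and PDF_LINK is a hypothetical name for the same pattern.

```python
import re

# Same pattern as above, under a hypothetical name so the snippet is self-contained.
PDF_LINK = re.compile(r'\<.*(?:value)=\"(.*?\.pdf)\"\>(.*?)\<')

# Made-up stand-in for the page source, one tag per line.
content = '''<select>
<option value="gltf20-reference-guide.pdf">glTF 2.0</option>
<option value="index.html">Home</option>
<option value="opencl30-reference-guide.pdf">OpenCL 3.0</option>
</select>'''

# finditer() yields one match object per non-overlapping match.
links = [(m.group(1), m.group(2)) for m in PDF_LINK.finditer(content)]

for filename, title in links:
    print(f'{title}: {filename}')
# glTF 2.0: gltf20-reference-guide.pdf
# OpenCL 3.0: opencl30-reference-guide.pdf
```

Because . does not match a newline by default, each line is matched on its own here. With several tags on one line, the greedy .* at the start of the pattern would skip ahead to the last one on that line.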