Tutorial9: Regular Expressions

18 bytes added, 10:14, 13 March 2021
INVESTIGATION 1: SIMPLE & COMPLEX REGULAR EXPRESSIONS
# Issue the following Linux command:<br><span style="color:blue;font-weight:bold;font-family:courier;">grep -w -i "the$" textfile1.txt</span><br><br>The '''$''' symbol is used to anchor patterns at the <u>end</u> of the string.<br><br>
# Issue the following Linux command to anchor the <u>word</u> "'''the'''"<br>'''simultaneously''' at the <u>beginning</u> and <u>end</u> of the string:<br><span style="color:blue;font-weight:bold;font-family:courier;">grep -w -i "^the$" textfile1.txt </span><br><br>What do you notice?<br><br>Anchoring patterns at both the <u>beginning</u> and <u>end</u> of strings makes<br>search pattern matching far more '''precise'''.<br><br>We will now demonstrate the '''effectiveness''' of <u>combining</u><br> '''anchors''' with <u>other</u> complex regular expression symbols.<br><br><table align="right"><tr valign="top"><td>[[Image:regexps-4.png|thumb|right|280px|Anchoring regular expressions using '''period''' symbols at the '''beginning''' of text.]]</td><td>[[Image:regexps-5.png|thumb|right|250px|Anchoring regular expressions using '''period''' symbols simultaneously at the '''beginning''' and '''ending''' of text.]]</td></tr></table>
# Issue the following Linux command to match strings that '''begin with 3 characters''':<br><span style="color:blue;font-weight:bold;font-family:courier;">grep "^..." textfile1.txt</span><br><br>What do you notice? Can lines that contain '''fewer than 3 characters''' be displayed?<br><br>
# Issue the following Linux command to match strings that consist of '''exactly 3 characters''' (anchored at both the <u>beginning</u> <u>and</u> <u>end</u>):<br><span style="color:blue;font-weight:bold;font-family:courier;">grep "^...$" textfile1.txt</span><br><br>What do you notice compared to the previous command?<br><br>
# Issue the following Linux command to match strings that '''begin with 3 digits''':<br><span style="color:blue;font-weight:bold;font-family:courier;">grep "^[0-9][0-9][0-9]" textfile1.txt</span><br><br>What did you notice?<br><br>
# Issue the following Linux command to match strings that '''end with 3 uppercase letters''':<br><span style="color:blue;font-weight:bold;font-family:courier;">grep "[A-Z][A-Z][A-Z]$" textfile1.txt</span><br><br><table align="right"><tr valign="top"><td>[[Image:regexps-6.png|thumb|right|220px|Anchoring '''3 digits''' at the '''beginning''' and '''ending''' of text.]]</td><td>[[Image:regexps-7.png|thumb|right|250px|Anchoring '''3 alpha-numeric characters''' at the '''beginning''' and '''ending''' of text.]]</td></tr></table>Did any lines match this pattern?<br><br>
# Issue the following Linux command to match strings that consist of only 3 digits:<br><span style="color:blue;font-weight:bold;font-family:courier;">grep "^[0-9][0-9][0-9]$" textfile1.txt</span><br><br>What did you notice?<br><br>
# Issue the following Linux command to match strings that consist of only 3 alphanumeric characters:<br><span style="color:blue;font-weight:bold;font-family:courier;">grep "^[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]$" textfile1.txt</span><br><br>What did you notice?<br><br>The <span style="font-weight:bold;font-family:courier;">"*"</span> complex regular expression symbol is often confused with the <span style="font-weight:bold;font-family:courier;">"*"</span> '''filename expansion''' symbol.<br>In a regular expression, it does NOT represent zero or more of '''any character'''; it represents zero or more '''occurrences'''<br>of the character that comes '''before''' the <span style="font-weight:bold;font-family:courier;">"*"</span> symbol.<br><br>
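The anchoring commands above, and the difference between the regular expression <span style="font-weight:bold;font-family:courier;">"*"</span> and the filename expansion <span style="font-weight:bold;font-family:courier;">"*"</span>, can be tried on a small throwaway file. The following is only a sketch: the sample lines are made up for illustration and are NOT the contents of '''textfile1.txt''', the variable names are hypothetical, and '''LC_ALL=C''' is set as an assumption so that ranges such as '''[a-z]''' behave predictably.

```shell
# Sketch only: the sample data below is hypothetical, not the tutorial's textfile1.txt.
export LC_ALL=C   # assumption: C locale keeps bracket ranges predictable

sample=$(mktemp)
printf '%s\n' 'The' 'the end' 'at the' '123' '123abc' 'ab' 'a-b' 'a1Z' 'gd' 'god' 'good' > "$sample"

# Word "the" anchored at BOTH ends: only a line that is exactly "the" (any case).
both_anchors=$(grep -w -i "^the$" "$sample")

# Lines that consist of exactly 3 characters of any kind.
three_chars=$(grep "^...$" "$sample")

# Lines that consist of exactly 3 digits.
three_digits=$(grep "^[0-9][0-9][0-9]$" "$sample")

# Lines that consist of exactly 3 alphanumeric characters ("a-b" is excluded).
three_alnum=$(grep "^[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]$" "$sample")

# Regex "*": zero or more occurrences of the PRECEDING character ("o" here),
# so this matches "gd", "god" and "good", not "g<anything>d".
star=$(grep "^go*d$" "$sample")

printf '%s\n' "-- ^the\$ --" "$both_anchors" "-- ^...\$ --" "$three_chars" \
              "-- 3 digits --" "$three_digits" "-- 3 alnum --" "$three_alnum" \
              "-- go*d --" "$star"

rm -f "$sample"
```

Comparing the '''^...$''' results with the 3-alphanumeric results shows how replacing each period with a character class narrows the match, and the last command shows that the regular expression <span style="font-weight:bold;font-family:courier;">"*"</span> only repeats the single character in front of it.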