
Tutorial11: Sed & Awk Utilities

1,064 bytes added, 12:19, 26 November 2020
INVESTIGATION 2: USING THE AWK UTILITY
:* Use the '''sed''' command to '''manipulate text''' contained in a file.
:* List and explain several '''instructions''' associated with the '''sed''' command.
:* Use the '''sed''' command as a '''filter''' with Linux pipeline commands.
:* Use the '''awk''' command to '''manipulate text''' contained in a file.
 
:* List and explain several '''comparison operators''' and variables associated with the '''awk''' command.
:* Use the '''awk''' command as a '''filter''' with Linux pipeline commands.
<br><br>
 
===Tutorial Reference Material===
| style="padding-left:15px;" |Text Manipulation
* [https://www.digitalocean.com/community/tutorials/the-basics-of-using-the-sed-stream-editor-to-manipulate-text-in-linux Purpose of using the sed utility]
* [https://www.digitalocean.com/community/tutorials/how-to-use-the-awk-language-to-manipulate-text-in-linux Purpose of using the awk utility]
| style="padding-left:15px;" |Man Pages
'''Notes:'''
*The sed command reads '''all lines in the input file'''; each line is exposed to the expression (contained in quotes).
'''Usage:'''
<span style="color:blue;font-weight:bold;font-family:courier;">awk options 'selection-criteria {action}' file-name</span>
'''Notes:'''
*The awk command reads '''all lines in the input file'''; each line is exposed to the expression (contained within quotes) for processing.
*If no pattern is specified, awk selects all lines in the input
*If no action is specified, awk copies the selected lines to standard output
*You can use parameters like '''$1''', '''$2''' to represent first field, second field, etc.
*You can use the '''-F''' option with the awk command to specify the field delimiter.
*You can use a '''regular expression''', enclosed within slashes, as a pattern.
*The '''~''' operator tests whether a field or variable matches a regular expression.
*The '''!~''' operator tests for no match.
*You can perform both numeric and string comparisons using relational operators
*You can combine any of the patterns using the Boolean operators '''||''' (OR) and '''&&''' (AND)
'''Examples:'''
<span style="font-family:courier;">'''awk 'NR == 3 {print}' text.txt''' (print 3rd line)<br>'''awk 'NR >= 1 && NR <= 5 {print}' text.txt''' (print lines 1 to 5)<br>'''awk '/NOTE:/ {print}' text.txt''' (print lines that contain the pattern: "NOTE:")<br><br>'''awk -F";" '$1 ~ /ford/ {print}' cars.dat''' (print records (of semi-colon delimited file) whose 1st field matches: "ford")<br>'''awk -F";" '$1 ~ /ford/ {print $2,$4}' cars.dat''' (same as above, but only print 2nd and 4th fields)<br><br></span>
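The notes above can be tried out without the course files. The following is a minimal sketch, assuming a small semicolon-delimited sample written to a temporary file; the records and the variable names (<code>tmp</code>, <code>ford_out</code>, <code>other_out</code>, <code>range_out</code>) are made up for illustration and are not the contents of cars.dat.

```shell
# Hypothetical sample data (NOT the course's cars.dat file)
tmp=$(mktemp)
printf '%s\n' \
  'ford;mustang;65;45;10000' \
  'chevy;malibu;99;60;3000' \
  'ford;ltd;83;15;10500' \
  'honda;accord;81;30;6000' > "$tmp"

# ~ operator: records whose 1st field matches "ford"; print fields 2 and 4
ford_out=$(awk -F';' '$1 ~ /ford/ {print $2,$4}' "$tmp")

# !~ operator: records whose 1st field does NOT match "ford"
other_out=$(awk -F';' '$1 !~ /ford/ {print $1}' "$tmp")

# && combines two comparisons; with no action, selected lines are copied as-is
range_out=$(awk 'NR >= 2 && NR <= 3' "$tmp")

printf '%s\n' "$ford_out" "$other_out" "$range_out"
rm -f "$tmp"
```

Note that the last command has no action in braces, so awk falls back to copying the selected lines to standard output.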
=INVESTIGATION 1: USING THE SED UTILITY=
# Issue the following linux command ('''copy and paste''' to save time):<br><span style="color:blue;font-weight:bold;font-family:courier;">wget <nowiki>https://ict.senecacollege.ca/~murray.saul/uli101/data.txt</nowiki></span><br><br>
# Issue the '''more''' command to quickly view the contents of the '''data.txt''' file.<br>When finished, exit the more command by pressing the letter <span style="color:blue;font-weight:bold;font-family:courier;">q</span><br><br>
# The '''p''' command in sed is used to print or display the contents of a text file.<br>Issue the following linux command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed 'p' data.txt</span><br><br>You should notice that each line appears '''twice'''. Without the '''-n option''', sed automatically displays every line it reads, and the '''p''' command then prints each line a second time.<br><br>
# Issue the following linux command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed -n 'p' data.txt</span><br><br>What do you notice?<br><br>You can specify an address (line #, line #s or range of line #s) when using the sed utility.<br><br>
# Issue the following linux command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed -n '1 p' data.txt</span><br><br>You should see the first line of the text file displayed.<br><br>
# Issue the following linux command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed -n '2,5 p' data.txt</span><br><br>What do you think is displayed? (In another SSH session, compare with the contents of the data.txt text file to confirm.)<br><br>The '''s''' command is used to substitute patterns (similar to the method demonstrated in the vi editor).<br><br>
# Issue the following linux command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed '2,5 s/TUTORIAL/LESSON/g' data.txt</span><br><br>What do you notice? View the original contents of lines 2 to 5 in the '''data.txt''' file in another shell to confirm that the substitution occurred.<br><br>The '''q''' command terminates or '''quits''' the execution of the sed utility as soon as it reads a particular line or matching pattern.<br><br>
# Issue the following linux command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed '11 q' data.txt</span><br><br>What did you notice?<br><br>You can use regular expressions to select lines that match a pattern. The rules remain the same for using regular expressions as demonstrated in lab8 except the regular expression must be contained within delimiters such as the forward slash "/" when using the sed utility.<br><br>
# Issue the following linux command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed -n '/^The/ p' data.txt</span><br><br>What do you notice?<br><br>
# Issue the following linux pipeline command:<br><span style="color:blue;font-weight:bold;font-family:courier;">who | sed -n '/^[a-m]/ p' | more</span><br><br>What did you notice?<br><br>
:In the next investigation, you will learn how to manipulate text using the '''awk''' utility.<br><br>
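The sed commands from this investigation can also be compared side by side. This is a minimal sketch, assuming a five-line sample file created on the spot; its contents and the variable names are hypothetical and do not reproduce data.txt.

```shell
# Hypothetical five-line sample (NOT the tutorial's data.txt)
sample=$(mktemp)
printf '%s\n' \
  'The first line' \
  'TUTORIAL two' \
  'TUTORIAL three' \
  'The fourth line' \
  'last line' > "$sample"

# Without -n, every line is printed twice: count the output lines
doubled=$(sed 'p' "$sample" | awk 'END {print NR}')

# With -n, only the addressed lines (2 to 3) are printed
range=$(sed -n '2,3 p' "$sample")

# s command applied only to lines 2 to 3; all lines are still displayed
swapped=$(sed '2,3 s/TUTORIAL/LESSON/g' "$sample")

# q command: stop after line 2 has been read and displayed
head2=$(sed '2 q' "$sample")

# Regular expression between slashes selects lines beginning with "The"
the_lines=$(sed -n '/^The/ p' "$sample")

printf '%s\n' "$doubled" "$range" "$swapped" "$head2" "$the_lines"
rm -f "$sample"
```

Each result is stored in a variable so the outputs can be inspected one at a time, for example with <code>echo "$range"</code>.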
=INVESTIGATION 2: USING THE AWK UTILITY =
# Issue a command to '''confirm''' you are located in your home directory.<br><br>
# Issue the following linux command ('''copy and paste''' to save time):<br><span style="color:blue;font-weight:bold;font-family:courier;">wget <nowiki>https://ict.senecacollege.ca/~murray.saul/uli101/cars.txt</nowiki></span><br><br>
# Issue the '''more''' command to quickly view the contents of the '''cars.txt''' file.<br>When finished, exit the more command by pressing the letter <span style="color:blue;font-weight:bold;font-family:courier;">q</span><br><br>The "'''print'''" action (command) is the <u>default</u> action of awk: it prints all selected lines that match a pattern.<br>This action (contained in braces) can provide more options such as printing specific fields of selected lines (or records) from a database.<br><br>
# Issue the following linux command to display all records in the "cars.txt" database that contain the make "ford":<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '/ford/ {print}' cars.txt</span><br><br>
# Issue the following linux command, which omits the print action, to display the same records:<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '/ford/' cars.txt</span><br><br>What do you notice?<br><br>You can use variables with the "print" action for further processing. We will discuss the following variables in this tutorial:<br><br>'''$0''' - Current record (entire line)<br>'''$1''' - First field in record<br>'''$n''' - nth field in record<br>'''NR''' - Record Number (order in database)<br>'''NF''' - Number of fields in current record<br><br>For a listing of more variables, please consult your course notes.<br><br>The '''tilde character''' ('''~''') is used to test whether a particular field matches a pattern.<br><br>
# Issue the following linux command to display the model, year, quantity and price in the "cars.txt" database for makes of "chevy":<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '/chevy/ {print $2,$3,$4,$5}' cars.txt</span><br><br>Notice that a space " " is the delimiter for the fields that appear as standard output.<br><br>
# Issue the following linux command to display all plymouths (plyms) by model name, price and quantity:<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '/plym/ {print $2,$5,$4}' cars.txt</span><br><br>You can also use comparison operators to specify conditions for processing with matched patterns when using the awk command. Since they are used WITHIN the awk expression, they are not confused with redirection symbols.<br><br>Comparison Operators:<br><br>'''<''' &nbsp;&nbsp;&nbsp;&nbsp;Less than<br>'''<=''' &nbsp;&nbsp;Less than or equal<br>'''>''' &nbsp;&nbsp;&nbsp;&nbsp;Greater than<br>'''>=''' &nbsp;&nbsp;Greater than or equal<br>'''==''' &nbsp;&nbsp;Equal<br>'''!=''' &nbsp;&nbsp;&nbsp;Not equal<br><br>
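The comparison operators and the variables listed earlier can be combined in a single expression. The following is a minimal sketch, assuming a space-delimited sample whose fields follow the order described above (make, model, year, quantity, price); the four records and the variable names are invented for illustration and are not taken from cars.txt.

```shell
# Hypothetical sample: make model year quantity price (NOT the real cars.txt)
cars=$(mktemp)
printf '%s\n' \
  'plym fury 77 73 2500' \
  'chevy nova 79 60 3000' \
  'ford mustang 65 45 10000' \
  'chevy nova 80 50 3500' > "$cars"

# Numeric comparison on the 5th field (price); print make, model and price
cheap=$(awk '$5 < 3500 {print $1,$2,$5}' "$cars")

# String comparison (==) combined with a numeric comparison (>=);
# NR is the record number, $0 is the entire record
newer_chevy=$(awk '$1 == "chevy" && $3 >= 80 {print NR, $0}' "$cars")

# NF is the number of fields in the current record
fields=$(awk 'NR == 1 {print NF}' "$cars")

printf '%s\n' "$cheap" "$newer_chevy" "$fields"
rm -f "$cars"
```

Because the whole awk expression sits inside single quotes, the shell never treats <code>&lt;</code> or <code>&gt;</code> as redirection symbols here.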
# <span style="font-family:courier;font-weight:bold">sed -n '3,6 p' ~murray.saul/uli101/stuff.txt</span><br><br>
# <span style="font-family:courier;font-weight:bold">sed '4 q' ~murray.saul/uli101/stuff.txt</span><br><br>
# <span style="font-family:courier;font-weight:bold">sed '/the/ d' ~murray.saul/uli101/stuff.txt</span><br><br>
# <span style="font-family:courier;font-weight:bold">sed 's/line/NUMBER/g' ~murray.saul/uli101/stuff.txt</span>
# <span style="font-family:courier;font-weight:bold">awk 'NR == 3 {print}' ~murray.saul/uli101/stuff.txt</span><br><br>
# <span style="font-family:courier;font-weight:bold">awk 'NR >= 2 && NR <= 5 {print}' ~murray.saul/uli101/stuff.txt</span><br><br>
# <span style="font-family:courier;font-weight:bold">awk '$1 ~ /This/ {print $2}' ~murray.saul/uli101/stuff.txt</span><br><br>
# <span style="font-family:courier;font-weight:bold">awk '$1 ~ /This/ {print $3,$2}' ~murray.saul/uli101/stuff.txt</span><br><br>