Tutorial 11 - SED & AWK

From CDOT Wiki

Revision as of 13:53, 14 November 2021

Content under development

USING SED & AWK UTILITIES

Main Objectives of this Practice Tutorial

  • Use the sed command to manipulate text contained in a file.
  • List and explain several addresses and instructions associated with the sed command.
  • Use the sed command as a filter with Linux pipeline commands.
  • Use the awk command to manipulate text contained in a file.
  • List and explain comparison operators, variables and actions associated with the awk command.
  • Use the awk command as a filter with Linux pipeline commands.

Tutorial Reference Material

Course Notes
Linux Command/Shortcut Reference
Course Notes:


Text Manipulation Commands


KEY CONCEPTS

Using the sed Utility

Usage:

Syntax: sed [-n] 'address instruction' filename


How it Works:

  • The sed command reads the input file one line at a time and applies the expression
    (i.e. the text contained within quotes) to each line.
  • The expression can be within single quotes or double quotes.
  • The expression contains an address (match condition) and an instruction (operation).
  • If a line matches the address, sed performs the instruction on that line.
  • Lines are displayed by default unless the -n option is used to suppress the default display.
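
The default-display behaviour described above can be sketched as follows. This is only an illustration: the three-line sample file and the variable names are invented for this example, and p (print) is used as the instruction.

```shell
# Create a hypothetical three-line sample file (not part of the tutorial).
sample=$(mktemp)
printf '%s\n' 'line one' 'line two' 'line three' > "$sample"

# Without -n, sed displays every line by default, so the p instruction
# makes line 2 appear twice.
default_output=$(sed '2 p' "$sample")

# With -n, default display is suppressed, so only line 2 is shown.
quiet_output=$(sed -n '2 p' "$sample")

echo "$default_output"
echo "---"
echo "$quiet_output"

rm -f "$sample"
```

Comparing the two outputs side by side is a quick way to confirm whether -n is needed for a given instruction.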


Address:

  • Can use a line number, to select a specific line (for example: 5)
  • Can specify a range of line numbers (for example: 5,7)
  • Regular expressions are contained within forward slashes (e.g. /regular-expression/)
  • Can specify a regular expression to select all lines that match a pattern (e.g. /^[0-9].*[0-9]$/ selects lines that begin and end with a digit)
  • If NO address is present, the instruction will apply to ALL lines
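
Each kind of address listed above could be tried out as in the sketch below. The sample data and variable names are hypothetical, chosen only so that every address type selects something different.

```shell
# Hypothetical seven-line sample file invented for this illustration.
sample=$(mktemp)
printf '%s\n' 'alpha' '1 apple 2' 'beta' '3 cherry 4' \
              'gamma' 'delta' 'epsilon' > "$sample"

# Line-number address: display only line 5.
line_five=$(sed -n '5 p' "$sample")

# Range address: display lines 5 through 7.
lines_five_to_seven=$(sed -n '5,7 p' "$sample")

# Regular-expression address: display lines that begin and end with a digit.
digit_lines=$(sed -n '/^[0-9].*[0-9]$/ p' "$sample")

# No address: the instruction applies to every line.
all_lines=$(sed -n 'p' "$sample")

echo "$line_five"
echo "$lines_five_to_seven"
echo "$digit_lines"

rm -f "$sample"
```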


Sed.png

Instruction:

  • The action to take for the matched line(s).
  • Refer to the table on the right side (Sed.png) for a list of some
    common instructions and their purpose.
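
The instruction table itself is not reproduced in this text, so the sketch below simply assumes that d (delete), s (substitute) and q (quit) are among the common instructions it lists. All three are standard sed instructions; the sample data and variable names are hypothetical.

```shell
# Hypothetical sample file invented for this illustration.
sample=$(mktemp)
printf '%s\n' 'red apple' 'green pear' 'red cherry' 'yellow banana' > "$sample"

# d (delete): drop lines matching the address, display the rest.
without_red=$(sed '/^red/ d' "$sample")

# s (substitute): replace the first match of a pattern on each line.
substituted=$(sed 's/red/dark red/' "$sample")

# q (quit): stop processing after line 2 has been displayed.
first_two=$(sed '2 q' "$sample")

echo "$without_red"
echo "$substituted"
echo "$first_two"

rm -f "$sample"
```

Note that none of these commands use -n, since they rely on the default display to show the surviving or modified lines.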



Using the awk Utility

Usage:

awk [-F] 'selection-criteria {action}' file-name


How It Works:

  • The awk command reads the input file one record (line) at a time and applies the expression (contained within quotes) to each record.
  • The expression (contained in quotes) consists of selection criteria and an action to execute, with the action contained within braces {}.
  • If the selection criteria are matched, then the action (between braces) is executed.
  • The -F option can be used to specify the field delimiter (separator) character, replacing the default of whitespace,
    e.g. awk -F";" (would indicate a semicolon-delimited input file).
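
A small sketch of the effect of the -F option, assuming a hypothetical semicolon-delimited file created only for this example:

```shell
# Hypothetical semicolon-delimited sample file (invented for illustration).
sample=$(mktemp)
printf '%s\n' 'apple;red;3' 'pear;green;5' > "$sample"

# Default delimiter is whitespace: each line has no spaces,
# so the whole line is treated as the first field.
default_first=$(awk '{print $1}' "$sample")

# With -F";" the line is split on semicolons instead.
names=$(awk -F";" '{print $1}' "$sample")

echo "$default_first"
echo "$names"

rm -f "$sample"
```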


Selection Criteria

  • You can use a regular expression, enclosed within slashes, as a pattern. For example: /pattern/
  • The ~ operator tests whether a field or variable matches a regular expression. For example: $1 ~ /^[0-9]/
  • The !~ operator tests for no match. For example: $2 !~ /line/
  • You can perform both numeric and string comparisons using relational operators ( > , >= , < , <= , == , != ).
  • You can combine any of the patterns using the Boolean operators || (OR) and && (AND).
  • You can use built-in variables (like NR or "record number" representing line number) with comparison operators.
    For example: NR >= 1 && NR <= 5
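
The selection criteria above might be exercised as in the following sketch. The sample file (a number, a name and a quantity per line) and the variable names are hypothetical.

```shell
# Hypothetical sample file invented for this illustration.
sample=$(mktemp)
printf '%s\n' '1 apple 40' '2 pear 15' 'x line 99' \
              '4 cherry 25' '5 plum 60' '6 fig 5' > "$sample"

# /pattern/: select records containing "pear".
pattern_match=$(awk '/pear/ {print}' "$sample")

# ~ operator: first field must begin with a digit.
numeric_start=$(awk '$1 ~ /^[0-9]/ {print $2}' "$sample")

# !~ operator: second field must NOT match "line".
no_line=$(awk '$2 !~ /line/ {print $1}' "$sample")

# Relational operator combined with && (AND).
combined=$(awk '$3 > 30 && $1 ~ /^[0-9]/ {print $2}' "$sample")

# Built-in variable NR: select the first five records.
first_five=$(awk 'NR >= 1 && NR <= 5 {print $2}' "$sample")

echo "$pattern_match"
echo "$combined"

rm -f "$sample"
```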


Action (execution):

  • Action to be executed is contained within braces {}
  • The print statement can be used to display text (fields).
  • You can use parameters which represent fields within records (lines) within the expression of the awk utility.
  • The parameter $0 represents the entire record (line).
  • The parameters $1, $2, $3 … $9 represent the first, second, third, and so on up to the ninth field contained within the record.
  • Fields beyond the ninth are referenced the same way (for example: $10, $11, $12, etc.). Unlike shell positional parameters, awk field numbers are not placed within braces; ${10} is a syntax error in awk.
  • You can use built-in variables (such as NR or "record number" representing line number)
    e.g. {print NR,$0} (will print record number, then entire record).
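
As a rough illustration of actions and field parameters, the sketch below uses a hypothetical file with eleven whitespace-separated fields per line. It assumes a standard awk, in which the tenth field is written $10.

```shell
# Hypothetical sample file with eleven fields per line (invented for illustration).
sample=$(mktemp)
printf '%s\n' 'a b c d e f g h i j k' 'l m n o p q r s t u v' > "$sample"

# Print the record number followed by the entire record.
numbered=$(awk '{print NR,$0}' "$sample")

# Print the first and third fields of each record.
first_and_third=$(awk '{print $1,$3}' "$sample")

# Print the tenth field of each record.
tenth=$(awk '{print $10}' "$sample")

echo "$numbered"
echo "$first_and_third"
echo "$tenth"

rm -f "$sample"
```

The comma in print separates the items with a single space in the output.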

INVESTIGATION 1: USING THE SED UTILITY

INVESTIGATION 2: USING THE AWK UTILITY

LINUX PRACTICE QUESTIONS