
HPR - HPR1501: AWK



#1 BINREV SPYD3R


Posted 04 May 2014 - 07:00 PM

First of all, a correction. In the podcast, I mistakenly refer to one of the coauthors of the language as Kevin Weinberger. My humblest apologies to Mr. Weinberger, whose actual first name is Peter. I also neglected to mention one of AWK's most interesting features: its automatic field splitting. I hope to submit a followup podcast soon in order to rectify these two glaring mistakes.

AWK is a loosely typed interpreted programming language. Many useful functions in a UNIX programming environment, such as reading files, looping over input, matching regular expressions, and splitting strings into fields, have been abstracted and are presented to the programmer as native parts of the language. This makes AWK ideal for text processing.

The basic structure of an AWK program is a list of rules. Each rule is made up of an optional pattern and an optional action. If the pattern is matched, the corresponding action is run. When AWK starts up, it loads the supplied program text, runs any rules with the special BEGIN pattern, then opens in turn each file supplied on the command line (or stdin if no files are given or a - is specified). Each file is split into records based on the value in the RS (record separator) variable. AWK then loops through each record, splits it into fields based on the value in the FS (field separator) variable, and loops through each rule in the program. An empty pattern matches all records, so actions with no pattern run for every record. An omitted action causes the current record to be printed.

The operator most distinctive to AWK is the $ (field access) operator. When followed by an integer literal or a variable holding an integer value, it returns the corresponding field in the current record (counting from 1 up to NF, the special variable holding the number of fields). $0 returns the entire record. If the supplied integer is greater than NF, the result is treated as an uninitialized variable, which in AWK acts as either the empty string or the number 0, depending on the context in which it is referenced.

The most common type of pattern used in AWK (excepting, perhaps, the empty pattern) is a regular expression literal. It consists of a regular expression enclosed in forward slashes. This syntax is inherited from ed, the standard text editor, and has been passed down all the way to JavaScript. In AWK, a regular expression literal, alone as a pattern, is shorthand for $0 ~ /regex/, where ~ is the regular expression match operator (the string $0, the current record, matches the supplied regular expression).

POSIX AWK: http://pubs.opengrou...lities/awk.html
The AWK Programming Language: http://books.cat-v.o...ng_Language.pdf
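
To make the rule structure, field access, and regular expression patterns described in this post a little more concrete, here is a minimal sketch. The three-line sample input (names, ages, roles) is entirely made up for illustration, and the default RS (newline) and FS (whitespace) are assumed.

```shell
# Hypothetical sample input: three whitespace-separated records.
sample='alice 30 admin
bob 25 user
carol 41 admin'

# The BEGIN rule runs once before any input is read. The bare /admin/
# pattern is shorthand for $0 ~ /admin/; $1 and $2 are the first two fields.
printf '%s\n' "$sample" | awk '
BEGIN   { print "name age" }
/admin/ { print $1, $2 }
'

# A pattern with no action prints each matching record unchanged.
printf '%s\n' "$sample" | awk '$2 < 30'

# A field past NF is uninitialized: empty as a string, 0 as a number.
printf '%s\n' "$sample" | awk 'NR == 1 { print NF, "[" $5 "]", $5 + 0 }'
```

The first command should print a header followed by the name and age of the two admin records, the second only the record for bob, and the third shows that reading $5 on a three-field record yields an empty string in string context and 0 in numeric context.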




