awk example programs


This is a collection of awk example programs. This page houses the awk programs that are slightly too long for the awk one-liners page.

Print words that are common to two files

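BEGIN { RS = " |\n" }          # change record separator to include spaces
$0 != "" { a[$0] = 1 }         # remember each word, skipping empty records
END {
    # ref is assigned on the command line; read that file word by word
    while ((getline < ref) > 0) {
        if ($0 in a) print
    }
    close(ref)
}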
Usage:
 awk -f script.awk a.txt ref=b.txt
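Every word in b.txt that also occurs in a.txt is printed, one per line; a word that appears several times in b.txt is printed once per occurrence. a.txt can be named on the command line or piped to awk. The script shows three techniques: passing a variable into an awk script (the 'ref' variable is assigned on the command line), reading from a file with getline, and using associative arrays (the indexes into the 'a' array are the words themselves). A multi-character RS is treated as a regular expression by gawk and mawk; POSIX leaves that behavior unspecified.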
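To try this on small inputs, the following sketch creates two sample files and runs the same approach inline rather than with -f. The file contents and the temporary directory are made up for illustration, and an awk that accepts a regular expression in RS (such as gawk or mawk) is assumed.

```shell
# Hypothetical sample data, made up for this illustration.
tmp=$(mktemp -d)
printf 'the quick brown fox\njumps over\n' > "$tmp/a.txt"
printf 'a lazy brown dog\njumps high\n' > "$tmp/b.txt"

# Same approach as the script above: store the words of the first file,
# then read the file named by ref and print the words already seen.
common=$(awk '
BEGIN { RS = " |\n" }
$0 != "" { a[$0] = 1 }
END {
    while ((getline < ref) > 0)
        if ($0 in a) print
    close(ref)
}' "$tmp/a.txt" ref="$tmp/b.txt")

echo "$common"
rm -r "$tmp"
```

With these two files the words printed are brown and jumps, in the order they occur in b.txt.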


Provide a list of words used in a file along with counts

BEGIN { RS = " |\n" }                    # one word per record
$0 != "" { a[$0]++ }                     # count each word, skipping empty records
END { for (i in a) print i ": " a[i] }
Usage:
 awk -f script.awk file.txt
This script lists every word that occurs in file.txt, each followed by the number of times it appears.
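Because 'for (i in a)' visits the array indexes in no particular order, the listing can come out in a different order on different awk implementations. One way to get a stable, ranked listing is to print the count first and pipe the result through sort. The following is only a sketch: the sample text is made up, and the output format (count, then word) differs from the script above.

```shell
# Hypothetical sample text, made up for this illustration.
counts=$(printf 'to be or not\nto be\n' | awk '
BEGIN { RS = " |\n" }
$0 != "" { a[$0]++ }                 # skip empty records
END { for (i in a) print a[i], i }   # count first, so sort can order by it
' | sort -k1,1nr -k2,2)              # highest count first, then alphabetical

echo "$counts"
```

For this input the result is "2 be", "2 to", "1 not", "1 or", one entry per line.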


Extract links from a web page

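BEGIN {
    RS = " |\n|>|<"                          # split on spaces, newlines and angle brackets
    res = "/inet/tcp/0/www.google.com/80"    # gawk network special file
    printf "GET / HTTP/1.0\r\n\r\n" |& res   # printf avoids the extra newline that print appends
    while ((res |& getline) > 0)
        if ($0 ~ /href[ ]?=/) print
    close(res)
}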
Usage:
 awk -f script.awk
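This script requires gawk, since the /inet/tcp special file and the |& operator are gawk extensions. It opens the web page at www.google.com, reads the reply, and prints only the records that contain an "href" attribute. To make that easier, the > and < characters are added to the record separator (RS) so that the returned page is split into smaller pieces. Because a space is also a record separator, an attribute written as href = "..." is split across several records and is not matched.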
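The record-splitting idea can be tried without a network connection by feeding HTML to awk on standard input. The snippet below is a sketch; the HTML fragment is made up, and the regular-expression RS assumes gawk or mawk.

```shell
# Hypothetical HTML fragment, made up for this illustration.
html='<p>See <a href="http://example.com/">this</a> and <a class="x" href="/about">that</a>.</p>'

links=$(printf '%s\n' "$html" | awk '
BEGIN { RS = " |\n|>|<" }      # split on spaces, newlines and angle brackets
$0 ~ /href[ ]?=/ { print }     # keep only the records that contain href=
')

echo "$links"
```

Each matching record is a whole attribute, so this prints href="http://example.com/" and href="/about" on separate lines.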


Send an email using awk

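#!/bin/awk -f
# Requires gawk: /inet/tcp special files and the |& operator are gawk extensions.
BEGIN { text = "" }
NR == 1 { subject = $0 }                  # first input line becomes the subject
{ text = text $0 "\r\n" }                 # every line, including the first, goes in the body
END {
    if (from == "") from = "who@really.knows"
    print "from: " from
    if (to == "") { print "ERROR, \"to\" undefined"; exit 1 }
    print "to: " to
    res = "/inet/tcp/0/mail.somewhere.com/25"
    if ((res |& getline) > 0) print       # server greeting
    printf "HELO whereever.com\r\n" |& res
    if ((res |& getline) > 0) print
    printf "MAIL FROM:<%s>\r\n", from |& res
    if ((res |& getline) > 0) print
    printf "RCPT TO:<%s>\r\n", to |& res
    if ((res |& getline) > 0) print
    printf "DATA\r\n" |& res
    if ((res |& getline) > 0) print       # reply to DATA
    printf "Subject: %s\r\n\r\n", subject |& res   # blank line ends the headers
    printf "%s.\r\n", text |& res         # a lone "." ends the message
    if ((res |& getline) > 0) print
    printf "QUIT\r\n" |& res
    if ((res |& getline) > 0) print
    close(res)
    print "done."
}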
Usage:
 echo 'send me' | ./mail.awk to=who@recipient.com
This assumes the awk program above is saved in a file called 'mail.awk', and is executable (chmod u+x mail.awk).

The script sends whatever is piped into it, using the first line as the subject; that line is then repeated as the first line of the message text.
It opens a connection to an SMTP server at mail.somewhere.com and uses it to send an email to the address given in 'to' (who@recipient.com in the usage above).
The 'from' field can be whatever you like; for more information see RFC 822 and RFC 1123.
Some servers require authentication, which this script does not do (yet).
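
To see what the script hands to the server after DATA without contacting one, the message can be assembled and printed locally. This is a sketch only: the input text is made up, and plain \n is used in place of \r\n so that the output is readable in a terminal.

```shell
# Hypothetical input, made up for this illustration.
payload=$(printf 'send me\nsecond line\n' | awk '
NR == 1 { subject = $0 }                # first line doubles as the subject
{ text = text $0 "\n" }                 # every line, including the first, goes in the body
END {
    printf "Subject: %s\n\n", subject   # blank line separates the header from the body
    printf "%s.\n", text                # a lone "." marks the end of the message
}')

echo "$payload"
```

The output is the Subject line, an empty line, the two input lines, and a final line holding a single dot.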





Copyright James Lyons - 2007 - No reproduction without permission