LAB-TEXT-03 - Column Parsing with awk
TXT Text Processing
Column Parsing with awk
Use awk to extract columns, change delimiters, and format structured text into clearer output.
35 min INTERMEDIATE LINUX Curriculum-reviewed
Prerequisites
Success criteria
- Use awk to extract columns, change delimiters, and format structured text into clearer output.
- Repeat the workflow without copy-paste or step-by-step prompting.
Safety notes
- Build and test awk commands on sample data first so you can clearly see which field positions your command depends on.
Part A: The Field Guide
What This Lab Is Really About
grep is good at keeping the right lines.
awk helps when the right line still contains too much information and you need only one or two fields from it.
In this lab, you will use awk to:
- print specific columns
- work with a custom delimiter
- format output into something easier to read
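To make the contrast concrete, here is a minimal sketch (the log line is invented for illustration): grep keeps the whole matching line, while awk trims it down to just the fields you ask for.

```shell
line='10.0.0.5 GET /status'

# grep keeps the entire matching line:
echo "$line" | grep 'GET'
# -> 10.0.0.5 GET /status

# awk prints only the fields you name ($1 = first, $3 = third):
echo "$line" | awk '{print $1, $3}'
# -> 10.0.0.5 /status
```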
Command Reference
awk '{print $1, $3}' requests.log
awk -F ':' '{print $1}' users.txt
awk '{print $NF}' files.txt
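A quick sanity check of each reference command, run here against one-line samples piped in directly (the same data the lab files will contain), so you can see what each form produces before touching real files:

```shell
# Print fields 1 and 3 of space-separated input:
printf '10.0.0.5 GET /status\n' | awk '{print $1, $3}'
# -> 10.0.0.5 /status

# -F ':' splits on colons instead of whitespace:
printf 'alice:/bin/bash\n' | awk -F ':' '{print $1}'
# -> alice

# $NF is the last field, however many fields the line has:
printf 'note one final\n' | awk '{print $NF}'
# -> final
```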
Part B: The Drill Deck
Terminal required: work inside a disposable text-practice folder.
G Guided: Step by step - type exactly this and compare the result
Exercise G1: Extract Columns from Space-Separated Data
- Create a sample file:
mkdir -p "$HOME/awk_lab"
cd "$HOME/awk_lab"
printf "10.0.0.5 GET /status\n10.0.0.8 POST /login\n" > requests.log
- Print the first and third columns:
awk '{print $1, $3}' requests.log
- Confirm that you extracted the IP and path only.
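If your command matched, the output should look like this (shown here as a self-contained check with the same two lines piped in, so it runs from any directory):

```shell
printf '10.0.0.5 GET /status\n10.0.0.8 POST /login\n' | awk '{print $1, $3}'
# Expected output:
# 10.0.0.5 /status
# 10.0.0.8 /login
```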
Exercise G2: Use a Different Delimiter
- Create a colon-separated file:
printf "alice:/bin/bash\nservice:/usr/sbin/nologin\n" > users.txt
- Print only the usernames:
awk -F ':' '{print $1}' users.txt
- Print only the shell field:
awk -F ':' '{print $2}' users.txt
Exercise G3: Use the Last Field
- Create one more sample:
printf "note one final\nlog two result\n" > fields.txt
- Print only the last field from each line:
awk '{print $NF}' fields.txt
S Solo: Task described, hints available - figure it out
Exercise S1: Add Labels to the Output
- Reuse requests.log.
- Format the output into short sentences:
awk '{print "IP", $1, "requested", $3}' requests.log
- Confirm that awk can mix literal text with fields.
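For comparison after you have tried it yourself: quoted strings inside the awk action are printed literally between the fields. The same two lines requests.log holds at this point are piped in here so the snippet runs anywhere:

```shell
printf '10.0.0.5 GET /status\n10.0.0.8 POST /login\n' \
  | awk '{print "IP", $1, "requested", $3}'
# Expected output:
# IP 10.0.0.5 requested /status
# IP 10.0.0.8 requested /login
```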
Exercise S2: Filter and Parse Together
- Add one more line:
echo "10.0.0.9 GET /health" >> requests.log
- Use grep and awk together:
grep "GET" requests.log | awk '{print $1, $3}'
- Confirm that grep narrowed the rows and awk pulled the fields you wanted.
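To check your result: requests.log now has three lines, grep keeps the two GET rows, and awk trims each survivor to IP and path. The file contents are piped in here so the snippet is self-contained:

```shell
printf '10.0.0.5 GET /status\n10.0.0.8 POST /login\n10.0.0.9 GET /health\n' \
  | grep 'GET' | awk '{print $1, $3}'
# Expected output:
# 10.0.0.5 /status
# 10.0.0.9 /health
```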
M Mission: Real scenario - no hints, combine multiple skills
Mission M1: Clean IP Addresses from a Connection List
Create a file named connections.txt with this content:
192.168.1.10:443 ESTABLISHED
10.0.0.8:22 LISTEN
172.16.5.9:8080 ESTABLISHED
Now build a pipeline with two awk stages:
- the first stage should print only the first field
- the second stage should use : as the delimiter and print only the IP address
Your final output should be a clean list of IPs without port numbers.