LAB-TEXT-05 - Sorting & Uniqueness
Use sort, uniq, and wc to organize text, count repeated values, and rank the most common entries.
25 min BEGINNER LINUX Curriculum-reviewed
Prerequisites
Success criteria
- Use sort, uniq, and wc to organize text, count repeated values, and rank the most common entries.
- Repeat the workflow without copy-paste or step-by-step prompting.
Safety notes
- Practice counting and ranking on sample text first so you can see exactly how each command changes the data.
Part A: The Field Guide
What This Lab Is Really About
Once you can filter and extract text, the next useful step is counting patterns.
This lab teaches a small but powerful sequence:
- organize the data with sort
- count repeated lines with uniq -c
- rank the counts with sort -nr
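The three-stage sequence above can be sketched end to end on a throwaway file. The file name items.txt matches the Command Reference below; its contents here are illustrative.

```shell
# Create a small sample file with repeated lines
printf "cat\ndog\ncat\nbird\ncat\n" > items.txt

# 1. sort groups identical lines together
# 2. uniq -c collapses each group into "count value"
# 3. sort -nr ranks those counts, highest first
sort items.txt | uniq -c | sort -nr
# The most frequent value (cat, 3 times) appears on the first line.
```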
Command Reference
sort items.txt
sort items.txt | uniq
sort items.txt | uniq -c
sort items.txt | uniq -c | sort -nr
wc -l items.txt
Part B: The Drill Deck
Terminal required: work inside a disposable practice folder.
G Guided: step by step - type exactly this and compare the result
Exercise G1: Sort Numbers Correctly
- Create a practice folder:
mkdir -p "$HOME/sort_lab"
cd "$HOME/sort_lab"
- Create a number list:
printf "1\n10\n2\n25\n3\n" > numbers.txt
- Compare:
sort numbers.txt
sort -n numbers.txt
- Confirm why numeric sort matters.
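If you want to check your result, the two commands should produce different orders on this file. Plain sort compares character by character, so "10" sorts before "2"; sort -n compares numeric values.

```shell
printf "1\n10\n2\n25\n3\n" > numbers.txt

sort numbers.txt
# 1, 10, 2, 25, 3  -- lexicographic: "10" < "2" because "1" < "2"

sort -n numbers.txt
# 1, 2, 3, 10, 25  -- numeric order
```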
Exercise G2: Count Duplicates
- Create a list with repeated values:
printf "apple\nbanana\napple\ncherry\napple\nbanana\n" > fruits.txt
- Count duplicates the right way:
sort fruits.txt | uniq -c
- Confirm why sorting comes first.
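To compare against your result: uniq only collapses adjacent duplicate lines, which is why the data must be sorted first. On the fruits.txt from this exercise the pipeline produces these counts (leading whitespace before each count may vary by implementation):

```shell
printf "apple\nbanana\napple\ncherry\napple\nbanana\n" > fruits.txt

# Sorting groups the three "apple" lines together so uniq -c
# can count them as one group instead of three separate runs.
sort fruits.txt | uniq -c
#   3 apple
#   2 banana
#   1 cherry
```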
Exercise G3: Rank the Counts
- Add one more sort:
sort fruits.txt | uniq -c | sort -nr
- Confirm that the most common value is now at the top.
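For comparison, the final sort -nr reads the count at the start of each line numerically and reverses the order, so the highest count lands on the first line:

```shell
printf "apple\nbanana\napple\ncherry\napple\nbanana\n" > fruits.txt

# -n: compare the leading counts as numbers; -r: largest first
sort fruits.txt | uniq -c | sort -nr
#   3 apple
#   2 banana
#   1 cherry
```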
S Solo: task described, hints available - figure it out
Exercise S1: Count Total Lines
- Count the total number of fruit entries:
wc -l fruits.txt
- Compare that total to the number of unique fruits you saw after uniq.
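One way to check the comparison (a hint, not the only approach): pipe the deduplicated list back into wc -l and set the two totals side by side.

```shell
printf "apple\nbanana\napple\ncherry\napple\nbanana\n" > fruits.txt

wc -l fruits.txt                # 6 fruits.txt  -- total entries
sort fruits.txt | uniq | wc -l  # 3             -- distinct fruits
```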
Exercise S2: Count Repeated Paths
- Create a small request list:
printf "/\n/login\n/\n/status\n/login\n/\n" > paths.txt
- Use a pipeline to count and rank the repeated paths.
M Mission: real scenario - no hints, combine multiple skills
Mission M1: Find the Top 3 Requested Paths
Create a file named requests.log with this content:
GET /
POST /login
GET /
GET /status
POST /login
GET /
GET /docs
Now build a pipeline that:
- extracts only the path column
- sorts the paths
- counts duplicates
- ranks the counts from highest to lowest
- shows only the top three results
If you can explain what each stage contributes, the mission is complete.