M42 - Scripting: Files & Text Processing
Use redirection, pipes, and a few core text tools to inspect, filter, reshape, and save command output.
Why Text Processing Matters to Scripting
Scripts rarely stop at “run one command.”
Most real scripts need to do at least one of these:
- save output to a file
- filter output down to the lines that matter
- extract one field from a larger result
- pass the result into the next command
That is why redirection and pipelines matter so much. They are the plumbing of automation.
1. Standard Output, Error, and Redirection
Most command-line programs use at least two output streams:
- stdout for normal results
- stderr for errors and warnings
You can redirect those streams:
echo "study session" > notes.txt
echo "second line" >> notes.txt
./script.sh > run.log 2> errors.log
./script.sh > all-output.log 2>&1
The key distinctions:
- > writes a new file or overwrites an existing one
- >> appends to the end of a file
- 2> sends only errors to a separate file
Use these carefully. Overwriting the wrong file is a common beginner mistake.
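The difference is easy to see with throwaway files. This sketch (file names like demo.txt are just examples) also shows that when combining streams, the order of redirections matters: stdout must be pointed at the file before stderr is pointed at stdout.

```shell
# Work in a temporary directory so nothing important gets overwritten.
cd "$(mktemp -d)"

echo "first" > demo.txt     # > creates the file
echo "second" > demo.txt    # > overwrites: the file now holds only "second"
echo "third" >> demo.txt    # >> appends: the file now holds "second" then "third"
cat demo.txt

# Capture both streams: redirect stdout to the file first, then send
# stderr to wherever stdout currently points.
ls demo.txt missing-file > demo.log 2>&1
cat demo.log                # contains both the listing and the error
```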
2. The Pipe Connects One Command to the Next
A pipe sends the output of one command directly into another.
Get-Process | Sort-Object CPU -Descending | Select-Object -First 5 Name, CPU
cat /etc/passwd | grep "bash" | wc -l
The idea is the same on both platforms:
- produce data
- narrow it
- present or count the result
In PowerShell, you often pass objects. In Bash, you often pass text. But the study habit is the same: break a problem into smaller steps.
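As a small aside, some tools can fold a pipeline stage into a flag. Here is the same produce–narrow–count pattern against a disposable sample file (users.txt is invented for the sketch; on a real system you would point these commands at /etc/passwd):

```shell
# Sample data so the commands are reproducible anywhere.
cat > users.txt <<'EOF'
root:/bin/bash
daemon:/usr/sbin/nologin
alice:/bin/bash
EOF

cat users.txt | grep "bash" | wc -l   # produce, narrow, count -> 2
grep -c "bash" users.txt              # grep can count matches itself
```

Both forms are fine; the three-stage version makes each step visible, which is exactly the habit this module is building.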
3. Core Text Tools and Their Roles
You do not need to master every text-processing utility at once. Start with the role of each tool.
- grep / Select-String: keep only matching lines
- awk: pull out columns or fields
- sed / -replace: change text
- tee: save a copy while still sending data onward
Examples:
Get-Content .\app.log | Select-String "ERROR"
(Get-Content .\config.txt) -replace "debug", "info" | Set-Content .\config-updated.txt
grep "ERROR" app.log
awk '{print $1, $3}' requests.log
sed 's/debug/info/g' config.txt
cat access.log | grep "POST" | tee post-requests.log | wc -l
You can learn these as building blocks instead of treating them as a wall of syntax.
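To see each role in isolation, here is a self-contained sketch against an invented sample log (requests.log and its contents are made up for illustration):

```shell
# Hypothetical sample data so each tool's output is predictable.
cat > requests.log <<'EOF'
2024-05-01 GET /index.html 200
2024-05-01 POST /login 302
2024-05-02 POST /upload 500
EOF

grep "POST" requests.log            # keep only matching lines
awk '{print $1, $3}' requests.log   # pull out the date and path columns
sed 's/POST/post/g' requests.log    # change text (prints the result; the file is untouched)
grep "POST" requests.log | tee posts.log | wc -l   # save a copy and count it -> 2
```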
4. A Sensible Order for Building Pipelines
When a command chain gets longer, work in this order:
- run the source command alone
- confirm what the output looks like
- add one filter
- add one reshape step
- only then save or reuse the result
That habit keeps you from writing long pipelines you do not actually understand.
Keep the Pipeline Explainable
If you cannot explain what each stage contributes, the pipeline is too complex for the moment. Break it apart, inspect the intermediate output, and then rebuild it one step at a time.
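The step-by-step order above can be sketched concretely. The access log here is invented for the example; each command is the previous one with exactly one stage added:

```shell
# Disposable sample data for the walkthrough.
cat > access.log <<'EOF'
GET /home 200
POST /login 200
POST /login 401
GET /about 200
POST /upload 200
EOF

# 1. Run the source command alone and look at the output.
cat access.log

# 2. Add one filter.
grep "POST" access.log

# 3. Add one reshape step: keep only the path column.
grep "POST" access.log | awk '{print $2}'

# 4. Only then save or reuse the result.
grep "POST" access.log | awk '{print $2}' | sort -u > post-paths.txt
cat post-paths.txt
```

At every stage you can stop and inspect, so nothing in the final pipeline is a mystery.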
5. Text Processing Supports Real Troubleshooting
These tools matter because operating systems produce a lot of output:
- log files
- process listings
- file lists
- service state
- command results
Scripting becomes much more useful once you can turn that raw output into a short answer you can read, save, or use in the next step.
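As one sketch of that idea, the classic sort | uniq -c pattern turns a noisy log into a one-glance summary (app.log and its messages are invented here):

```shell
# Hypothetical log data for the example.
cat > app.log <<'EOF'
INFO  starting up
ERROR disk full
WARN  retrying
ERROR disk full
ERROR timeout
EOF

# Count each distinct error message, most frequent first.
grep "^ERROR" app.log | sort | uniq -c | sort -rn
```

Three lines of error noise collapse into a short, readable answer: which errors occurred, and how often.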
What You Just Learned
- Redirection changes where command output and errors go.
- Pipes let one command feed another directly.
- grep, awk, sed, and tee each solve a different text-processing job.
- PowerShell usually pipes objects; Linux tools usually pipe text.
- The safest way to build pipelines is one visible step at a time.
Next, you will use these skills to schedule tasks and automate small recurring jobs.