M53 - Troubleshooting Lab: 5 Break Scenarios

Apply structured troubleshooting to realistic OS failures by choosing the first checks, narrowing scope, and proposing a safe next action.

55 min | ADVANCED | BOTH | Curriculum-reviewed

What you should be able to do after this
  • Apply structured troubleshooting to realistic OS failures by choosing the first checks, narrowing scope, and proposing a safe next action.

The Point of This Lab

This lab is not about memorizing one magic command per scenario.

It is about practicing three habits:

  • choose the first checks well
  • narrow the problem before changing things
  • make the next action safe and explainable

For each case, focus on:

  1. what the symptom really tells you
  2. what scope you should test first
  3. which command or observation gives the most useful evidence next

Scenario 1: Name Fails, Network Might Not

Problem: Users cannot reach wiki.corporate.local.

Observed symptom: Browser reports a name-resolution error.

Good first questions:

  • is general network connectivity working?
  • is only the name lookup failing?

Strong first checks:

  • ping 8.8.8.8
  • nslookup wiki.corporate.local or dig wiki.corporate.local

Likely lesson: If raw connectivity works but name lookup fails, the problem is probably DNS rather than the target server itself.
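The two checks in this scenario can be combined into a small decision sketch. This is a minimal shell example under stated assumptions, not a definitive procedure: the function names `classify_name_failure` and `check_name` are made up for illustration, `ping -c` is the Linux form of the count option, ICMP to 8.8.8.8 is assumed not to be blocked by policy, and `nslookup` is assumed to exit non-zero when a lookup fails.

```shell
# Hypothetical helper: turn two exit statuses (0 = success) into a first hypothesis.
#   $1 = status of the raw-IP check, $2 = status of the name lookup
classify_name_failure() {
    if [ "$1" -ne 0 ]; then
        echo "connectivity"   # no raw IP reachability: check link, route, firewall first
    elif [ "$2" -ne 0 ]; then
        echo "dns"            # network works, only the lookup fails
    else
        echo "neither"        # both pass: look at the browser, proxy, or target server
    fi
}

# Hypothetical wrapper that runs the two checks from the scenario.
check_name() {
    ping -c 2 8.8.8.8 >/dev/null 2>&1
    ip_status=$?
    # Assumption: this nslookup build returns non-zero when the name cannot be resolved.
    nslookup "$1" >/dev/null 2>&1
    dns_status=$?
    classify_name_failure "$ip_status" "$dns_status"
}
```

Calling `check_name wiki.corporate.local` would print `dns` when the ping succeeds and the lookup fails. Both checks are read-only, so running them changes nothing on the system.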


Scenario 2: Connection Refused on One Port

Problem: A database host is reachable, but connections to port 3306 are refused.

Good first questions:

  • is the service listening?
  • is the service running?
  • is the refusal local to the service or caused by a filter in front of it?

Strong first checks:

  • ss -tulpn | grep 3306 on Linux, or Get-NetTCPConnection -LocalPort 3306 -State Listen on Windows
  • systemctl status mysql on Linux (the unit may be named mysqld or mariadb), or Get-Service on Windows
  • service logs if the process is stopped or failing

Likely lesson: A refused connection means the host answered, so it usually points to a stopped service or an application that is not listening on that port, not to a general network outage.
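The order of questions in this scenario can be sketched as a short Linux shell routine. It is an illustration only: `listening_on` and `diagnose_refused` are hypothetical names, the column position assumes the usual `ss -tln` layout (local address and port in the fourth column), and the unit name you pass in depends on your distribution.

```shell
# Hypothetical helper: reads `ss -tln` output on stdin and succeeds
# if any listening socket is bound to the given port.
listening_on() {
    awk -v p="$1" '
        NR > 1 { n = split($4, a, ":"); if (a[n] == p) found = 1 }
        END { exit !found }'
}

# Hypothetical read-only walk through the scenario's questions.
#   $1 = TCP port, $2 = systemd unit name
diagnose_refused() {
    if ss -tln | listening_on "$1"; then
        echo "listening on $1: check the bind address or a filter in front of the service"
    elif systemctl is-active --quiet "$2"; then
        echo "$2 is running but not listening on $1: check its configured port"
    else
        echo "$2 is not running: read 'systemctl status $2' and 'journalctl -u $2'"
    fi
}
```

A call such as `diagnose_refused 3306 mysql` only reads state. Restarting the service is deliberately left out so that the logs are read first.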


Scenario 3: System Feels Slow, Not Dead

Problem: A web server responds, but it is extremely slow.

Good first questions:

  • is the bottleneck CPU, memory, or disk?
  • is a background job competing with the main workload?

Strong first checks:

  • top or htop
  • disk and I/O observation tools such as iostat, iotop, or vmstat
  • recent backup, compression, or maintenance activity

Likely lesson: When a system is slow but still responding, inspect resource usage before restarting services or changing configuration.
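One way to turn "CPU, memory, or disk?" into a first hypothesis is to read a few `vmstat` samples. The sketch below is a rough illustration, not a tuning rule: `classify_vmstat` is a hypothetical name, the thresholds (20% I/O wait, 80% busy) are arbitrary assumptions, and the column positions assume the procps layout `r b swpd free buff cache si so bi bo in cs us sy id wa st`.

```shell
# Hypothetical helper: reads `vmstat` output on stdin and classifies the LAST sample.
classify_vmstat() {
    awk '
        # Keep overwriting so that only the most recent numeric line is used.
        $1 ~ /^[0-9]+$/ { si = $7; so = $8; busy = $13 + $14; wa = $16 }
        END {
            if (wa >= 20)          print "disk"     # CPU time spent waiting on I/O
            else if (si + so > 0)  print "memory"   # pages moving to or from swap
            else if (busy >= 80)   print "cpu"      # user plus system time
            else                   print "unclear"
        }'
}
```

A typical use is `vmstat 1 5 | classify_vmstat`. The first `vmstat` line is an average since boot, which is why the sketch looks at the last sample. Treat the answer as a hint about where to look next with top or an I/O tool, not as a verdict.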


Scenario 4: The Fix Did Not Survive Reboot

Problem: A change seemed to work yesterday, but after reboot the system is broken again.

Good first questions:

  • was the fix made in a persistent location?
  • does startup overwrite or regenerate that state?
  • is the service reading the same config you changed?

Strong first checks:

  • confirm the service is running and enabled to start at boot
  • inspect the expected configuration file
  • check whether automation, container recreation, or policy management rewrote the change

Likely lesson: Some fixes disappear because they were made in the wrong layer or the wrong place, not because the change itself was bad.
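The persistence questions in this scenario can be gathered into one read-only report. This is a sketch for a systemd-based Linux host with GNU `stat`. The names `modified_since_boot` and `persistence_report` are hypothetical, and the config path you pass in is whatever file you edited.

```shell
# Hypothetical helper: succeeds if a file's mtime is later than the boot time.
#   $1 = file modification time, $2 = boot time (both in epoch seconds)
modified_since_boot() {
    [ "$1" -gt "$2" ]
}

# Hypothetical read-only report for one unit and the config file you changed.
persistence_report() {
    unit=$1
    conf=$2
    echo "active now:      $(systemctl is-active "$unit")"
    echo "enabled at boot: $(systemctl is-enabled "$unit" 2>/dev/null)"
    boot=$(awk '/^btime/ { print $2 }' /proc/stat)   # Linux boot time, epoch seconds
    mtime=$(stat -c %Y "$conf")                      # GNU stat: mtime, epoch seconds
    if modified_since_boot "$mtime" "$boot"; then
        echo "$conf changed after the last boot: something may have rewritten it"
    else
        echo "$conf unchanged since boot: check whether the service reads this file"
    fi
    systemctl cat "$unit"    # unit file and drop-ins actually in effect
}
```

The timestamp comparison is only a hint. A file rewritten after boot suggests automation or policy management, while an untouched file suggests the service reads a different path or the fix was never written to disk.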


Scenario 5: “No Space Left” Even Though Space Exists

Problem: An application cannot write temporary files, but df -h still shows free space.

Good first questions:

  • is the block space full?
  • is the filesystem out of inodes instead?

Strong first checks:

  • df -h
  • df -i

Likely lesson: Storage problems are not only about gigabytes. Filesystem metadata limits matter too.
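The second check can be made explicit by filtering `df -i` for filesystems that are close to their inode limit. A minimal sketch, assuming the usual `df -i` column order (`Filesystem Inodes IUsed IFree IUse% Mounted on`) and mount points without spaces. `inode_full_mounts` is a hypothetical name and the threshold is your choice.

```shell
# Hypothetical helper: reads `df -i` output on stdin and prints mount points
# whose inode usage is at or above the given percentage.
inode_full_mounts() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                 # "100%" -> "100"
        # Skip filesystems that report "-" instead of a percentage.
        if (use ~ /^[0-9]+$/ && use + 0 >= limit + 0) print $6
    }'
}
```

For example, `df -i | inode_full_mounts 95` lists only the mounts worth a closer look. A safe next step is to find which directory holds the most files before deleting anything.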


How To Use This Lab Well

For each scenario, write down:

  1. the first two checks you would run
  2. what each result would help you confirm or exclude
  3. the safest next action after those checks

If you can explain your sequence clearly, you are building real troubleshooting ability.


What You Just Practiced

  • distinguishing symptoms from causes
  • selecting high-value first checks
  • reducing scope before making changes
  • proposing a next action that tests a hypothesis instead of guessing

This is the right mindset to carry into the intensive CLI and capstone sections that follow.