Post #014 — Network Infrastructure

OPNsense Back Online.

What Failed, Why It Was Harder Than It Should Have Been, and What I Have Now That I Didn't Before

The firewall is back. The network is under control. And the actual culprit turned out to be something embarrassingly simple — which is exactly why it took so long to find.

This post documents what happened during the failed first deployment, why isolating the real problem was harder than it sounds, and what the network looks like now that OPNsense is running properly.

Why This Was Hard to Diagnose.

THE CONTEXT

Troubleshooting a network while people are actively using it is a completely different kind of challenge. Every change you make has immediate consequences. Every minute without internet is someone's phone not loading, a payment terminal going offline, a business interruption.

This property runs two active businesses across approximately 1,250 m². During working hours, the network isn't an experiment — it's infrastructure. That means real changes, failovers, and diagnostics have to happen under pressure, quickly, with no margin to methodically isolate one variable at a time.

What looked like a complex firewall misconfiguration turned out to be three separate, unrelated problems that happened to surface at the same time — making the behavior appear random when it wasn't.

Three Problems. One Chaotic Result.

WHAT ACTUALLY FAILED

ISP Hardware

Two of the four LAN ports on the ISP equipment were not functioning correctly — likely dirty contacts from years of disuse. Moving to a working port fixed the instability immediately.

Faulty RJ45

One of the terminations had a weak electrical connection. It was not visually obvious, and it caused intermittent drops that looked like configuration failures.

DHCP Conflict

Multiple devices competing for IP assignments during the transition. Not a hardware failure — a logical conflict that cleared once the other two issues were resolved.

None of these would have been catastrophic on their own. Together, layered on top of each other, they produced behavior that appeared completely random. That's the part that cost time — not the fix, but the diagnosis.
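One way to make a conflict like the third one visible is to group observed address claims by IP and flag any IP announced by more than one MAC. The sketch below is illustrative only: the function name, the sample addresses, and the observations list are hypothetical, and in practice the pairs would come from whatever ARP or lease data is available.

```python
from collections import defaultdict

def find_ip_conflicts(observations):
    """Return {ip: sorted MACs} for every IP claimed by more than one MAC."""
    claims = defaultdict(set)
    for ip, mac in observations:
        claims[ip].add(mac.lower())  # normalize case so duplicates collapse
    return {ip: sorted(macs) for ip, macs in claims.items() if len(macs) > 1}

# Hypothetical sample data, not taken from the real network.
observed = [
    ("192.168.1.20", "AA:BB:CC:00:00:01"),
    ("192.168.1.21", "AA:BB:CC:00:00:02"),
    ("192.168.1.20", "AA:BB:CC:00:00:03"),
]
conflicts = find_ip_conflicts(observed)
print(conflicts)
```

A check like this only shows that a conflict exists. It says nothing about a bad port or a weak termination underneath, which is why the physical layer had to be ruled out first.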

OPNsense wasn't the problem. It never was.

4 AM.

THE PART NOBODY SEES

There's only one window where this kind of work is possible: before anyone wakes up.

I set my alarm for 4 AM. Not because I had to — because it was the only time the network belonged entirely to me. No businesses open. No phones ringing. No one waiting for a page to load or a payment to process. Just silence, a terminal, and enough time to do the work properly.

I moved cables. I isolated variables. I replaced a termination. I verified every connection before committing to the next one. I did the deploy cleanly, without pressure, without anyone watching.

By 6 AM, everything was behind the firewall.

Nobody noticed. There were no calls, no complaints, no questions. People arrived, opened their devices, and used the network exactly the same way they always had. The day ran normally. The businesses ran normally.

That's what a successful infrastructure change looks like from the outside: nothing. Total invisibility. The work disappears into the background and life continues without interruption.

That's also the part that never shows up in documentation. The alarm at 4 AM. The decision to sacrifice sleep in order to have the calm conditions the work actually requires. The two hours of focused, uninterrupted effort that made the next eight hours completely uneventful for everyone else.

I'm starting to understand that this is what real infrastructure work looks like. Not the deploy — the conditions you create to do it right.

The Part I Didn't Expect.

THE REAL LESSON

I checked hardware I assumed was fine. I cleaned dust out of equipment that had been running ignored for years. I checked contacts, verified cabling, confirmed power. All of this before touching a single configuration file in OPNsense.

The assumption was that the new system broke what was already working. That's the instinct — blame the change. But the network wasn't as healthy as it appeared. It was stable enough that nobody noticed the weaknesses underneath. The moment I started making structural changes, those weaknesses surfaced all at once.

The new system doesn't always create the problem. Sometimes it reveals the problem that was already there.

In the end, what failed was a termination. A weak electrical connection. Possibly something as simple as dirty contacts inside a port that hadn't been used in years. Tiny things. The kind of things nobody thinks about while everything seems to be running smoothly.

What the Network Looks Like Now.

CURRENT STATE

ISP Equipment → OPNsense (OptiPlex) → Switch
↓                                             ↓
All 30+ devices                     NUC · NAS · IMPERFECT

OPNsense sits between the ISP equipment and the rest of the network. Everything passes through it. DHCP is managed internally — every device that matters has a static mapping by MAC address. No more guessing IPs. No more blind reboots.
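The idea of a static mapping by MAC address can be sketched as a small lookup table with a sanity check. This is not OPNsense configuration. The subnet, MAC addresses, and IPs below are placeholders assumed for illustration, and only the host names come from this post.

```python
import ipaddress

# Assumed LAN subnet; the real one is not stated in this post.
LAN = ipaddress.ip_network("192.168.1.0/24")

# Placeholder MAC -> (name, IP) reservations.
STATIC_MAP = {
    "aa:bb:cc:00:00:10": ("NUC", "192.168.1.10"),
    "aa:bb:cc:00:00:11": ("NAS", "192.168.1.11"),
    "aa:bb:cc:00:00:12": ("IMPERFECT", "192.168.1.12"),
}

def validate(static_map, network):
    """Return a list of problems: IPs outside the subnet or reserved twice."""
    problems = []
    seen = {}
    for mac, (name, ip) in static_map.items():
        if ipaddress.ip_address(ip) not in network:
            problems.append(f"{name}: {ip} is outside {network}")
        if ip in seen:
            problems.append(f"{name}: {ip} already reserved for {seen[ip]}")
        seen[ip] = name
    return problems

def lookup(mac):
    """Find the reservation for a MAC, ignoring case and '-' separators."""
    return STATIC_MAP.get(mac.lower().replace("-", ":"))

print(validate(STATIC_MAP, LAN))
print(lookup("AA-BB-CC-00-00-11"))
```

Keeping the table in one place is what removes the guessing: a device's address is looked up rather than rediscovered after every reboot.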

The Backup Plan.

WHAT DIDN'T EXIST BEFORE

Before this deployment, there was no documented recovery procedure. If the network went down, the path forward was improvised each time. That's gone now.

Emergency Network Recovery — Estimated Time: 15–20 min

~2 min

Disconnect OPNsense from the network. Plug a device directly into the ISP equipment.

~3 min

Verify internet access is restored. Confirm the ISP port is functional.

~5 min

Reconnect devices directly to the switch. The network is operational at the ISP level, without the firewall.

~10 min

Diagnose OPNsense separately, without pressure, without anyone waiting for internet access.

That plan didn't exist before. Having it changes the stakes entirely. The firewall going down is no longer an emergency — it's a maintenance window.
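As a quick check on the estimate, the four steps can be written down as data and summed. The step labels paraphrase the procedure above, and the structure itself is only an illustrative sketch.

```python
# Each entry: (approximate minutes, action), as listed in the procedure.
RECOVERY_STEPS = [
    (2, "Disconnect OPNsense; plug a device directly into the ISP equipment"),
    (3, "Verify internet access; confirm the ISP port is functional"),
    (5, "Reconnect devices directly to the switch"),
    (10, "Diagnose OPNsense separately"),
]

total = sum(minutes for minutes, _ in RECOVERY_STEPS)
# Time until the network is usable again, before the diagnosis step starts.
time_to_internet = sum(minutes for minutes, _ in RECOVERY_STEPS[:-1])

elapsed = 0
for minutes, action in RECOVERY_STEPS:
    elapsed += minutes
    print(f"t+{elapsed:>2} min  {action}")
print(f"total ≈ {total} min, connectivity back after ≈ {time_to_internet} min")
```

The per-step figures add up to the top of the 15–20 minute range, and the first three steps, the ones that bring connectivity back, account for only about half of it. The rest is diagnosis time with nobody waiting.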

What's Next.

HONEST STATUS

The network is stable. The firewall is running. The NUC has a permanent IP. The NAS has a permanent IP. IMPERFECT has a permanent IP.

The next phase is Home Assistant — a dedicated machine for automation, smart plugs, NFC access, and door sensors. That project requires its own post. For now, the infrastructure foundation is finally in place.

Working systems still need maintenance. Especially before adding something new to the mix. Because stability can be misleading — sometimes things aren't healthy, they're just getting by quietly until something changes around them.

That realization is worth more than the deployment itself.

My Network Has No Firewall Right Now. Here's Why That's Temporary. — Post #008
Got the Firewall Running Again. — Post #011
A 2014 Mini PC. Two Services. Zero Monthly Fees. — Post #013

This post contains Amazon affiliate links. If you purchase through them, I may earn a small commission at no extra cost to you. Every product listed here is something I personally own and use.
