Platform Engineering · DevOps · DX · May 2025 · 8 min read

When Internal Tooling Becomes a Barrier: A DevOps Mindset Problem

A simple DNS change turned into a multi-hour process. This isn't just a tooling issue — it's a mindset issue. How we build internal tools matters as much as how we build external products.

The Reddit comment that started this

I came across a comment describing a team's process for making a simple DNS change: commit to GitHub, manually copy the PR number into Jenkins, manually trigger an apply pipeline, use custom CLI tooling for approvals, and deal with undocumented linting formats. What should have been a 5-minute task had become a multi-hour ordeal, which is especially painful for non-developer engineers who just need to update a record.

This isn't an edge case. I've seen versions of this at multiple organisations. And every time, the people who built the tooling genuinely believed they were making things better.

The real problem: we optimise for builders, not users

When a platform team builds internal tooling, they're usually solving for correctness and control — making sure changes are reviewed, auditable, and safe. These are valid goals. But somewhere in the process, the user experience of the person making the change gets deprioritised until it's an afterthought.

The result is tooling that's technically sound but practically unusable. Engineers route around it. They find the backdoor, the manual override, the "quick way" that bypasses the process entirely. And now you have neither safety nor usability.

The irony: Overly complex internal tooling often produces worse security outcomes than simpler tooling — because people bypass what they can't use.

What the DNS change example actually reveals

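Let's break down what went wrong in that Reddit example: the PR number had to be copied by hand from GitHub into Jenkins, the apply pipeline had to be triggered manually, approvals ran through custom CLI tooling, and the linting format was undocumented.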

Each of these issues in isolation is minor. Together they create a system where making a DNS change requires expertise that has nothing to do with DNS.

The mindset shift: internal tools are products

The engineering teams using your platform are your users. They have the same right to a good user experience as your external customers. "Good enough for internal use" is not a standard — it's a symptom of deprioritised platform work.

Concretely, this means:

How to diagnose your own tooling

Ask someone who didn't build the tool to complete a routine operation — a DNS change, a new service deployment, an environment variable update — while you watch without helping. Don't intervene. Don't explain. Just observe where they get stuck.

Every point of confusion is a UX bug. Treat it like one.

Practical improvements that actually work

1. Collapse multi-system workflows into one trigger

# Bad: manual steps across three systems
# 1. Open PR in GitHub
# 2. Copy PR number
# 3. Paste into Jenkins job parameter
# 4. Click "Build Now"

# Good: one event triggers the whole workflow
# PR opened → GitHub webhook → Jenkins pipeline auto-triggers
# PR number passed automatically via ${{ github.event.pull_request.number }}
# (GitHub Actions syntax; in a raw webhook payload it is the top-level "number" field)
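
As a rough illustration of the "one event triggers the whole workflow" idea, here is a minimal Python sketch of the receiving side: it verifies a GitHub webhook delivery and turns a pull_request event into build parameters, so nobody copies a PR number by hand. The function names and the PR_NUMBER and REPO parameter names are hypothetical, and the actual call that starts the Jenkins job is left out because it depends on how your Jenkins is set up.

```python
import hashlib
import hmac
import json
from typing import Optional


def signature_is_valid(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check GitHub's X-Hub-Signature-256 header (HMAC-SHA256 of the raw body)."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(expected, signature_header)


def jenkins_params_from_event(event_name: str, body: bytes) -> Optional[dict]:
    """Return build parameters for a PR event, or None if no build is needed.

    event_name is the value of the X-GitHub-Event request header.
    """
    if event_name != "pull_request":
        return None
    payload = json.loads(body)
    # Only react when a PR is opened, reopened, or receives new commits.
    if payload.get("action") not in {"opened", "reopened", "synchronize"}:
        return None
    return {
        "PR_NUMBER": str(payload["number"]),         # hypothetical parameter name
        "REPO": payload["repository"]["full_name"],  # hypothetical parameter name
    }
```

The point of the design is that the PR number travels with the event. The person making the DNS change opens a PR and nothing else.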

2. Make approval workflows part of the main interface

If your team uses Slack, approvals should happen in Slack. If they live in GitHub, approvals should be PR reviews. Don't introduce a third system for a single step in an otherwise two-system workflow.
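
If approvals live in GitHub, the pipeline can read them straight from the PR instead of asking people to run a separate approval CLI. The sketch below is one possible way to do that, under stated assumptions: it takes review objects shaped like those returned by GitHub's "list reviews for a pull request" REST endpoint (a user.login and a state per review, oldest first) and applies a simple policy. The function name, the required_approvals parameter, and the policy itself are illustrative, not a description of any particular team's rules.

```python
from typing import Dict, Iterable


def is_approved(reviews: Iterable[dict], required_approvals: int = 1) -> bool:
    """Decide approval from PR reviews, counting each reviewer's latest verdict."""
    latest: Dict[str, str] = {}
    for review in reviews:  # assumed to be in chronological order
        state = review["state"]
        # Plain comments do not change a reviewer's earlier verdict.
        if state in {"APPROVED", "CHANGES_REQUESTED", "DISMISSED"}:
            latest[review["user"]["login"]] = state
    verdicts = list(latest.values())
    # Any outstanding "changes requested" blocks the change.
    if "CHANGES_REQUESTED" in verdicts:
        return False
    return verdicts.count("APPROVED") >= required_approvals
```

With a check like this in the pipeline, "approve the change" means "approve the PR", which is something every engineer already knows how to do.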

3. Write runbooks for every non-obvious operation

The test: could a new engineer who joined last week complete this operation using only the runbook? If the answer is no, the runbook is incomplete. Every operation that touches production should have one.

4. Version and document your linting rules

# .github/CONTRIBUTING.md — make the rules visible
## DNS Record Format
Records must follow this format:
  name: subdomain only (no trailing dot)
  type: A | CNAME | MX | TXT
  ttl: integer in seconds (minimum 300)

# Automate enforcement with pre-commit hooks
# so engineers get feedback before pushing, not after
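
To make the enforcement side concrete, here is a minimal Python sketch of a validator for the three rules listed above. It assumes each record has already been parsed into a dict with name, type and ttl keys; the function name and the error wording are hypothetical. It checks only what the documented format states, and it returns messages that say what is wrong, which is the part undocumented linters tend to skip.

```python
from typing import List

ALLOWED_TYPES = {"A", "CNAME", "MX", "TXT"}
MIN_TTL = 300  # seconds, per the documented format


def validate_record(record: dict) -> List[str]:
    """Return human-readable problems; an empty list means the record is valid."""
    errors = []

    name = record.get("name")
    if not isinstance(name, str) or not name:
        errors.append("name: required, subdomain only")
    elif name.endswith("."):
        errors.append(f"name: '{name}' must not end with a trailing dot")

    rtype = record.get("type")
    if not isinstance(rtype, str) or rtype not in ALLOWED_TYPES:
        errors.append(f"type: '{rtype}' must be one of A, CNAME, MX, TXT")

    ttl = record.get("ttl")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(ttl, int) or isinstance(ttl, bool):
        errors.append(f"ttl: '{ttl}' must be an integer number of seconds")
    elif ttl < MIN_TTL:
        errors.append(f"ttl: {ttl} is below the minimum of {MIN_TTL} seconds")

    return errors
```

A pre-commit hook could run this over the changed record files and exit non-zero when any list of errors is non-empty, so the feedback arrives before the push and in the same words as the documentation.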

The question worth asking your team

What's the most frustrating internal tool you use regularly? Not the most complex — the most frustrating. That's your highest-priority platform work. Fix that before adding new features to anything.

Automation should make things easier. If it's making things harder, it's not automation — it's overhead with a pipeline attached.
