Deployment Patterns
Architecture for the moment code reaches users — the patterns for traffic shifting, feature exposure, rollback, and cadence that make production changes safe enough to be frequent, and frequent enough to be safe.
A deploy-as-event approach treats deployment as a discrete moment: code is built, code goes to production, code runs. This worked when teams deployed quarterly, when each deploy was a project, and when the architectural decisions all lived inside the source code. It does not work when teams deploy daily — every step in that "discrete moment" must be safe, reversible, observed, and decoupled from end-user impact, and "the moment" stops being a moment and becomes the system's fundamental rhythm.
A deployment-as-architecture approach treats how code reaches users as a first-class concern, with its own design surface. Deployment frequency, traffic-shifting strategy, rollback speed, feature-flag granularity, schema-migration coupling — these are not operational details negotiated at release time. They are properties the architecture either has or lacks, and they determine whether the team ships safely fifty times a day or unsafely once a quarter.
The architectural shift is not "we set up CI/CD." It is: deployment is the moment the architecture's choices become real, and how that moment is engineered determines almost every other property of the system — risk, recovery, cadence, feedback, learning rate.
Deploying code (placing it in production where it can run) and releasing a feature (exposing it to users) are different operations performed at different times by different mechanisms. Conflating them — assuming deploy is release — forces every deploy into a release window, multiplies risk, and slows the team to whatever the riskiest pending change can tolerate. Decoupling is mechanical: feature flags, dark launches, progressive exposure, environment-based gating. The architectural insight is that they ARE separable, and the cost of separating them is paid once.
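A minimal sketch of the mechanism, assuming a hypothetical in-process flag store; real systems use LaunchDarkly, Unleash, or an in-house service, but the shape is the same:

```python
# A release flag separating "code is deployed" from "feature is exposed".
# The flag store, names, and targeting rules are illustrative stand-ins.
from dataclasses import dataclass, field


@dataclass
class ReleaseFlag:
    name: str
    enabled_globally: bool = False
    enabled_customers: set[str] = field(default_factory=set)
    enabled_regions: set[str] = field(default_factory=set)

    def is_on(self, customer_id: str, region: str) -> bool:
        if self.enabled_globally:
            return True
        return customer_id in self.enabled_customers or region in self.enabled_regions


# The new code path ships dark: deployed everywhere, exposed nowhere.
new_checkout = ReleaseFlag("new-checkout-flow")

# Releasing is a config change, not a deploy: expose one region first.
new_checkout.enabled_regions.add("eu-west-1")


def checkout(customer_id: str, region: str) -> str:
    # Turning the feature off again (globally, per customer, per region)
    # is equally cheap: no revert, no redeploy.
    if new_checkout.is_on(customer_id, region):
        return "new checkout flow"
    return "old checkout flow"
```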
Could you turn off your most recently shipped feature for one specific customer or region without redeploying? If the answer is "no, we'd have to revert the code," you have not separated deploy from release — every release decision is paying full deployment cost.
Martin Fowler, Feature Toggles. Note that he distinguishes four toggle categories — release, experiment, ops, permission — each with different lifecycle expectations.
A canary that gets 1% of traffic but is never compared to baseline metrics is a slow rollout, not a canary. The pattern requires automated comparison of error rates, latency, and business metrics between the canary version and the baseline version, with predefined abort criteria and a decision authority. Without these, the only thing the gradual rollout achieves is making failures slower to detect — small blast radius for longer is not the same as small blast radius caught early.
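A minimal sketch of that comparison; the specific thresholds and metric names are assumptions for illustration, not recommendations:

```python
from dataclasses import dataclass


@dataclass
class WindowMetrics:
    error_rate: float       # fraction of requests that failed
    p99_latency_ms: float   # tail latency over the observation window
    conversion_rate: float  # a business metric; regressions here count too


def canary_verdict(baseline: WindowMetrics, canary: WindowMetrics) -> str:
    """One observation window: 'abort' or 'pass'. Criteria are fixed before
    the rollout starts; the controller promotes only after enough
    consecutive passing windows. Humans handle exceptions, not the loop."""
    if canary.error_rate > baseline.error_rate * 1.5 + 0.001:
        return "abort"
    if canary.p99_latency_ms > baseline.p99_latency_ms * 1.2:
        return "abort"
    if canary.conversion_rate < baseline.conversion_rate * 0.95:
        return "abort"
    return "pass"
```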
Pick the most recent canary deployment your team ran. What were the success criteria, and were they evaluated automatically? If the answer is "no errors in the dashboard," that's not statistical comparison — that's wishful checking, dressed in canary clothes.
Google SRE Book, Release Engineering. The release-engineering chapter sets out the operational discipline; the broader book extends it through canary analysis and abort handling.
Every deployment must be reversible. If rollback requires a "rollback project" — coordinating database restoration, cache invalidation, queue replay across teams — there is no rollback, only delayed disaster. The architecture's design decisions determine whether rollback is a button press or a multi-day operation. Rollback speed should match (or beat) deployment speed; if it's slower, the deployment cadence is too aggressive for what the architecture can recover from.
If your most recent deployment had to be rolled back right now — not in an hour, right now — what would happen? If the answer is "we'd need to schedule the rollback," the architecture has not made rollback a first-class capability, and every deploy is taking on unmanaged risk.
Jez Humble & David Farley, Continuous Delivery. The discussion of rollback runs through the entire book; it is treated as a property of the deployment architecture rather than an emergency procedure.
The code side of blue-green is straightforward: run two complete environments, switch traffic between them. The hard problem is everything that has STATE: the database schema both versions must understand, the cached entries both produce, the in-flight messages on the queue, the file storage both reference. Most blue-green failures are state failures, not deployment failures. Pretending otherwise — focusing on the load-balancer flip while glossing over the database migration — is the most common form of theatre in the pattern's name.
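One concrete instance of the state problem is a cache shared by both environments: if blue and green serialize entries differently, each poisons the other's reads during cutover. A sketch of one mitigation, namespacing keys by the version of the cached shape (a plain dict stands in for Redis or memcached; names are illustrative); the cost is a cold cache for the new version, which is usually cheaper than corruption:

```python
CACHE_SCHEMA_VERSION = "v7"  # bumped whenever the cached shape changes


def cache_key(entity: str, entity_id: str) -> str:
    # Blue (v6) and green (v7) read and write disjoint key spaces,
    # so a rollback never reads entries it cannot parse.
    return f"{CACHE_SCHEMA_VERSION}:{entity}:{entity_id}"


cache: dict[str, bytes] = {}
cache[cache_key("user", "42")] = b'{"name": "Ada", "plan": "pro"}'
```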
In a blue-green cutover, what happens to a transaction that started on blue and completes on green? If you don't have a clear answer, the pattern is being practiced on the easy parts only — and the failure mode you'll discover later is the one nobody planned for.
Martin Fowler, Blue-Green Deployment. The original article is short; the practical depth comes from operating the pattern on systems with real state.
Teams that deploy fifty times a day and teams that deploy once a quarter are running fundamentally different architectures, regardless of how similar the source code looks. Cadence is a forcing function. Daily deploys require small changes, fast tests, automated rollback, decoupled release, and granular observability. Quarterly deploys require none of those, which is why they don't have them, which is why they can only deploy quarterly. The lock-in works in both directions, and breaking out of low-cadence requires fixing the architecture, not adding willpower.
What is your team's deployment frequency this week? If you don't have a number — measured, not estimated — cadence is not currently an engineered property, and the team is shipping at whatever rate the system tolerates rather than the rate it could.
DORA Research Program. Deployment Frequency is one of the four key metrics; the relationship between cadence and the other three (lead time, change failure rate, MTTR) is the central finding of the research.
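A minimal sketch of making cadence a measured number rather than an estimate, assuming a deploy log exported from the CD system (the event shape is an assumption):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Deploy:
    at: datetime
    caused_incident: bool  # linked to a rollback or incident in the soak window


def weekly_frequency(deploys: list[Deploy], now: datetime) -> int:
    week_ago = now - timedelta(days=7)
    return sum(1 for d in deploys if d.at >= week_ago)


def change_failure_rate(deploys: list[Deploy]) -> float:
    if not deploys:
        return 0.0
    return sum(d.caused_incident for d in deploys) / len(deploys)
```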
Where the application runs (regions, availability zones, edge locations), how it's reached (load balancers, service mesh, CDN), and how versions coexist (single-version, blue-green, canary, A/B) is not an operational detail discovered at deploy time. It is architecture, owned alongside the code, versioned alongside the code, reviewed alongside the code. Infrastructure-as-code is the mechanism; treating deployment topology as part of the application — not a hand-off to a different team in a different repository — is the principle.
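A sketch of the principle with a hypothetical declarative topology file living in the application repository; in practice this role is played by Terraform, Kubernetes manifests, or CDK stacks versioned alongside the code:

```python
# deploy/topology.py, reviewed in the same pull request as the code it serves
TOPOLOGY = {
    "regions": ["us-east-1", "eu-west-1"],
    "routing": {"load_balancer": "l7", "session_affinity": False},
    "rollout": {"strategy": "canary", "steps_percent": [1, 5, 25, 50, 100]},
    "versions_coexist": True,  # canary implies two live versions in production
}
```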
Pick your most recent deployment. Was the topology configuration in the same review as the code change, or in a separate ticket handled by another team in another repository? If separate, the topology is operationally owned, not architecturally owned — and the gap is where most deployment incidents live.
OpenGitOps Principles sets out the modern statement of declarative, versioned topology. Trunk-Based Development describes the development practice that makes high-cadence deployment topology workable.
A canonical canary deployment architecture: the developer ships a change, CI produces an immutable artifact, two versions of the application coexist in production behind a traffic router, observability captures user-facing signals, and a decision layer (automated where possible) shifts traffic forward or rolls it back based on metric comparison.
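A minimal sketch of that decision layer: traffic shifts forward in fixed steps and aborts automatically when an observation window fails. The step sizes, window length, and hook names (`set_canary_weight`, `read_metrics`) are assumptions about your router and metrics store; `verdict` can be the `canary_verdict` function sketched earlier:

```python
import time

STEPS_PERCENT = [1, 5, 25, 50, 100]
WINDOWS_PER_STEP = 3    # consecutive passing windows required per step
WINDOW_SECONDS = 300


def progressive_rollout(set_canary_weight, read_metrics, verdict) -> str:
    for percent in STEPS_PERCENT:
        set_canary_weight(percent)
        for _ in range(WINDOWS_PER_STEP):
            time.sleep(WINDOW_SECONDS)
            baseline, canary = read_metrics()
            if verdict(baseline, canary) == "abort":
                set_canary_weight(0)  # rollback is a routing change: seconds
                return "aborted"
    return "promoted"  # 100% reached with every window passing
```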
"We don't deploy on Fridays" is presented as wisdom but is usually the symptom of fragile architecture. Teams that can't deploy safely on Friday can't safely deploy on Tuesday either — Tuesday simply has a five-day buffer before the weekend exposes whatever failed.
Treat the inability to deploy on Friday as a signal. The architecture lacks one or more of: fast rollback, observability, automated abort, decoupled release, schema-migration safety. Fix those, and Friday deploys become routine. The goal is to deploy whenever a change is ready, not to schedule risk around the calendar.
Feature flags introduced for a launch and never removed. After eighteen months, the flag system has more configuration paths than the application has user-facing features. Every code path is now conditional on flag state, every test must consider every combination, and the flag system itself has become the most fragile part of the architecture.
Every feature flag has an owner and a removal date at creation time. Lifecycle types matter: release flags die after rollout; experiment flags die at conclusion; ops flags live exactly as long as the operational need they serve; permission flags persist (and should be a different mechanism). When a flag's purpose has expired, removing it is a P1 follow-up, not a backlog item that ages quietly into permanence.
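A minimal sketch of the queryable inventory this implies; field names are assumptions:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FlagRecord:
    name: str
    kind: str               # "release", "experiment", "ops", or "permission"
    owner: str
    remove_by: date | None  # None only for the long-lived kinds


def expired_flags(inventory: list[FlagRecord], today: date) -> list[FlagRecord]:
    # Feed the result into the owning team's queue: an expired flag is a
    # P1 follow-up, not a backlog item.
    return [f for f in inventory if f.remove_by is not None and f.remove_by < today]
```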
Gradual rollout with no automated comparison to baseline. The canary stage is just a delay, the dashboard is glanced at by whoever has time, and "looks fine" is the abort criterion. The pattern's name is being practiced; the pattern's value is not.
Define the success criteria before the canary starts. Baseline metrics come from the previous stable version; comparison is automated; abort is automatic when criteria fail. The human role is exception handling, not primary monitoring.
Pretending you have rollback when you only have a longer redeploy. "Rolling back" by triggering a build of the previous commit takes ten minutes — the same ten minutes the original deploy took. During those ten minutes, the failed version is serving traffic. This is not rollback; it is delay with a recovery label.
Rollback shifts traffic; redeploy rebuilds and re-routes. Keep the previous version warm and addressable until the new version has soaked. The rollback action should be a routing change measured in seconds, not a build measured in minutes.
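A minimal sketch of the distinction, with `router.update` standing in for your load balancer or service mesh API:

```python
def set_weights(router, stable: str, candidate: str, candidate_percent: int) -> None:
    # One write to the routing layer; no build, no artifact, no scheduling.
    router.update({
        stable: 100 - candidate_percent,
        candidate: candidate_percent,
    })


def rollback(router, stable: str, candidate: str) -> None:
    # The previous version is still deployed and warm, so recovery is a
    # routing change measured in seconds, not a rebuild measured in minutes.
    set_weights(router, stable, candidate, candidate_percent=0)
```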
The single biggest reason "Friday deploys are scary" — every deploy includes a schema change that may or may not be reversible. Code rollback is fast; schema rollback is "we need to discuss this in the morning." So schema couples to code, which couples to risk, which couples to scheduling, which becomes the quarterly deploy nobody enjoys.
Schema migrations and code deploys are independently deployable, in either order. Each migration is compatible with both the N-1 and N application versions; rolling back code never requires rolling back schema. The two halves of the change are separate deployments with their own observation windows.
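A sketch of the expand/contract sequence for one illustrative column; the table and column names are examples, and each step is its own deployment with its own observation window:

```python
EXPAND = """
-- Step 1, deployed alone and observed: additive, invisible to version N-1.
ALTER TABLE orders ADD COLUMN currency TEXT NULL DEFAULT 'USD';
"""

# Step 2: deploy application version N, which writes currency but tolerates
# NULLs when reading. N-1 keeps working; it never references the column.
# Rolling back to N-1 requires no schema change.

BACKFILL = """
-- Step 3, run alone and observed: idempotent, rate-limited in practice.
UPDATE orders SET currency = 'USD' WHERE currency IS NULL;
"""

CONTRACT = """
-- Step 4, only after N has soaked and N-1 is retired: enforce the invariant.
ALTER TABLE orders ALTER COLUMN currency SET NOT NULL;
"""
```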
Rollback is a tested capability or it is not a capability. Documented procedures that have never been exercised are fiction the team will read for the first time during an incident, when fiction is the worst thing to be reading.
Without ownership and lifecycle, flags accumulate forever. The flag inventory should be queryable; flags past their removal date should appear in someone's queue, not buried in code that nobody reads.
Manual dashboard checking is not canary analysis. Automated comparison with predefined criteria is what makes a canary a statistical experiment rather than a slow rollout in disguise.
Coupled migrations make every deploy slower, riskier, and harder to roll back. The decoupling cost is paid once; the coupling cost is paid every deploy, forever, in compound interest.
Topology in a separate operational repository owned by another team is not architecture, it is hand-off. The application team should be able to read the topology in their main code review, not file a ticket to learn what production looks like.
The DORA finding is that frequency correlates with everything else worth measuring. If frequency isn't tracked, the team has no objective view of whether the deployment architecture is improving or degrading over time.
For critical paths (payments, identity, search, anything regulated), validating against production load before exposing users is the difference between catching issues in shadow and catching them in incident reviews.
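A minimal sketch of a shadow (dark-launch) path, with illustrative names. The shadow implementation must be side-effect free, or run against a sandbox, so that mirrored payment requests cannot double-charge:

```python
import logging


def handle_payment(request, stable_impl, shadow_impl):
    response = stable_impl(request)        # the only response users ever see
    try:
        shadow = shadow_impl(request)      # exercised by real production load
        if shadow != response:
            logging.warning("shadow divergence: %r vs %r", shadow, response)
    except Exception:
        # Caught here, in shadow, instead of in an incident review.
        logging.exception("shadow path failed")
    return response
```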
DORA's lead-time-for-changes metric. Long lead times mean batched changes; batched changes mean each deploy carries multiple risks; multiple risks mean rare deploys; rare deploys mean even longer lead times. The cycle reinforces itself in either direction.
Automation eliminates the panic decision during incidents. The team's role becomes deciding whether the auto-rollback was correct, not deciding under stress whether to roll back at all.
Manual changes to the deploy pipeline, the CI system, the secret store — each is a category of change that can break the recovery path. Treating the deployment infrastructure as code with the same discipline closes that loophole before it becomes the incident.