

Engineering
From DevSecOps to Agentic DevSecOps
By Lumia Labs / On 01 Mar, 2026


DevSecOps was built on a simple premise: integrate security into the way humans build software. But "everyone" now includes AI agents. They write code and merge pull requests. Your security model is still designed for humans. Redesigning security for agents is what we call Agentic DevSecOps. It changes how we think about identity, access control, verification, and accountability.

DevSecOps assumed humans in the loop

The whole point of DevSecOps was to make everyone own security by shifting security left and building it into development, so vulnerabilities got caught before they reached production. All of that assumed a human developer writing the code, understanding the intent, reviewing scan results, and making judgement calls about risk. Developers can use tooling like static analysis to flag potential issues, but ultimately a human evaluates whether a flag is a false positive or a real threat. A reviewer reads the diff and considers the broader implications. Sonar's 2025 survey found that 42% of production code already involves AI, and that number is climbing. Once agents start opening PRs and merging their own code, none of these assumptions hold.

What makes Agentic DevSecOps different

Agentic DevSecOps means redesigning security for a world where AI agents write and ship your code.

Who is the agent, and what can it do?

In traditional DevSecOps, access controls are tied to human identities. When an AI agent opens a PR, whose permissions does it use? What should it be allowed to do? In our experience, most organizations run agents under a developer's personal credentials, which means the agent inherits permissions that were calibrated for a human's judgement, not an AI's. AI agents also pick their own dependencies. Veracode's research found that 45% of AI-generated code contains vulnerabilities.
An agent can introduce a dependency that's technically clean but architecturally wrong, or generate code that mimics a vulnerable pattern without triggering signature-based detection.

Speed breaks the verification model

AI agents can generate and ship code ten to a hundred times faster than humans. A 15-minute security scan works fine when developers push a few times a day. When agents push dozens of changes per hour, that scan becomes either a bottleneck that defeats the purpose of using agents, or gets bypassed "temporarily". Agents leave plenty of trail in commit messages and PR descriptions, but by the time a broken change surfaces in production, the agent has pushed dozens more commits on top of it. Finding the offending commits is a real challenge, even with agents helping.

AI agents fail differently than humans

The threat model for AI-generated code is different. Humans make predictable mistakes: forgotten input validation, copy-pasted insecure patterns, hardcoded credentials, shortcuts under pressure. A reviewer can spot these. AI agents generate code that looks correct, passes basic checks, and is subtly wrong. The code reads well; it just doesn't do what you think it does. New attack vectors are already showing up in agentic workflows:

Prompt injection through code context

An attacker embeds malicious instructions in a codebase comment or issue description. The AI agent reads that context, follows the instructions, and introduces a backdoor that looks like a legitimate code change. Researchers have already demonstrated that LLMs can be manipulated through their input context.

Hallucinated dependencies

An AI agent asked to add a feature might pull in a dependency that doesn't exist yet. If an attacker registers that package name first, the agent helpfully installs the malicious package. Lockfiles and version pinning normally guard against this, but agents that modify lockfiles as part of their workflow bypass these protections.

Zero human eyes

An AI agent writes the code, another reviews it, an automated pipeline deploys it.
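One guard against an all-agent path to production is a merge gate that refuses agent-authored changes unless a human identity appears in the approval chain. A minimal sketch, assuming your pipeline can tell humans from agents via your identity provider (the `Approval` shape and `is_human` flag are illustrative, not any particular platform's API):

```python
from dataclasses import dataclass

@dataclass
class Approval:
    identity: str
    is_human: bool  # resolved from the identity provider, never self-reported

def merge_allowed(approvals: list[Approval], author_is_agent: bool) -> bool:
    """Block the fully automated path: agent-authored changes need
    at least one human approval before they can merge."""
    if not author_is_agent:
        return True  # human-authored changes follow the normal review policy
    return any(a.is_human for a in approvals)

# An agent-authored PR reviewed only by another agent is blocked:
assert not merge_allowed([Approval("review-bot", False)], author_is_agent=True)
# The same PR with one human approver passes:
assert merge_allowed(
    [Approval("review-bot", False), Approval("alice", True)],
    author_is_agent=True,
)
```

The key design choice is that `is_human` comes from the identity system, not from the reviewer's display name, so an agent running under a developer's personal token doesn't count as human review.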
Nobody planned for a fully automated path to production, but the steps chain together into one.

Credential sprawl

AI agents need API keys and service credentials to do their work. An agent that logs its full context, or that includes secrets in a commit message or PR description, can expose credentials in places your secret scanning doesn't cover. The more autonomous the agent, the more credentials it touches.

Improving security

Give agents their own identity

Create dedicated service identities for AI agents with scoped permissions. An agent that writes code shouldn't be able to merge it. An agent that runs tests shouldn't be able to modify the test configuration. In practice, we still find most agents running under a senior dev's personal token with full repo access. Treat them like any other service account: minimal permissions and audited access.

Layer your verification

A single security scan isn't enough. Stack static analysis, semantic analysis, behavioral testing, and anomaly detection on the diff patterns themselves. AI-generated code has detectable patterns; use that to improve verification.

Slow agents down on purpose

Put limits on how fast agents can push changes, and build circuit breakers that pause activity when anomalies appear: unusual dependency additions or changes to security-sensitive files.

Track provenance

The EU AI Act's transparency obligations already cover AI-generated code in regulated industries, and enforcement is coming. Every change should trace back to who (or what) wrote it, what prompted it, what context the agent had, and what review it received. Build the audit trail now.

Enforce human review where it matters

Not every change needs a human reviewer. But changes to authentication, authorization, payment processing, data handling, and infrastructure do. Define your high-risk zones and hold that line, even when it slows things down.

The organizational shift

Agentic DevSecOps is an organizational problem as much as a technical one.
Security teams need to understand how agents fail. Dev teams should treat agents like a new hire: set guardrails and supervise the output. Platform infrastructure has to account for non-human participants in the pipeline. Organizations that get this right can deploy agents aggressively, because they've built the controls to match. The alternative is bolting agents onto pipelines designed for humans and patching gaps after each incident.

Lumia Labs helps organizations build secure engineering practices for AI-augmented teams. If you're deploying AI agents in your development pipeline and want to get security right, we'd like to hear from you.

Engineering
Who's Accountable for Your AI Agents?
By Lumia Labs / On 13 Feb, 2026


In 2022, a customer asked Air Canada's website chatbot about bereavement fares. The chatbot confidently told him to book a full-price ticket and apply for a partial refund within 90 days. That was wrong: Air Canada's actual policy requires requesting the discount before booking. The customer spent over $1,500 CAD on flights he wouldn't have booked at full price. When he applied for the refund, Air Canada denied the claim, then argued the chatbot was "a separate legal entity" and the company wasn't responsible for its statements. A British Columbia tribunal disagreed and ordered Air Canada to pay damages. But "the AI said it, not us" is the defense organizations reach for first. Accountability for agentic AI is a genuinely hard problem.

The shift McKinsey is describing

McKinsey's 2025 research on the agentic organization frames AI as the largest organizational paradigm shift since the Industrial Revolution. Their model envisions "flat networks of hybrid agentic teams" with "real-time, embedded governance and agentic controls with human accountability." That last phrase does a lot of heavy lifting. It assumes organizations will figure out how to keep humans accountable for systems that act autonomously. Most haven't. A tool does what you tell it; an agent decides what to do. When a developer uses an AI coding assistant, the developer reviews the output and takes responsibility. When an AI agent autonomously processes claims, triages support tickets, or adjusts pricing, accountability blurs.

Accountability challenges

Nobody designs an unaccountable AI system on purpose. It happens through gaps that individually seem manageable.

Diffused ownership

Multiple teams contribute to an agent's behavior: the ML team trains the model, the platform team deploys it, the product team defines the rules, the data team manages the inputs. When something goes wrong, each team owns a piece but nobody owns the outcome. Braham and van Hees call this the problem of many hands.
The more people involved in a decision, the less any individual feels responsible for the result.

Opacity of reasoning

When an AI agent makes a decision, even the people who built it often can't explain why. The European Union recognized this in the EU AI Act, which requires high-risk AI systems to allow human oversight and provide explanations for their decisions. The regulation is ahead of actual capabilities across industries: you can't comply your way out of a black box.

Speed exceeds oversight

AI agents operate at machine speed. A human approval step that adds thirty seconds sounds fine, until agents scale up and the humans in the loop can't keep up. Organizations face a trade-off: slow the agent down enough for human review, or let it run fast. Fast often wins, because fast is cheaper.

Organizational inertia

Even when teams recognize these problems, existing structures resist change. Governance committees move far slower than developers can ship new AI agents. The org chart wasn't built for systems that cross every departmental boundary simultaneously.

Autonomy without accountability is liability

Consider the Boeing 737 MAX. The MCAS system made autonomous decisions about flight control, and Boeing didn't adequately inform pilots about its behavior. When the system encountered situations its designers hadn't anticipated, 346 people died. Subsequent investigations revealed diffused accountability across the board: engineers, managers, and regulators all shared responsibility, which meant nobody felt fully responsible. AI agents are already making decisions about credit, healthcare triage, hiring, and content moderation. In the Netherlands, a tax authority algorithm wrongly accused over 26,000 families of fraud. Thousands faced financial ruin. The entire cabinet resigned. The consequences don't need to look like a plane crash to be devastating.
Stanford's Human-Centered AI Institute maintains a collection of policy resources documenting how organizations deploying AI systems consistently underestimate the governance they need. The technology moves fast, governance moves slow, and harm happens in between.

What we think organizations should do

Governance before autonomy costs money: more people, slower release cycles, developer time. Not spending that money is the bigger risk.

Assign outcome owners, not component owners

Every AI agent needs a single person accountable for what it does in production: one owner for outcomes, not one per component. This person needs the authority to shut the agent down when something goes wrong.

Build observability before autonomy

You wouldn't deploy a critical service without monitoring and alerting; that discipline is what DevOps is all about. AI agents need the same treatment: logging of decisions, monitoring, and automated alerts when behavior deviates from expectations. Without observability, governance is guesswork.

Define your model explicitly

Decide upfront whether humans review decisions before they happen (human-in-the-loop), after they happen (human-on-the-loop), or only when anomalies are detected (human-over-the-loop). Each model has a different risk profile. If you don't choose, you get human-over-the-loop by default, which can be too late.

Design for explainability

When building AI agents, include decision logging and reasoning traces as core requirements.

Run pre-mortems

Before deploying an agent, ask: "If this agent causes harm, who is accountable, and how will we know?" If nobody can answer clearly, the agent isn't ready for production.

The governance gap is a leadership problem

Organizational structures to govern AI agents lag behind the technology. The "real-time, embedded governance" that McKinsey's research envisions is the right destination, but getting there requires deliberate work on accountability structures, oversight models, and organizational culture.
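The human-in/on/over-the-loop choice can be made explicit in code rather than left implicit, with the default set to the strictest model instead of the loosest. A minimal sketch; the action names, policy table, and thresholds are illustrative assumptions:

```python
from enum import Enum

class Oversight(Enum):
    IN_THE_LOOP = "human approves before the action runs"
    ON_THE_LOOP = "human reviews after the action runs"
    OVER_THE_LOOP = "human is alerted only on anomalies"

# Illustrative policy: route each action class by its blast radius.
POLICY = {
    "refund_over_1000": Oversight.IN_THE_LOOP,
    "refund_under_1000": Oversight.ON_THE_LOOP,
    "faq_answer": Oversight.OVER_THE_LOOP,
}

def oversight_for(action: str) -> Oversight:
    """Anything not explicitly classified gets the strictest model,
    so 'we forgot to decide' never silently means 'no oversight'."""
    return POLICY.get(action, Oversight.IN_THE_LOOP)

assert oversight_for("faq_answer") is Oversight.OVER_THE_LOOP
assert oversight_for("wire_transfer") is Oversight.IN_THE_LOOP  # unclassified
```

Inverting the default is the point: the failure mode described above, where not choosing quietly yields human-over-the-loop, becomes impossible because an unclassified action blocks on human approval instead.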
The organizations that figure this out first will build the trust, internal and external, that lets them deploy AI agents more ambitiously. That trust comes at a cost: dedicated governance roles and engineering effort spent on observability instead of features. But sustained innovation runs on accountability. Without it, ambition becomes liability.

Lumia Labs partners with organizations building governance and engineering practices for AI agents. If you're working through how to deploy AI autonomy responsibly, let's talk.