tl;dr: Today, we are launching Altimate Code, an open-source agentic data engineering harness that far outperforms generic LLM coding agents, including the leading models, on data engineering tasks. Check out the project on GitHub, or catch the March 25th overview and AMA on YouTube.
AI agents have transformed how software gets built. But for data engineering, they have fallen short…
An AI coding agent on Replit deleted an entire production database during a code freeze, then created 4,000 fake records to fill the empty tables. An independent evaluation of Snowflake's Cortex Analyst found 38% logical accuracy: six out of ten queries were wrong, but compiled and ran just fine. One team got a $5,000 bill from a single Cortex AI query their resource monitors never caught. And 78% of AI-generated SQL errors are silent: wrong joins that compile, run, and return confidently incorrect data.
These aren't edge cases. They're what happens when AI agents operate on data infrastructure without safety layers. No schema validation, no lineage, no cost controls, no permission enforcement. The agent doesn't know your schema. It can't trace what breaks downstream. It doesn't know what a query will cost. And the system prompt telling it "don't drop tables" stops working at 100K context tokens.
The problem isn't the model. It's everything around it.
The missing layer
General-purpose coding agents treat SQL like application code. It isn't. SQL operates on schemas that change, across dialects that diverge on every function name, through lineage chains no LLM can reliably trace, with cost implications that scale to thousands of dollars per mistake.
What data engineering needs is a layer of compiled, deterministic tools that operate outside the LLM's reasoning loop. Tools that validate SQL against your actual schema in 2ms, trace column-level lineage through CTEs deterministically, and catch anti-patterns with zero false positives. Not better prompts. Better engines.
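As a rough illustration of what a compiled, deterministic check can look like, here is a minimal sketch of schema validation with a fix suggestion. The catalog, function names, and string-matching approach are invented for this example; a real engine would resolve tables from your warehouse metadata and a parsed AST.

```python
import difflib

# Hypothetical schema catalog; a real engine would index this
# from the warehouse's information schema.
SCHEMA_CATALOG = {"orders", "customers", "order_items", "payments"}

def validate_tables(referenced_tables, catalog=SCHEMA_CATALOG):
    """Return (bad_table, suggestion) pairs; an empty list means valid."""
    problems = []
    for table in referenced_tables:
        if table not in catalog:
            # Deterministic closest-match lookup for a fix suggestion.
            close = difflib.get_close_matches(table, catalog, n=1)
            problems.append((table, close[0] if close else None))
    return problems

# A typo'd table name is caught locally, with a suggested fix,
# before the query ever reaches the warehouse.
issues = validate_tables(["orders", "custmers"])
```

The check runs entirely outside the LLM's reasoning loop: same input, same answer, every time.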
That's what we built.
Introducing Altimate Code
Today we're open-sourcing Altimate Code, a data engineering harness built on OpenCode, the open-source coding agent. It comes from the team behind dbt Power User, the most widely used dbt VS Code extension.
The core idea: the LLM reasons; compiled engines validate; neither replaces the other.
We ran dbt Labs' ADE-bench, the open standard for measuring AI agents on real data engineering tasks.
A cheaper model with compiled tools outperformed a more expensive model without them. The difference isn't the model. It's the harness. Further details can be found here.
What the harness catches
Your agent writes a query. Before it touches your warehouse:
The wrong table name that would have triggered a 30-second Snowflake error? Caught in 2ms with a fix suggestion. Five agent fix cycles cost 10ms instead of 2.5 minutes of warehouse round-trips.
The cartesian join that would have silently inflated your numbers by 100x? Caught before execution. 26 compiled anti-pattern rules, zero false positives across 1,077 benchmark queries.
The column your downstream dashboard depends on? Traced to its source through every JOIN, CTE, and subquery with 100% edge match on 500 benchmark queries. The agent reasons on verified lineage, not guesses.
The PII in the staging table you're about to expose? Flagged and masked. The SQL injection hiding in a Jinja template? Blocked.
All compiled. All deterministic. All in milliseconds.
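To make the "compiled anti-pattern rule" idea concrete, here is one hypothetical rule in miniature: flagging comma-style joins, a common source of accidental cartesian products. The real rules operate on a parsed AST rather than a regex; this sketch only shows the deterministic, pre-execution shape of the check.

```python
import re

# Illustrative rule: comma-separated tables in FROM can silently
# multiply row counts if a WHERE predicate is missing.
COMMA_JOIN = re.compile(r"\bFROM\s+\w+\s*,\s*\w+", re.IGNORECASE)

def check_cartesian_join(sql: str):
    """Return a warning string if the query uses a comma join, else None."""
    if COMMA_JOIN.search(sql):
        return "possible cartesian join: comma-separated tables in FROM"
    return None

warning = check_cartesian_join("SELECT * FROM orders, customers")
ok = check_cartesian_join(
    "SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id"
)
```

Because the rule is compiled code, not a prompt, it fires identically on query one and query one thousand.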
Making the harness your own
Engines are the foundation. But what makes Altimate Code yours is what you build on top of them.
Persistent memory spans sessions in two scopes: global (your preferences) and project (team knowledge versioned in git). Tell the agent "we never use FLOAT for money columns" and it remembers. Next session, next teammate, the knowledge is there. When one engineer corrects the agent, every teammate inherits the fix on git pull. No Slack message. No wiki update. The correction just propagates.
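The two-scope resolution can be sketched in a few lines. The keys, values, and precedence rule here are illustrative, not Altimate Code's actual storage format; the point is that project knowledge (versioned in git) overrides personal global preferences.

```python
# Hypothetical memory stores: global = personal preferences,
# project = team knowledge checked into the repo.
GLOBAL_MEMORY = {"sql_style": "lowercase_keywords"}
PROJECT_MEMORY = {"money_type": "NUMERIC(18, 2)", "sql_style": "uppercase_keywords"}

def recall(key, project=PROJECT_MEMORY, global_=GLOBAL_MEMORY):
    """Project-scoped knowledge wins; fall back to global preferences."""
    if key in project:
        return project[key]
    return global_.get(key)

rule = recall("money_type")   # team rule: never FLOAT for money columns
style = recall("sql_style")   # project scope overrides the global preference
```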
Governed agent modes enforce permissions at the engine level, not through prompt instructions that models ignore at long context lengths. The Analyst can't INSERT, UPDATE, DELETE, or DROP. Not because of a system prompt, but because the compiled engine won't execute it. The Builder gets full read/write with your SQL rules applied. The Planner maps tasks without executing.
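Engine-level enforcement can be as simple as an allowlist of statement types checked before anything executes. The mode names come from the post; the check logic, statement sets, and function signature below are our own illustrative sketch.

```python
# Hypothetical per-mode allowlists, compiled into the engine rather
# than stated in a system prompt the model might ignore.
MODE_ALLOWED = {
    "analyst": {"SELECT"},
    "builder": {"SELECT", "INSERT", "UPDATE", "DELETE", "CREATE", "DROP"},
    "planner": set(),  # maps tasks, never executes
}

def authorize(mode: str, sql: str) -> bool:
    """Allow execution only if the leading keyword is in the mode's allowlist."""
    statement_type = sql.strip().split(None, 1)[0].upper()
    return statement_type in MODE_ALLOWED[mode]

blocked = authorize("analyst", "DELETE FROM orders")          # engine refuses
allowed = authorize("analyst", "SELECT count(*) FROM orders")
```

The refusal happens regardless of context length, because no prompt is involved in the decision.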
The compactor summarizes long sessions while preserving data engineering state: warehouse connections, schema context, dbt project state, lineage findings. Multi-hour sessions maintain continuity across compaction boundaries.
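A minimal sketch of that compaction shape, with invented field names: older conversation turns collapse to a summary, while the structured data engineering state is carried forward untouched.

```python
# Hypothetical session layout; "state" holds warehouse connections,
# schema context, dbt project state, and lineage findings.
def compact(session, keep_last=2):
    state = session["state"]  # structured state is never summarized away
    old, recent = session["turns"][:-keep_last], session["turns"][-keep_last:]
    summary = f"[summary of {len(old)} earlier turns]"
    return {"state": state, "turns": [summary] + recent}

session = {
    "state": {"warehouse": "snowflake", "dbt_project": "analytics"},
    "turns": ["t1", "t2", "t3", "t4", "t5"],
}
compacted = compact(session)
```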
The tracer captures every LLM call, tool invocation, and warehouse metric locally. No external services. No data leaving your machine. Run /trace for an interactive viewer.
Why independent?
Snowflake shipped Cortex Code. Databricks launched Genie Code. Both recognized that general-purpose agents don't work for data. But both shipped solutions that serve their own ecosystems.
Cortex Code won't help you migrate off Snowflake and onto BigQuery. Genie Code is not designed to optimize a Redshift query. Your Airflow DAGs don't run inside Databricks. Your warehouses span multiple providers. Your governance crosses every platform boundary. Your AI agents should too.
Altimate Code connects to Snowflake, BigQuery, Databricks, PostgreSQL, Redshift, DuckDB, MySQL, SQL Server, Oracle, and SQLite. It runs with Anthropic, OpenAI, Google, AWS Bedrock, Azure, Ollama, and OpenRouter. With local models and the local-only tracer, it runs fully air-gapped so no data leaves your machine.
Platform agents will always have telemetry we can't access. We'll always have independence they can't offer. Neither Snowflake nor Databricks will build first-class support for the other's warehouse. Neither will tell your agent a query is unnecessary when their revenue depends on you running it.
Your harness should be yours.
Try it
npm install -g @altimateai/altimate-code
altimate /discover
Two commands. It auto-detects your warehouse, indexes your schema, and you're building.
Open source on GitHub. Docs at altimate-code.sh. Ten data stores. Any LLM. No platform tax.
What we're building next, and where you can help
The compiled engines ship today. Here's where we're headed and where we need the community:
Pipeline monitoring for Airflow and Dagster. We want this to be proactive, not just interactive.
Blast radius analysis. Before the agent acts, show what breaks downstream.
Decision memory: extract why things were built from your Git history so agents stop undoing decisions they don't know were made.
Agent data sandboxes: changes prove themselves before touching production. And more.
Some of these we'll build. Some of them you'll build first. The roadmap, the benchmarks, and every known gap are in the repo. PRs welcome. Issues welcome. Forks welcome.
We'd rather build this in the open than wait for it to be built inside a walled garden. If you're a data engineer who stitches tools together for a living, come break it.
Want to learn more or get involved?
Three options:
Check out the project on GitHub and star the repo to support the project.
Join us on Slack: we are launching a new channel for our agentic data engineering efforts. Join here.
Sign up for next week's live overview and AMA on YouTube.

