

Legacy Data Migration: A CTO's Playbook
TL;DR
Most legacy data migrations slip by quarters, not weeks. Here's what actually causes the slip, and what to do instead.
- The cause is rarely the technology. It's ownership gaps, undocumented business logic, sunk-cost paralysis, and cutover risk.
- Use a four-phase playbook: Discovery, Decouple, Rebuild, Cutover. The decisions that determine the outcome are made in month one, not month twelve.
- AI has changed the economics. SJP cut their ETL migration effort by roughly two-thirds using Maia.
- The Migration Agent is the capability inside Maia that handles legacy ETL migration. It removes the boring middle so engineers can focus on the decisions that actually matter.
Most legacy data migrations slip. Not by weeks, by quarters. Across customer engagements, we hear the same diagnosis: the cause is rarely the technology. It's the way the work gets framed, sequenced, and owned.
A familiar pattern. A team at a global manufacturer spends roughly $50,000 trying to ship one marketing pipeline. They build it themselves and it fails. They bring in consultants and it fails. They pay a specialist vendor and get partial data they can't use. A year passes. The campaigns keep spending. The visibility never arrives.
That's not a tooling problem. That's a playbook problem.
If you're a CTO scoping a migration this quarter, here's what the CTOs we work with wish they'd known at kickoff.
Why CTOs get stuck
Migrations stall in predictable places. The technology gets the blame, but in customer post-mortems we hear about the same handful of structural issues.
The first is ownership. Data engineering owns the pipelines. Platform owns the warehouse. Application owns the source systems. Migration cuts across all three, and nobody owns the whole arc. Work falls into the gaps, and progress depends on which director is loudest that quarter.
The second is undocumented business logic. The team that wrote the original pipelines retired, left, or got promoted out of the codebase years ago. What's left is a 15-year-old job nobody wants to touch because nobody can fully explain what it does. Or, more often, what the spreadsheets feeding it do. We'll come back to that.
The third is sunk-cost paralysis. Companies that have spent seven figures on a legacy platform find it psychologically difficult to retire it, even when keeping it costs more in engineering time than the migration would. For most businesses still running legacy ETL, this is the quiet reason their AI roadmap keeps slipping.
The fourth is cutover risk. If the migration breaks the business for a day, careers end. So teams default to long parallel runs, which extend timelines, which inflate costs, which make leadership impatient, which makes the team cut corners in the rebuild. The risk you tried to avoid arrives anyway.
Edmund Optics hit three of those four before they brought in Maia. The technology was the easy part.
The four phases of a defensible migration
We've seen migrations succeed in different shapes, but the ones that ship on time tend to follow the same four phases. Treat them as checkpoints, not waterfall stages. You can iterate inside each one. Don't skip ahead.
Phase 1: Discovery
Before anyone writes new code, map what you actually have. That means inventorying every pipeline, every dependency, and every downstream report. It also means mapping data lineage from source to consumer, including the spreadsheets that feed and consume your warehouse. Most CTOs underestimate this step by half.
The output of Discovery is a list with three columns: keep as-is, redesign, retire. If you skip the retire column you're going to migrate dead code into your new platform and discover it years later when nobody can explain why it's there.
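The three-column triage can start as something as simple as a scored inventory. A minimal sketch in Python, where the pipeline names, thresholds, and triage rules are illustrative assumptions, not from any customer engagement (the patch-count rule echoes the rebuild heuristic later in this piece):

```python
from dataclasses import dataclass
from enum import Enum

class Disposition(Enum):
    KEEP = "keep as-is"
    REDESIGN = "redesign"
    RETIRE = "retire"

@dataclass
class Pipeline:
    name: str
    last_run_days_ago: int      # from scheduler logs
    downstream_consumers: int   # reports, dashboards, feeds
    patch_count: int            # times the logic has been modified

def triage(p: Pipeline) -> Disposition:
    # No consumers or no recent runs: dead code, retire it.
    if p.downstream_consumers == 0 or p.last_run_days_ago > 180:
        return Disposition.RETIRE
    # Heavily patched logic carries hidden complexity: redesign.
    if p.patch_count > 3:
        return Disposition.REDESIGN
    return Disposition.KEEP

inventory = [
    Pipeline("daily_sales_load", 1, 12, 1),
    Pipeline("legacy_fx_rates", 400, 0, 7),
    Pipeline("campaign_attribution", 2, 3, 5),
]
for p in inventory:
    print(p.name, "->", triage(p).value)
```

The point is not the specific rules. It's that Discovery ends with a machine-readable list you can argue about, not a shared feeling.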
Phase 2: Decouple
Every legacy pipeline has implicit dependencies. The job runs at 2 a.m. because something else finishes at 1:50. The transformation assumes column ordering. The downstream dashboard breaks if a header changes.
Decouple before you rebuild. Pull dependencies into explicit contracts so the new pipeline can run independently of the legacy one. If you skip this, the parallel-run phase becomes a nightmare. You can't tell whether failures are migration bugs or pre-existing bugs that were always there.
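One way to turn an implicit dependency into an explicit contract is a lightweight schema check the new pipeline runs on every input. A sketch, assuming a column-name and type check is the contract you need; the schema itself is invented for illustration:

```python
# A minimal data contract: the new pipeline validates its input by
# column name and type instead of assuming column ordering.
CONTRACT = {
    "order_id": int,
    "order_date": str,
    "amount_gbp": float,
}

def validate(rows: list[dict]) -> None:
    for i, row in enumerate(rows):
        missing = CONTRACT.keys() - row.keys()
        if missing:
            raise ValueError(f"row {i}: missing columns {sorted(missing)}")
        for col, typ in CONTRACT.items():
            if not isinstance(row[col], typ):
                raise TypeError(f"row {i}: {col} should be {typ.__name__}")

# Passes: every column present with the expected type; order is irrelevant.
validate([{"amount_gbp": 19.99, "order_id": 1001, "order_date": "2025-01-31"}])

# Fails loudly instead of silently breaking a downstream dashboard.
try:
    validate([{"order_id": 1002, "order_date": "2025-01-31"}])
except ValueError as e:
    print(e)
```

Once the contract is explicit, a parallel-run failure tells you something specific: either the new pipeline broke the contract, or the legacy one never honoured it.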
Phase 3: Rebuild
This is where most CTOs put their attention, and it's the least interesting phase. The technology choices are largely commoditized. What matters here is throughput and quality control. How many pipelines per week can your team actually convert, and how confident are you in the test coverage?
Hold yourself to a measurable rebuild rate. If you're doing one pipeline a week and you have 200 to do, you have a four-year project, not a migration.
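The back-of-envelope math is worth writing down explicitly. A sketch using the article's own example figures (one pipeline a week, 200 pipelines); the ten-per-week comparison is an illustrative assumption:

```python
def project_weeks(pipeline_count: int, pipelines_per_week: float) -> float:
    """Naive duration estimate from backlog size and sustained rebuild rate."""
    return pipeline_count / pipelines_per_week

# 200 pipelines at one per week: roughly a four-year project.
print(project_weeks(200, 1) / 52)      # ~3.8 years

# The same backlog at ten per week is roughly a five-month migration.
print(project_weeks(200, 10) / 4.33)   # ~4.6 months
```

The only lever that changes the answer by an order of magnitude is the rebuild rate, which is exactly where AI-assisted conversion applies.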
Phase 4: Cutover
Cutover is where governance meets operations. You need clear go/no-go criteria, a rollback plan that's been tested (not just documented), and a parallel-run period long enough to catch month-end edge cases but not so long that it doubles your costs.
The teams that get cutover right treat it like a release, not a rebrand. Small batches, reversible, and boring on purpose.
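Go/no-go criteria are easier to enforce when the parallel-run comparison is automated rather than eyeballed. A minimal sketch of the idea: compare row counts and an order-independent content checksum between the legacy and new outputs for each run. The threshold and data shapes are illustrative assumptions:

```python
import hashlib

def checksum(rows: list[tuple]) -> str:
    # Order-independent content hash, so sort differences don't fail the run.
    digest = hashlib.sha256()
    for row in sorted(map(repr, rows)):
        digest.update(row.encode())
    return digest.hexdigest()

def go_no_go(legacy: list[tuple], new: list[tuple],
             max_row_drift: float = 0.0) -> bool:
    drift = abs(len(legacy) - len(new)) / max(len(legacy), 1)
    if drift > max_row_drift:
        return False                       # row counts diverged
    return checksum(legacy) == checksum(new)

legacy_day = [(1, "GBP", 100.0), (2, "EUR", 55.5)]
new_day = [(2, "EUR", 55.5), (1, "GBP", 100.0)]  # same rows, different order
print(go_no_go(legacy_day, new_day))             # True: safe to proceed
```

A check like this is what lets you keep the parallel-run window short: you're not waiting for confidence to accumulate, you're measuring it.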
The decisions that determine the outcome
Most migration outcomes are decided in the first month, not the last. Five decisions in particular do most of the work.
Lift-and-shift versus rebuild. Lift-and-shift is faster and lets you decommission the old platform sooner. The price is that you carry forward every bad decision baked into 15 years of legacy code. Rebuild costs more upfront but compounds in your favor. The rule of thumb: lift-and-shift for pipelines with stable, well-understood logic; rebuild for anything that's been patched more than three times.
How long to run in parallel. Long enough to cover one full close cycle, then stop. Anything beyond that doubles your operational load without proportional risk reduction.
What to redesign versus port like-for-like. Anywhere business logic lives in a place it shouldn't, redesign. Spreadsheet calculations feeding the warehouse. Stored procedures with hardcoded constants. Reports that recompute the same metric three different ways. Migration is your one chance to fix these without political cost.
How to treat undocumented logic. This is the one that bites everyone. Nature's Touch, a global frozen-food supplier, ran a 72-page Excel model in production for years. Inside it was a pounds-to-kilograms conversion error that quietly overstated inventory by $500,000 a year. Their ERP and MRP systems both processed the bad data without flagging it, because neither could audit the spreadsheet logic feeding them. When the team brought Maia in to reconstruct and validate the model's logic, they found the variance. The reconciliation that used to take hours of manual analysis now takes 10 minutes. The lesson: undocumented logic isn't a footnote in your migration. It's the migration.
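The general defence against this class of error is a reconciliation check that recomputes a figure from first principles and flags the variance. A sketch, illustrative only and not a reconstruction of the actual spreadsheet; the tolerance and figures are invented:

```python
LBS_PER_KG = 2.20462

def reconcile_inventory(reported_kg: float, source_lbs: float,
                        tolerance: float = 0.01) -> float:
    """Compare reported kilograms against kilograms recomputed from the
    source weight in pounds; raise if relative variance exceeds tolerance."""
    expected_kg = source_lbs / LBS_PER_KG
    variance = abs(reported_kg - expected_kg) / expected_kg
    if variance > tolerance:
        raise ValueError(
            f"inventory variance {variance:.1%}: reported {reported_kg:.0f} kg, "
            f"expected {expected_kg:.0f} kg from {source_lbs:.0f} lbs")
    return variance

# A spreadsheet that multiplies instead of divides overstates weight ~4.9x.
bad_kg = 10_000 * LBS_PER_KG          # wrong direction of conversion
try:
    reconcile_inventory(bad_kg, source_lbs=10_000)
except ValueError as e:
    print(e)
```

The check is trivial once the logic is explicit. The expensive part, and the part that was blocking the team, was reconstructing the logic in the first place.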
What to retire. Be aggressive. Every pipeline you carry forward is a pipeline you'll maintain for another decade. If a CTO can't articulate why a job exists in 30 seconds, kill it. The real cost of legacy ETL migration isn't the licence fee. It's the carrying cost of complexity you should've cut.
For organizations consolidating multiple ETL tools at once, the retire decision is also a platform consolidation decision, and the two should be made together.
How AI is changing the economics of legacy ETL migration
Two years ago, the boring middle of any migration (pipeline conversion, schema mapping, lineage analysis) was where months disappeared. That's the part that's changing fastest. Maia is the reason. At Balfour Beatty, pipeline analysis that used to take a week now takes six minutes. A 1,300-pipeline PowerCenter migration originally scoped as a multi-year programme is now on a six-month delivery window. Same engineers. Same business logic. Different economics.
Maia is the AI Data Automation platform. It's how teams now do the repetitive translation work that used to consume engineering capacity for quarters at a time, while keeping a human in the loop on every design decision that actually matters.
St. James's Place, one of the UK's leading wealth management businesses, tested Maia on two critical challenges, including ETL migrations that were bottlenecking their platform consolidation. Maia cut their ETL migration effort by roughly two-thirds, turning days of pipeline work into hours. Kelly Maggs, SJP's Divisional Director for Data Architecture, Platform and Engineering, put it plainly: "The big productivity numbers you hear about AI can actually be real."
The quote matters more than the number. Most CTOs have spent the last 18 months separating real AI productivity gains from vendor noise. S&P Global Market Intelligence reports that 42% of enterprises abandoned most AI initiatives in 2025, up from 17% the year before. Against that backdrop, SJP's result in a heavily governed financial services environment is one of the cleanest data points we've seen.
Inside Maia, the Migration Agent is the specific capability that handles legacy ETL migration. It takes the repetitive translation work off your engineers' plate so they can focus on the decisions in the previous section. Across early customers, the Migration Agent has unlocked more than $18 million in previously frozen data value. The migrations that used to be six-figure, 18-month consulting engagements are now finishing in days. Legacy ETL is the hidden constraint on AI execution for most enterprises, and that constraint is starting to give.
If your migration plan still assumes pipeline conversion is the long pole, your plan is out of date.
What to do Monday morning
If you're scoping a migration this quarter, here are three things to do this week.
First, map your dependencies before you scope your timeline. Most slipped migrations slip because the discovery work was too thin to surface real risk.
Second, draft your retire list. Not your migrate list. The shortest path to a faster migration is to migrate less.
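Both steps start from the same artifact: a dependency graph of pipelines and their consumers. A sketch of the idea, with pipeline names invented for illustration; real graphs come from scheduler metadata and lineage tools:

```python
# Downstream edges: pipeline -> the things that consume its output.
consumers = {
    "raw_orders_load": ["orders_mart"],
    "orders_mart": ["finance_dashboard", "weekly_report"],
    "legacy_fx_snapshot": [],        # nothing reads it
    "finance_dashboard": [],
    "weekly_report": [],
}

# Retire candidates: pipelines (not end reports) with no consumers.
reports = {"finance_dashboard", "weekly_report"}
retire = [p for p, deps in consumers.items()
          if not deps and p not in reports]
print(retire)  # ['legacy_fx_snapshot']
```

Even a crude graph like this surfaces the two things a timeline depends on: what actually has to move, and what was never worth moving.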
Third, find a credible reference point on what AI-assisted conversion is actually delivering for organizations like yours. Not a vendor pitch deck. A customer reference. Maia's customer base is full of them. The economics have changed. Stop blaming budget for your migration stall and start interrogating the assumptions in your plan.
The CTOs who get this right in the next 18 months will spend the following five years compounding the advantage. The ones who don't will still be migrating.
Frequently asked questions
What's the difference between lift-and-shift and modernization?
Lift-and-shift moves the existing logic to a new platform with minimal changes. Modernization redesigns the logic itself. Most successful migrations do both. Use lift-and-shift for stable, well-understood pipelines, and rebuild for anything that's been patched repeatedly or where business logic has ended up in the wrong place.
What does a defensible migration path look like?
The defensible path is four phases: Discovery (inventory and lineage), Decouple (make dependencies explicit), Rebuild (convert pipelines with quality controls), and Cutover (governed release). AI-assisted conversion compresses the Rebuild phase significantly, but it doesn't change the other three. Skipping Discovery is the most common reason migrations slip.
What are the biggest risks in a legacy data migration?
The technical risks (data loss, broken downstream reports, cutover failures) are well understood. The bigger risks are organizational. Undocumented business logic is the one that catches most teams out. So is ownership ambiguity between data, platform, and application teams. And so is sunk-cost attachment to the legacy platform, which delays the call to retire it.
How long does a legacy ETL migration take?
There's no single answer, and any vendor that gives you one is selling, not advising. The duration is driven by three variables: how many pipelines you have, how complex the business logic inside them is, and how fast your team can convert and validate them. The third variable is where AI-assisted conversion is making the biggest difference. SJP, one of Maia's financial services customers, saw their ETL migration effort cut by roughly two-thirds, turning days of pipeline work into hours. Sophos, also migrating off Informatica PowerCenter, saw a 98% productivity lift on documentation and testing, with five days of work compressed into 30 minutes. Run the math against your own pipeline count and rebuild rate to estimate your timeline.
What is legacy data migration?
Legacy data migration is the process of moving data, pipelines, and business logic off an older platform (often an on-premise ETL tool, mainframe, or legacy data warehouse) onto a modern cloud data platform. It usually involves both technical translation of pipelines and a redesign of how data flows.
Enjoy the freedom to do more with Maia on your side.
