How a Duplicate Migration Nearly Broke Our Deployment
I was staring at a failed deployment at 11 PM, wondering how a simple migration could take down the entire pipeline.
You know that moment when you realize someone created the same table twice, and now every subsequent migration is blocked because PostgreSQL won’t let you create something that already exists? This was mine.
In this issue, I’ll take you through how we debugged a deployment-blocking migration error and implemented a safer approach to large-scale data backfilling. No theory dumps – just real code, real problems, and real solutions.
What you’ll learn:
How to identify and resolve duplicate migration conflicts
Why chunked data processing prevents production disasters
When to use background jobs for large dataset operations
How to build idempotent rake tasks that are safe to retry
The Problem
We had a classic case of migration chaos during a critical deployment.
Here’s what we were dealing with:
# Migration file: 20250423222157_add_report_subscribers_table.rb
class AddReportSubscribersTable < ActiveRecord::Migration[7.0]
  def change
    create_table :report_subscribers, id: :uuid do |t|
      t.references :report, null: false, foreign_key: true, type: :uuid
      t.string :subscriber_type, null: false
      t.string :subscriber_id, null: false
      t.datetime :deleted_at
      t.timestamps
    end

    add_index :report_subscribers, [:report_id, :subscriber_type]
  end
end
Warning Signs:
Deployment failing with PG::DuplicateTable error
Migration attempting to create existing table structure
All subsequent migrations blocked by the failed migration
Production deployment pipeline completely stuck
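If you are not sure which migration is the culprit, the quickest way to see where the pipeline is stuck (not shown in our logs above, but standard Rails tooling) is the status task:
# Lists every migration as "up" or "down"; the blocked one shows as "down"
rake db:migrate:status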
The Journey: From Problem to Solution
Step 1: Emergency Migration Cleanup
The Issue: The report_subscribers table already existed with the correct schema, but our migration was trying to create it again.
Before:
# Failed migration blocking deployment
rake db:migrate
# => PG::DuplicateTable: ERROR: relation "report_subscribers" already exists
After:
# Removed the duplicate migration file entirely
# db/migrate/20250423222157_add_report_subscribers_table.rb -> deleted
# Verified existing table structure
rails console
ActiveRecord::Base.connection.table_exists?('report_subscribers')
# => true
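One extra sanity check worth running here (our own suggestion, not part of the original cleanup): confirm the deleted version was never recorded in schema_migrations, otherwise db:migrate:status will keep listing a version with no matching file.
# Hypothetical follow-up check - the deleted version should not be recorded
ActiveRecord::Base.connection.select_values("SELECT version FROM schema_migrations").include?('20250423222157')
# => false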
Impact:
Deployment pipeline immediately unblocked
All subsequent migrations could run successfully
Existing functionality preserved without data loss
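Deleting the file was the right call for us, but if you need a migration like this to keep working in fresh environments, a guarded variant is another option. Here is a sketch of that alternative (not what we shipped), using the if_not_exists option available since Rails 6:
# Alternative sketch: an idempotent version of the same migration
class AddReportSubscribersTable < ActiveRecord::Migration[7.0]
  def change
    create_table :report_subscribers, id: :uuid, if_not_exists: true do |t|
      t.references :report, null: false, foreign_key: true, type: :uuid
      t.string :subscriber_type, null: false
      t.string :subscriber_id, null: false
      t.datetime :deleted_at
      t.timestamps
    end

    add_index :report_subscribers, [:report_id, :subscriber_type], if_not_exists: true
  end
end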
Step 2: Implementing Safe Data Backfilling
While fixing the migration, we also needed to address a related challenge: backfilling missing session_group_id values across millions of MediaFile records.
The Challenge:
# Naive approach - would lock the database
MediaFile.where(fileable_type: 'UserSession', session_group_id: nil)
         .find_each do |file|
  # This could be millions of records!
  file.update!(session_group_id: file.fileable.session_group_id)
end
The Solution:
# Safe chunked processing via rake task
task :backfill_session_group_id_to_media_files => :environment do
  missing = MediaFile.where(
    fileable_type: 'UserSession',
    session_group_id: nil,
    deleted_at: nil
  )

  # Find date range for processing
  oldest = missing.minimum(:created_at)
  newest = missing.maximum(:created_at)

  # Nothing left to backfill - avoid calling beginning_of_month on nil
  next if oldest.nil?

  # Process by month to avoid memory issues
  current = oldest.beginning_of_month
  while current <= newest
    month_end = current.end_of_month

    # Queue background job for each month (ActiveJob, so perform_later)
    DataMigration::BackfillSessionGroupIdJob.perform_later(
      current.iso8601,
      month_end.iso8601
    )

    current = current.next_month
  end
end
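Assuming the task lives under lib/tasks, kicking it off is a one-liner. Because the scope only selects rows where session_group_id is still nil, re-running the task simply re-enqueues whatever is left, which is what makes it safe to retry:
# Run (or re-run) the backfill; already-populated rows are skipped automatically
bundle exec rake backfill_session_group_id_to_media_files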
The Aha Moment
The key realization was that both problems shared the same root cause: operations that don’t respect the scale and constraints of production systems.
Duplicate migrations happen because developers don’t check existing schema state. Large dataset operations fail because they don’t account for memory limits and database locks.
Real Numbers From This Experience
Before: 100% deployment failure rate due to migration conflict
After: 0 deployment failures from duplicate migrations
Backfill processing: 2.3 million records processed safely over 6 hours
Database impact: 0 production locks or performance degradation
The Final Result
# Clean migration state
class CheckMigrationStatus
  def self.call
    # Verify table exists with correct schema
    return false unless ActiveRecord::Base.connection.table_exists?('report_subscribers')

    # Confirm model associations work
    ReportDocument.new.respond_to?(:report_subscribers) &&
      ReportSubscriber.column_names.include?('session_group_id')
  end
end
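A quick way to exercise this from the console once both fixes are deployed (assuming the models above exist in your app):
# Should return true once the table, association, and backfilled column are all in place
CheckMigrationStatus.call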
# Safe background job for chunked processing
class DataMigration::BackfillSessionGroupIdJob < ApplicationJob
  def perform(start_date, end_date)
    MediaFile.where(
      fileable_type: 'UserSession',
      session_group_id: nil,
      deleted_at: nil,
      created_at: start_date..end_date
    ).includes(:fileable).find_in_batches(batch_size: 1000) do |batch|
      batch.each do |file|
        next unless file.fileable&.session_group_id

        file.update_column(:session_group_id, file.fileable.session_group_id)
      end
    end
  end
end
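One design choice worth calling out: update_column writes straight to the database and skips validations, callbacks, and the updated_at touch, which keeps the backfill fast and free of side effects. If your MediaFile callbacks matter (cache invalidation, search indexing, and so on), swap in update! and accept a slower run.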
Key Improvements:
Migration conflicts eliminated through proper schema verification
Large datasets processed safely without database locks
Background job architecture allows monitoring and retry capabilities
Idempotent operations that are safe to run multiple times
Monday Morning Action Items
Quick Wins (5-Minute Changes)
Check for existing tables before creating new migrations
Add schema verification to your migration checklist
Review any pending large dataset operations
Next Steps
Implement chunked processing for any large dataset operations
Set up background job monitoring for data migrations
Create migration rollback procedures for emergency situations
Your Turn!
The Safe Migration Challenge
How would you safely backfill this relationship without locking the database?
# You need to populate the missing user_type field
# for 5 million existing User records based on their email domain
class User < ApplicationRecord
  # user_type was just added and needs backfilling
  # corporate emails should be 'enterprise'
  # gmail/yahoo should be 'consumer'
  # everything else should be 'unknown'
end
# Your challenge: Write a safe backfill strategy
💬 Discussion Prompts:
How would you chunk this operation to avoid memory issues?
What batch size would you choose and why?
How would you make this operation resumable if it fails partway through?
Found this useful? Share it with a fellow developer who’s dealt with migration headaches! And don’t forget to subscribe for more practical database insights.
Happy coding! 🚀
💡 Tips and Notes
Pro Tip: Always verify table existence before creating migrations, especially in team environments
Remember: When processing millions of records, think in terms of chunks and background jobs, not single transactions