How a Duplicate Migration Nearly Broke Our Deployment
I was staring at a failed deployment at 11 PM, wondering how a simple migration could take down the entire pipeline.
You know that moment when you realize someone created the same table twice, and now every subsequent migration is blocked because PostgreSQL won’t let you create something that already exists? This was mine.
In this issue, I’ll take you through how we debugged a deployment-blocking migration error and implemented a safer approach to large-scale data backfilling. No theory dumps – just real code, real problems, and real solutions.
What you’ll learn:
How to identify and resolve duplicate migration conflicts
Why chunked data processing prevents production disasters
When to use background jobs for large dataset operations
How to build idempotent rake tasks that are safe to retry
The Problem
We had a classic case of migration chaos during a critical deployment.
Here’s what we were dealing with:
# Migration file: 20250423222157_add_report_subscribers_table.rb
class AddReportSubscribersTable < ActiveRecord::Migration[7.0]
  def change
    create_table :report_subscribers, id: :uuid do |t|
      t.references :report, null: false, foreign_key: true, type: :uuid
      t.string :subscriber_type, null: false
      t.string :subscriber_id, null: false
      t.datetime :deleted_at
      t.timestamps
    end

    add_index :report_subscribers, [:report_id, :subscriber_type]
  end
end
Warning Signs:
Deployment failing with PG::DuplicateTable error
Migration attempting to create existing table structure
All subsequent migrations blocked by the failed migration
Production deployment pipeline completely stuck
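If you are not sure which migration is the culprit, the quickest way to see where the pipeline is stuck (not shown in our logs above, but standard Rails tooling) is the status task:
# Lists every migration as "up" or "down"; the blocked one shows as "down"
rake db:migrate:status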
The Journey: From Problem to Solution
Step 1: Emergency Migration Cleanup
The Issue: The report_subscribers table already existed with the correct schema, but our migration was trying to create it again.
Before:
# Failed migration blocking deployment
rake db:migrate
# => PG::DuplicateTable: ERROR: relation "report_subscribers" already exists
After:
# Removed the duplicate migration file entirely
# db/migrate/20250423222157_add_report_subscribers_table.rb -> deleted
# Verified existing table structure
rails console
ActiveRecord::Base.connection.table_exists?('report_subscribers')
# => true
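One extra sanity check worth running here (our own suggestion, not part of the original cleanup): confirm the deleted version was never recorded in schema_migrations, otherwise db:migrate:status will keep listing a version with no matching file.
# Hypothetical follow-up check - the deleted version should not be recorded
ActiveRecord::Base.connection.select_values("SELECT version FROM schema_migrations").include?('20250423222157')
# => false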
Impact:
Deployment pipeline immediately unblocked
All subsequent migrations could run successfully
Existing functionality preserved without data loss
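Deleting the file was the right call for us, but if you need a migration like this to keep working in fresh environments, a guarded variant is another option. Here is a sketch of that alternative (not what we shipped), using the if_not_exists option available since Rails 6:
# Alternative sketch: an idempotent version of the same migration
class AddReportSubscribersTable < ActiveRecord::Migration[7.0]
  def change
    create_table :report_subscribers, id: :uuid, if_not_exists: true do |t|
      t.references :report, null: false, foreign_key: true, type: :uuid
      t.string :subscriber_type, null: false
      t.string :subscriber_id, null: false
      t.datetime :deleted_at
      t.timestamps
    end

    add_index :report_subscribers, [:report_id, :subscriber_type], if_not_exists: true
  end
end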
Step 2: Implementing Safe Data Backfilling
While fixing the migration, we also needed to address a related challenge: backfilling missing session_group_id values across millions of MediaFile records.
The Challenge:
# Naive approach - would lock the database
MediaFile.where(fileable_type: 'UserSession', session_group_id: nil)
         .find_each do |file|
  # This could be millions of records!
  file.update!(session_group_id: file.fileable.session_group_id)
end
The Solution:
# Safe chunked processing via rake task
task :backfill_session_group_id_to_media_files => :environment do
  missing = MediaFile.where(
    fileable_type: 'UserSession',
    session_group_id: nil,
    deleted_at: nil
  )

  # Find date range for processing
  oldest = missing.minimum(:created_at)
  newest = missing.maximum(:created_at)

  # Nothing left to backfill - avoid calling beginning_of_month on nil
  next if oldest.nil?

  # Process by month to avoid memory issues
  current = oldest.beginning_of_month
  while current <= newest
    month_end = current.end_of_month

    # Queue background job for each month (ActiveJob, so perform_later)
    DataMigration::BackfillSessionGroupIdJob.perform_later(
      current.iso8601,
      month_end.iso8601
    )

    current = current.next_month
  end
end
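Assuming the task lives under lib/tasks, kicking it off is a one-liner. Because the scope only selects rows where session_group_id is still nil, re-running the task simply re-enqueues whatever is left, which is what makes it safe to retry:
# Run (or re-run) the backfill; already-populated rows are skipped automatically
bundle exec rake backfill_session_group_id_to_media_files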
The Aha Moment
The key realization was that both problems shared the same root cause: operations that don’t respect the scale and constraints of production systems.
Duplicate migrations happen because developers don’t check existing schema state. Large dataset operations fail because they don’t account for memory limits and database locks.
Real Numbers From This Experience
Before: 100% deployment failure rate due to migration conflict
After: 0 deployment failures from duplicate migrations
Backfill processing: 2.3 million records processed safely over 6 hours
Database impact: 0 production locks or performance degradation
The Final Result
# Clean migration state
class CheckMigrationStatus
  def self.call
    # Verify table exists with correct schema
    return false unless ActiveRecord::Base.connection.table_exists?('report_subscribers')

    # Confirm model associations work
    ReportDocument.new.respond_to?(:report_subscribers) &&
      ReportSubscriber.column_names.include?('session_group_id')
  end
end
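A quick way to exercise this from the console once both fixes are deployed (assuming the models above exist in your app):
# Should return true once the table, association, and backfilled column are all in place
CheckMigrationStatus.call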
# Safe background job for chunked processing
class DataMigration::BackfillSessionGroupIdJob < ApplicationJob
  def perform(start_date, end_date)
    MediaFile.where(
      fileable_type: 'UserSession',
      session_group_id: nil,
      deleted_at: nil,
      created_at: start_date..end_date
    ).includes(:fileable).find_in_batches(batch_size: 1000) do |batch|
      batch.each do |file|
        next unless file.fileable&.session_group_id

        file.update_column(:session_group_id, file.fileable.session_group_id)
      end
    end
  end
end
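One design choice worth calling out: update_column writes straight to the database and skips validations, callbacks, and the updated_at touch, which keeps the backfill fast and free of side effects. If your MediaFile callbacks matter (cache invalidation, search indexing, and so on), swap in update! and accept a slower run.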
Key Improvements:
Migration conflicts eliminated through proper schema verification
Large datasets processed safely without database locks
Background job architecture allows monitoring and retry capabilities
Idempotent operations that are safe to run multiple times
Monday Morning Action Items
Quick Wins (5-Minute Changes)
Check for existing tables before creating new migrations
Add schema verification to your migration checklist
Review any pending large dataset operations
Next Steps
Implement chunked processing for any large dataset operations
Set up background job monitoring for data migrations
Create migration rollback procedures for emergency situations
Your Turn!
The Safe Migration Challenge
How would you safely backfill this relationship without locking the database?
# You need to populate the missing user_type field
# for 5 million existing User records based on their email domain
class User < ApplicationRecord
  # user_type was just added and needs backfilling
  # corporate emails should be 'enterprise'
  # gmail/yahoo should be 'consumer'
  # everything else should be 'unknown'
end
# Your challenge: Write a safe backfill strategy
💬 Discussion Prompts:
How would you chunk this operation to avoid memory issues?
What batch size would you choose and why?
How would you make this operation resumable if it fails partway through?
Found this useful? Share it with a fellow developer who’s dealt with migration headaches! And don’t forget to subscribe for more practical database insights.
Happy coding! 🚀
💡 Tips and Notes
Pro Tip: Always verify table existence before creating migrations, especially in team environments
Remember: When processing millions of records, think in terms of chunks and background jobs, not single transactions