Two-way database syncing in Rails 6.1
Example concern for syncing one DB to another with production traffic.
Learned this snippet from @kwuchu
class PlanetScaleSyncable extend ActiveSupport::Concern included do after_commit :commit_to_planetscale end # Realistically, you would call commit_to_planetscale in a background job to do it asynchronously. # This way, you get two benefits: # 1) Errors double writing don't mess up the same code path that this thing is on. # 2) There's less performance loss by double writing in the hot-path. # 3) You can use Sidekiq Ent. to ensure that in the case of heavy writes the jobs are unique until they begin being processed. # 4) If you make a mistake, you can blow away the queue easily. def commit_to_planetscale record = self # You will have to have a separate configuration for `planetscale_primary` in your `config/database.yml` file. This can technically be anything that has write permission. ActiveRecord::Base.connected_to(role: :planetscale_primary) do planetscale_record = self.class.find_by(id: record.id) || self.class.new(record.attributes) # Destroy the PlanetScale record if it exists and return early. if record.destroyed? && planetscale_record&.persisted? planetscale_record.destroy! return end # Otherwise, the record exists, let's update it's attributes with the latest attributes from the primary and save it. planetscale_record.attributes = record.attributes planetscale_record.save!(validate: false) # Note that we're skipping validations here, since it assumes the record is OK. end end end class ExampleModel < ApplicationRecord # All commits made to `ExampleModel` will be synced to the PlanetScale database. include PlanetScaleSyncable end
In Bento, we ran into an issue with syncing two databases up.
We didn't need an entire table to table sync, just the active rows moved over. The above snippet is how we pulled it off.
I owe a lot to Iheanyi Ekechukwu for coming up with the above concern — I was stuck thinking about it until he sent the entire thing as a gist (during that time we were investigating Planetscale — a MySQL based DB built on Vitess — as an analytics store for Bento). The above uses PlanetScale, the company he works at, as the database to sync to but could be switched out to anything you want (Citus, SingleStore, AWS RDS, etc). PlanetScale are doing cool things, you should check them out.
PlanetScale to your DB of choice.
For a quick run down on how to implement this:
- Use the concern above to get real-time syncing going with your new DB.
- Spin up a read-only replica on the primary.
- Find all the records on the replica and sync them to the new DB. Background jobs feel nice here.
- Confirm all the data is there.
- Switch reads on the table(s) you're moving to the new DB.
- Switch ALL writes on the table(s) you're moving to the new DB.
- Confirm all the data there.
- Dump the data in the primary as it's no longer needed.
- Run VACUUM FULL; (Postgres) or similar on the primary to regain disk space. This'll lock everything and could lead to downtime. Double-check your databases docs before running.
There's a missing piece of this puzzle: intelligent queuing and throttling of the sync. I'll cover that in another snippet when I've implemented it.
Upgrade to Rails 6.1
You're going to want to upgrade your Rails app to Rails 6.1 (or above first). This'll allow you to add multiple databases with the same or different schemas.
Add your new database
Add your new database to your configuration file. Read more about that here.
Queue up records to sync
In a future snippet, I'll show you how we do that. For now, write your own migration and touch all records you want to sync over to the new DB.
Subscribe to my personal updates
Get emails from me about building software, marketing, and things I've learned building products on the web. Occasionally, a quiet announcement or two.