Upgrading PostgreSQL
Using Google's Database Migration Service it is possible to set up streaming replication to an instance running a newer PostgreSQL version. All the required steps are documented in Google's article Upgrade Postgres with pglogical and Database Migration Service.
Using the migration service we create a new instance that continuously replicates the original DB; once we are done testing, we can simply promote the new instance and drop the old one.
Steps
- Create a new migration process.
- Ask the tech lead to create the destination instance with the same configuration as the current PG instance.
- The database should be named in this format: me-inapp-<staging|production>-pg<version>
Prepare Source DB
- Activate the required database flags:
  - cloudsql.logical_decoding
  - cloudsql.enable_pglogical
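To confirm the flags took effect after the instance restarts, you can run a quick spot check from a psql session (an optional check, not part of the original runbook; it assumes the logical-decoding flag maps to wal_level = logical, as documented for Cloud SQL):

```sql
-- wal_level must be 'logical' for pglogical replication to work
SHOW wal_level;

-- pglogical should now be available for CREATE EXTENSION
SELECT name, default_version
FROM pg_available_extensions
WHERE name = 'pglogical';
```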
- Create a new database user with the username migration; generate and store the password.
- Log in to the source database using the postgres user.
- Run the following code on the postgres database:
CREATE EXTENSION IF NOT EXISTS pglogical;
ALTER USER migration WITH REPLICATION;
GRANT USAGE ON SCHEMA pglogical TO migration;
GRANT USAGE ON SCHEMA public TO migration;
GRANT SELECT ON ALL TABLES IN SCHEMA pglogical TO migration;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO migration;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON SEQUENCES TO migration;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO migration;
GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO migration;
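As a sanity check (optional, not in the original article), you can verify that the migration user actually received the REPLICATION attribute:

```sql
-- rolreplication should be true for the migration user
SELECT rolname, rolreplication
FROM pg_roles
WHERE rolname = 'migration';
```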
Run the following code on the inapp database:
CREATE EXTENSION IF NOT EXISTS pglogical;
ALTER USER migration WITH REPLICATION;
GRANT USAGE ON SCHEMA pglogical TO migration;
GRANT SELECT ON ALL TABLES IN SCHEMA pglogical TO migration;
GRANT USAGE ON SCHEMA pglogical TO "me-inapp@ems-mobile-engage.iam";
GRANT USAGE ON SCHEMA public TO migration;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO migration;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON SEQUENCES TO migration;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO migration;
GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO migration;
GRANT USAGE ON SCHEMA audience_data TO migration;
GRANT SELECT ON ALL TABLES IN SCHEMA audience_data TO migration;
GRANT SELECT ON ALL SEQUENCES IN SCHEMA audience_data TO migration;
ALTER DEFAULT PRIVILEGES IN SCHEMA audience_data GRANT SELECT ON TABLES TO migration;
ALTER DEFAULT PRIVILEGES IN SCHEMA audience_data GRANT SELECT ON SEQUENCES TO migration;
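To spot-check that the grants landed in all three schemas, a query like the following can be run as the postgres user (an optional verification, not part of the original steps):

```sql
-- Count of tables readable by the migration user, per schema
SELECT table_schema, count(*) AS tables_readable
FROM information_schema.role_table_grants
WHERE grantee = 'migration'
  AND privilege_type = 'SELECT'
GROUP BY table_schema;
```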
Make schema compatible
Drop the campaign_priority constraint, since it uses a deferrable index, which is not supported:
ALTER TABLE campaign DROP CONSTRAINT campaign_priority;
The auto-created knex_migrations_lock table has no primary key. If the DB instance has not been migrated before, you need to create a PK to avoid warnings:
ALTER TABLE knex_migrations_lock ADD COLUMN dbid BIGSERIAL PRIMARY KEY;
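Before starting the job it can be worth scanning for any other objects that would trip up the migration. This helper query (an assumption-laden addition, not from the original doc) lists tables without a primary key and any deferrable constraints:

```sql
-- Tables in user schemas without a primary key (pglogical cannot replicate these)
SELECT n.nspname AS schema, c.relname AS table_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema', 'pglogical')
  AND NOT EXISTS (
    SELECT 1 FROM pg_constraint p
    WHERE p.conrelid = c.oid AND p.contype = 'p'
  );

-- Deferrable constraints (not supported during the migration)
SELECT conrelid::regclass AS table_name, conname
FROM pg_constraint
WHERE condeferrable;
```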
Start migration process
1. Create connection profile
In the Database Migration Service UI, create a new connection profile for the source database instance (Cloud SQL for PostgreSQL).
Use the migration user for the connection. The connection must be secured with the server-client encryption type. In the Cloud SQL UI for the source database instance, create a new client certificate, and download the certificate and private key. Also download the server certificate. Specify them in the connection profile settings.
2. Create migration job
In the Database Migration Service UI, create a new migration job. Use the previously created connection profile for the source database instance. For the target database instance, specify the target instance that you already created. For the connectivity method, use VPC peering. For the parallelism level, choose Maximum to speed up the migration.
Test the migration job to verify that all the prerequisites were met. Once the job is ready, create it, but do not start it yet!
Prepare destination DB
Staging
Add the new users in the Cloud Console.
Built-in
- bigquery - the password is stored in Secret Manager under the database-password-inapp-bigquery key. Either use that password or generate a new one and create a new version of the secret.
Grant permissions to the new users in the destination database using the postgres user.
Use the CLI tool from the me-validate-database-migration repository.
.env file
# Connection options for source and migrated databases (see README)
# ...
ROLES_MIGRATOR_USERS='[{"username": "me-inapp@ems-mobile-engage-staging.iam", "roleType": "write"}, {"username": "me-device-event@ems-mobile-engage-staging.iam", "roleType": "read"}, {"username": "bigquery", "roleType": "read"}]'
Production
Built-in
- bigquery - the password is stored in Secret Manager under the database-password-inapp-bigquery key. Either use that password or generate a new one and create a new version of the secret.
Grant permissions to the new users in the destination database using the postgres user.
Use the CLI tool from the me-validate-database-migration repository.
.env file
# Connection options for source and migrated databases (see README)
# ...
ROLES_MIGRATOR_USERS='[{"username": "me-inapp@ems-mobile-engage.iam", "roleType": "write"}, {"username": "me-device-event@ems-mobile-engage.iam", "roleType": "read"}, {"username": "bigquery", "roleType": "read"}]'
Make sure our custom pg-types are registered
Look up the OIDs used for our custom types using the following query
select typarray as oid, typname from pg_type where typname in ('test_contact', 'program_run');
Then update the config in the inapp-service to use the new OIDs:
- PG_PROGRAM_RUN_ARRAY_OIDS - a comma-separated list of OIDs; keep the old one in case you need to roll back.
- PG_TEST_CONTACT_ARRAY_OIDS - a comma-separated list of OIDs; keep the old one in case you need to roll back.
Wait for database to be in sync
When the migration job status in the Database Migration Service shows CDC, the initial transfer has finished and the job is continuously syncing data.
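While waiting, you can also gauge replication lag directly on the source by inspecting the replication slot that the migration created (an optional check, not in the original runbook; slot names vary per job):

```sql
-- Bytes of WAL the subscriber still has to consume; near 0 means in sync
SELECT slot_name, active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots;
```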
Switch to new instance
- Prepare the instance connection name for inapp-service as well as for device-event-service.
- Turn off the inapp workers to avoid DB writes.
- Turn off the inapp-web deployment.
- Validate the migration using the me-validate-database-migration tool:
  npm run start:validator
- Change PG_INSTANCE_CONNECTION_NAME in the inapp-service config.
- Switch the connection string in device-event-service.
- Promote the destination database to primary in the Database Migration Service.
Cleanup
- Remove the cloudsqlexternalsync user. Use the tool in me-validate-database-migration to transfer the ownership of database objects and remove the user:
  npm run start:ownership-migrator
- Add back the campaign_priority constraint:
  ALTER TABLE campaign ADD CONSTRAINT campaign_priority UNIQUE (customer_id, application_code, priority) DEFERRABLE INITIALLY DEFERRED;
- Configure backups for the new database.
- Enable HA by selecting "Multiple zones" under "Zonal availability".
- Import the new instance into me-infrastructure and remove the old one. See this pull request for an example of how to do this.
Make sure to turn off the ASYNC_AUDIENCE_CREATION_ENABLED flipper once the migration is done, and scale the audience-partition-creation worker back down to 0.
Limitations
pglogical does not support deferrable indexes
In in-app we use a single deferrable index, for the campaign_priority constraint. This constraint will be removed during the migration and added back once everything is done.
pglogical only migrates tables that have a primary key
Not a problem for inapp: all tables have primary keys except the knex_migrations_lock table, which is automatically created by knex.