Audience Data Partition Schema Changes

The audience data is partitioned by app and whenever we make schema changes we need to migrate existing tables to the new schema. This document describes the steps needed to migrate the audience data partition schema.

Migration files

Each migration should be created as a runnable script that can be executed manually on staging and production. These scripts are placed in the server/scripts/audience_data_table_migrations folder.

Script example

You can find examples of past migrations in the server/scripts/audience_data_table_migrations folder but a typical migration function for adding a column to all partitions would look like this. In this example, we are adding an updated_at column to all tables that is missing the column. The migration is done in a transaction to ensure that all tables are updated or none.

export async function addUpdatedAt(): Promise<void> {
  const columnName = 'updated_at'
  const tablesNames = await AudienceData.getPartitionsWithMissingColumn(columnName)
  logger.info({ count: tablesNames.length }, 'tables-to-update')
  await knex.transaction(async (tx: Knex.Transaction) => {
    for (const tableName of tablesNames) {
      logger.info({ tableName }, 'updating-table')
      await tx.schema.withSchema(audienceDataSchemaName).alterTable(tableName, (table) => {
        table.timestamp(columnName).notNullable().defaultTo(knex.fn.now())
      })
    }
  })
}

GAP configuration

To be able to run the migration scripts on staging and production we need to add the migration scripts to the GAP configuration. The script is added as a suspended cronjob in the GAP.yaml.

  audience-data-migration-add-updated-at:
    command: [dumb-init]
    args: [npm, run, start:audience-partitions-add-was-updated:prod]
    suspend: true
    schedule: "0 0 1 1 0"

Triggering the migration

Once deployment is done the migration script can be executed by triggering the cronjob manually in k9s.

k9s -c cronjob

Locate your cronjob and press t to trigger the cronjob.

Cleaning up

Once the migration is done on both staging and production the cronjob configuration in GAP.yaml can be removed. The migration script can be kept in the repository for future reference.