Query
SELECT
campaign_id,
rti_program_runs,
ac_program_runs
FROM
audience_data.campaign_contact_reference_master
WHERE
contact_reference = $1
AND campaign_id = ANY($2)
Partitions
Currently, ~6000 partitions partitioned by campaign_id
Would be ~400 partitions partitioned by application_code
Query Plan
Postgres has to do a lot of index scans which needs more IO.
Examples
Let’s compare a query with one campaign and a query with five campaigns to see how the number is index scans is growing:
One Campaign
inapp=> explain (analyze)
select campaign_id,
rti_program_runs,
ac_program_runs
from audience_data.campaign_contact_reference_master
where contact_reference = 'cr:c3bdcb49eb841c89:eu-central-1' and campaign_id = any (array [2480015]);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using campaign_contact_reference_2480015_contact_reference_idx on campaign_contact_reference_2480015 campaign_contact_reference_master (cost=0.42..8.44 rows=1 width=138) (actual time=0.039..0.041 rows=1 loops=1)
Index Cond: ((contact_reference)::text = 'cr:c3bdcb49eb841c89:eu-central-1'::text)
Filter: (campaign_id = ANY ('{2480015}'::integer[]))
Planning Time: 0.155 ms
Execution Time: 0.108 ms----
Five Campaigns
inapp=> explain (analyze)
select campaign_id,
rti_program_runs,
ac_program_runs
from audience_data.campaign_contact_reference_master
where contact_reference = 'cr:c3bdcb49eb841c89:eu-central-1' and campaign_id = any (array [2480015, 2480016, 2463849, 2463897, 2465950]);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.41..162.27 rows=5 width=113) (actual time=0.087..0.098 rows=1 loops=1)
-> Index Scan using campaign_contact_reference_2463849_contact_reference_idx on campaign_contact_reference_2463849 campaign_contact_reference_master_1 (cost=0.41..8.44 rows=1 width=125) (actual time=0.030..0.030 rows=0 loops=1)
Index Cond: ((contact_reference)::text = 'cr:c3bdcb49eb841c89:eu-central-1'::text)
Filter: (campaign_id = ANY ('{2480015,2480016,2463849,2463897,2465950}'::integer[]))
-> Index Scan using campaign_contact_reference_2463897_contact_reference_idx on campaign_contact_reference_2463897 campaign_contact_reference_master_2 (cost=0.41..8.43 rows=1 width=125) (actual time=0.009..0.009 rows=0 loops=1)
Index Cond: ((contact_reference)::text = 'cr:c3bdcb49eb841c89:eu-central-1'::text)
Filter: (campaign_id = ANY ('{2480015,2480016,2463849,2463897,2465950}'::integer[]))
-> Index Scan using campaign_contact_reference_2465950_contact_reference_idx on campaign_contact_reference_2465950 campaign_contact_reference_master_3 (cost=0.40..128.49 rows=1 width=34) (actual time=0.008..0.009 rows=0 loops=1)
Index Cond: ((contact_reference)::text = 'cr:c3bdcb49eb841c89:eu-central-1'::text)
Filter: (campaign_id = ANY ('{2480015,2480016,2463849,2463897,2465950}'::integer[]))
-> Index Scan using campaign_contact_reference_2480015_contact_reference_idx on campaign_contact_reference_2480015 campaign_contact_reference_master_4 (cost=0.42..8.44 rows=1 width=138) (actual time=0.039..0.039 rows=1 loops=1)
Index Cond: ((contact_reference)::text = 'cr:c3bdcb49eb841c89:eu-central-1'::text)
Filter: (campaign_id = ANY ('{2480015,2480016,2463849,2463897,2465950}'::integer[]))
-> Index Scan using campaign_contact_reference_2480016_contact_reference_idx on campaign_contact_reference_2480016 campaign_contact_reference_master_5 (cost=0.42..8.44 rows=1 width=140) (actual time=0.010..0.010 rows=0 loops=1)
Index Cond: ((contact_reference)::text = 'cr:c3bdcb49eb841c89:eu-central-1'::text)
Filter: (campaign_id = ANY ('{2480015,2480016,2463849,2463897,2465950}'::integer[]))
Planning Time: 0.627 ms
Execution Time: 0.176 ms
Partitioning by Application
Table
create table application_contact_reference_master (
application_id integer not null,
campaign_id integer not null,
contact_reference varchar(255) not null,
rti_program_runs program_run[],
ac_program_runs program_run[]
) partition by list (application_id);
create table alex_audience_data_investigation.application_contact_reference_398
partition of alex_audience_data_investigation.application_contact_reference_master for values in (398);
Index
create unique index on alex_audience_data_investigation.application_contact_reference_398 (contact_reference, campaign_id);
Copying Data
insert into alex_audience_data_investigation.application_contact_reference_398 (application_id, campaign_id, contact_reference, rti_program_runs, ac_program_runs)
select 398 as application_id, campaign_id, contact_reference, rti_program_runs, ac_program_runs
from audience_data.campaign_contact_reference_2480015;
Also, audience data for campaigns 2480016, 2463849, 2463897, 2465950 (the five campaigns from above).
Explain Analyze
inapp=> explain (analyze)
select campaign_id,
rti_program_runs,
ac_program_runs
from alex_audience_data_investigation.application_contact_reference_master
where application_id = 398 and contact_reference = 'cr:c3bdcb49eb841c89:eu-central-1' and campaign_id = any (array [2480015, 2480016, 2463849, 2463897, 2465950]);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using application_contact_reference_contact_reference_campaign_id_idx on application_contact_reference_398 application_contact_reference_master (cost=0.42..8.45 rows=1 width=130) (actual time=0.025..0.026 rows=1 loops=1)
Index Cond: ((contact_reference)::text = 'cr:c3bdcb49eb841c89:eu-central-1'::text)
Filter: ((application_id = 398) AND (campaign_id = ANY ('{2480015,2480016,2463849,2463897,2465950}'::integer[])))
Planning Time: 0.130 ms
Execution Time: 0.069 ms
The estimated total cost is significantly lower: 8 < 162 (query with five campaigns).
Queries per Second
130,252,959 queries in the last 7 days ⇒ 215 queries/second. Peak in the last days is ~350 queries/second.
Combined with the mean of campaigns in the audience membership query of 20 the database does 7k index scans/second.
When partitioning by application it would be just 215 index scans/second.