pg_hint_plan -- controls execution plan with hinting phrases in comment of special form.
PostgreSQL uses cost based optimizer, which utilizes data statistics, not static rules. The planner (optimizer) estimates costs of each possible execution plan for a SQL statement, and then the execution plan with the lowest cost is eventually executed. The planner does its best to select the best execution plan, but it is not perfect, since it doesn't know all properties of the data, for example, correlation between columns.
pg_hint_plan makes it possible to tweak execution plans using so-called "hints", which are simple descriptions in the SQL comment of special form.
pg_hint_plan reads hinting phrases in a comment of special form given with the target SQL statement. The special form is beginning by the character sequence "/*+" and ends with "*/". Hint phrases consist of hint name and following parameters enclosed by parentheses and delimited by spaces. Each hinting phrases can be delimited by new lines for readability.
In the example below , hash join is selected as the join method and scanning pgbench_accounts by sequential scan method.
postgres=# /*+ postgres*# HashJoin(a b) postgres*# SeqScan(a) postgres*# */ postgres-# EXPLAIN SELECT * postgres-# FROM pgbench_branches b postgres-# JOIN pgbench_accounts a ON b.bid = a.bid postgres-# ORDER BY a.aid; QUERY PLAN --------------------------------------------------------------------------------------- Sort (cost=31465.84..31715.84 rows=100000 width=197) Sort Key: a.aid -> Hash Join (cost=1.02..4016.02 rows=100000 width=197) Hash Cond: (a.bid = b.bid) -> Seq Scan on pgbench_accounts a (cost=0.00..2640.00 rows=100000 width=97) -> Hash (cost=1.01..1.01 rows=1 width=100) -> Seq Scan on pgbench_branches b (cost=0.00..1.01 rows=1 width=100) (7 rows) postgres=#
Hints are described in a comment in a special form in the above section. This is inconvenient in the case where queries cannot be edited. In the case hints can be placed in a special table named "hint_plan.hints". This feature requires compute_query_id to be "on" or "auto". The table consists of the following columns.
column | description |
---|---|
id | Unique number to identify a row for a hint. This column is filled automatically by sequence. |
norm_query_string | A pattern matches to the query to be hinted. Constants in the query have to be replace with '?' as in the following example. White space is significant in the pattern. |
application_name | The value of application_name of sessions to apply the hint. The hint in the example below applies to sessions connected from psql. An empty string means sessions of any application_name. |
hints | Hint phrase. This must be a series of hints excluding surrounding comment marks. |
The following example shows how to operate with the hint table.
postgres=# INSERT INTO hint_plan.hints(norm_query_string, application_name, hints) postgres-# VALUES ( postgres(# 'EXPLAIN (COSTS false) SELECT * FROM t1 WHERE t1.id = ?;', postgres(# '', postgres(# 'SeqScan(t1)' postgres(# ); INSERT 0 1 postgres=# UPDATE hint_plan.hints postgres-# SET hints = 'IndexScan(t1)' postgres-# WHERE id = 1; UPDATE 1 postgres=# DELETE FROM hint_plan.hints postgres-# WHERE id = 1; DELETE 1 postgres=#
The hint table is owned by the creator user and having the default privileges at the time of creation during CREATE EXTENSION. Table hints are prioritized then comment hits.
Hinting phrases are classified into six types based on what kind of object and how they can affect planning. Scanning methods, join methods, joining order, row number correction, parallel query and GUC setting. You will see the lists of hint phrases of each type in Hint list.
Scan method hints enforce specific scanning method on the target table. pg_hint_plan recognizes the target table by alias names if any. They are 'SeqScan' , 'IndexScan' and so on in this kind of hint.
Scan hints are effective on ordinary tables, inheritance tables, UNLOGGED tables, temporary tables and system catalogs. External(foreign) tables, table functions, VALUES clause, CTEs, views and subqueries are not affected.
postgres=# /*+ postgres*# SeqScan(t1) postgres*# IndexScan(t2 t2_pkey) postgres*# */ postgres-# SELECT * FROM table1 t1 JOIN table table2 t2 ON (t1.key = t2.key);
Join method hints enforce the join methods of the joins involving specified tables.
This can affect on joins only on ordinary tables, inheritance tables, UNLOGGED tables, temporary tables, external (foreign) tables, system catalogs, table functions, VALUES command results and CTEs are allowed to be in the parameter list. But joins on views and sub query are not affected.
This hint "Leading" enforces the order of join on two or more tables. There are two ways of enforcing. One is enforcing specific order of joining but not restricting direction at each join level. Another enforces join direction additionally. Details are seen in the hint list table.
postgres=# /*+ postgres*# NestLoop(t1 t2) postgres*# MergeJoin(t1 t2 t3) postgres*# Leading(t1 t2 t3) postgres*# */ postgres-# SELECT * FROM table1 t1 postgres-# JOIN table table2 t2 ON (t1.key = t2.key) postgres-# JOIN table table3 t3 ON (t2.key = t3.key);
This hint "Memoize" and "NoMemoize" allows and disallows a join to memoize the inner result. In the following example, The NoMemoize hint inhibits the topmost hash join from memoizing the result of table a.
postgres=# /*+ NoMemoize(a b) */ EXPLAIN SELECT * FROM a, b WHERE a.val = b.val; QUERY PLAN -------------------------------------------------------------------- Hash Join (cost=270.00..1412.50 rows=100000 width=16) Hash Cond: (b.val = a.val) -> Seq Scan on b (cost=0.00..15.00 rows=1000 width=8) -> Hash (cost=145.00..145.00 rows=10000 width=8) -> Seq Scan on a (cost=0.00..145.00 rows=10000 width=8)
This hint "Rows" corrects row number misestimation of joins that comes from restrictions of the planner.
postgres=# /*+ Rows(a b #10) */ SELECT... ; Sets rows of join result to 10 postgres=# /*+ Rows(a b +10) */ SELECT... ; Increments row number by 10 postgres=# /*+ Rows(a b -10) */ SELECT... ; Subtracts 10 from the row number. postgres=# /*+ Rows(a b *10) */ SELECT... ; Makes the number 10 times larger.
This hint "Parallel" enforces parallel execution configuration on scans. The third parameter specifies the strength of enforcement. "soft" means that pg_hint_plan only changes max_parallel_worker_per_gather and leave all others to planner. "hard" changes other planner parameters so as to forcibly apply the number. This can affect on ordinary tables, inheritance parents, unlogged tables and system catalogues. External tables, table functions, values clause, CTEs, views and subqueries are not affected. Internal tables of a view can be specified by its real name/alias as the target object. The following example shows that the query is enforced differently on each table.
postgres=# explain /*+ Parallel(c1 3 hard) Parallel(c2 5 hard) */ SELECT c2.a FROM c1 JOIN c2 ON (c1.a = c2.a); QUERY PLAN ------------------------------------------------------------------------------- Hash Join (cost=2.86..11406.38 rows=101 width=4) Hash Cond: (c1.a = c2.a) -> Gather (cost=0.00..7652.13 rows=1000101 width=4) Workers Planned: 3 -> Parallel Seq Scan on c1 (cost=0.00..7652.13 rows=322613 width=4) -> Hash (cost=1.59..1.59 rows=101 width=4) -> Gather (cost=0.00..1.59 rows=101 width=4) Workers Planned: 5 -> Parallel Seq Scan on c2 (cost=0.00..1.59 rows=59 width=4) postgres=# EXPLAIN /*+ Parallel(tl 5 hard) */ SELECT sum(a) FROM tl; QUERY PLAN ----------------------------------------------------------------------------------- Finalize Aggregate (cost=693.02..693.03 rows=1 width=8) -> Gather (cost=693.00..693.01 rows=5 width=8) Workers Planned: 5 -> Partial Aggregate (cost=693.00..693.01 rows=1 width=8) -> Parallel Seq Scan on tl (cost=0.00..643.00 rows=20000 width=4)
'Set' hint changes GUC parameters just while planning. GUC parameter shown in Query Planning can have the expected effects on planning unless any other hint conflicts with the planner method configuration parameters. The last one among hints on the same GUC parameter makes effect. GUC parameters for pg_hint_plan are also settable by this hint but it won't work as your expectation. See Restrictions for details.
postgres=# /*+ Set(random_page_cost 2.0) */
postgres-# SELECT * FROM table1 t1 WHERE key = 'value';
...
GUC parameters below affect the behavior of pg_hint_plan.
Parameter name | description | Default |
---|---|---|
pg_hint_plan.enable_hint | True enables pg_hint_plan. | on |
pg_hint_plan.enable_hint_table | True enables hinting by table. true or false. | off |
pg_hint_plan.parse_messages | Specifies the log level of hint parse error. Valid values are error, warning, notice, info, log, debug | INFO |
pg_hint_plan.debug_print | Controls debug print and verbosity. Valid values are off, on, detailed and verbose. | off |
pg_hint_plan.message_level | Specifies message level of debug print. Valid values are error, warning, notice, info, log, debug | LOG |
pg_hint_plan.hints_anywhere | If it is on, pg_hint_plan reads hints ignoring SQL syntax. This allows to hints to be placed anywhere in a query but be cautious of false reads. | off |
Simply run "make" in the top of the source tree, then "make install" as appropriate user. The PATH environment variable should be set properly for the target PostgreSQL for this process.
$ tar xzvf pg_hint_plan-1.x.x.tar.gz $ cd pg_hint_plan-1.x.x $ make $ su # make install
Basically pg_hint_plan does not requires CREATE EXTENSION. Simply loading it by LOAD command will activate it and of course you can load it globally by setting shared_preload_libraries in postgresql.conf. Or you might be interested in ALTER USER SET/ALTER DATABASE SET for automatic loading for specific sessions.
postgres=# LOAD 'pg_hint_plan'; LOAD postgres=#
Do CREATE EXTENSION and SET pg_hint_plan.enable_hint_tables TO on if you are planning to hint tables.
"make uninstall" in the top directory of source tree will uninstall the installed files if you installed from the source tree and it is left available.
$ cd pg_hint_plan-1.x.x $ su # make uninstall
postgres=# /*+ postgres*# HashJoin(a b) postgres*# SeqScan(a) postgres*# */ postgres-# /*+ IndexScan(a) */ postgres-# EXPLAIN SELECT /*+ MergeJoin(a b) */ * postgres-# FROM pgbench_branches b postgres-# JOIN pgbench_accounts a ON b.bid = a.bid postgres-# ORDER BY a.aid; QUERY PLAN --------------------------------------------------------------------------------------- Sort (cost=31465.84..31715.84 rows=100000 width=197) Sort Key: a.aid -> Hash Join (cost=1.02..4016.02 rows=100000 width=197) Hash Cond: (a.bid = b.bid) -> Seq Scan on pgbench_accounts a (cost=0.00..2640.00 rows=100000 width=97) -> Hash (cost=1.01..1.01 rows=1 width=100) -> Seq Scan on pgbench_branches b (cost=0.00..1.01 rows=1 width=100) (7 rows) postgres=#
postgres=# CREATE FUNCTION hints_func(integer) RETURNS integer AS $$ postgres$# DECLARE postgres$# id integer; postgres$# cnt integer; postgres$# BEGIN postgres$# SELECT /*+ NoIndexScan(a) */ aid postgres$# INTO id FROM pgbench_accounts a WHERE aid = $1; postgres$# SELECT /*+ SeqScan(a) */ count(*) postgres$# INTO cnt FROM pgbench_accounts a; postgres$# RETURN id + cnt; postgres$# END; postgres$# $$ LANGUAGE plpgsql;
postgres=# /*+ HashJoin(t1 t1) */ postgres-# EXPLAIN SELECT * FROM s1.t1 postgres-# JOIN public.t1 ON (s1.t1.id=public.t1.id); INFO: hint syntax error at or near "HashJoin(t1 t1)" DETAIL: Relation name "t1" is ambiguous. ... postgres=# /*+ HashJoin(pt st) */ postgres-# EXPLAIN SELECT * FROM s1.t1 st postgres-# JOIN public.t1 pt ON (st.id=pt.id); QUERY PLAN --------------------------------------------------------------------- Hash Join (cost=64.00..1112.00 rows=28800 width=8) Hash Cond: (st.id = pt.id) -> Seq Scan on t1 st (cost=0.00..34.00 rows=2400 width=4) -> Hash (cost=34.00..34.00 rows=2400 width=4) -> Seq Scan on t1 pt (cost=0.00..34.00 rows=2400 width=4)
postgres=# CREATE VIEW v1 AS SELECT * FROM t2; postgres=# EXPLAIN /*+ HashJoin(t1 v1) */ SELECT * FROM t1 JOIN v1 ON (c1.a = v1.a); QUERY PLAN ------------------------------------------------------------------ Hash Join (cost=3.27..18181.67 rows=101 width=8) Hash Cond: (t1.a = t2.a) -> Seq Scan on t1 (cost=0.00..14427.01 rows=1000101 width=4) -> Hash (cost=2.01..2.01 rows=101 width=4) -> Seq Scan on t2 (cost=0.00..2.01 rows=101 width=4)
postgres=# /*+ MergeJoin(*VALUES*_1 *VALUES*) */ EXPLAIN SELECT * FROM (VALUES (1, 1), (2, 2)) v (a, b) JOIN (VALUES (1, 5), (2, 8), (3, 4)) w (a, c) ON v.a = w.a; INFO: pg_hint_plan: hint syntax error at or near "MergeJoin(*VALUES*_1 *VALUES*) " DETAIL: Relation name "*VALUES*" is ambiguous. QUERY PLAN ------------------------------------------------------------------------- Hash Join (cost=0.05..0.12 rows=2 width=16) Hash Cond: ("*VALUES*_1".column1 = "*VALUES*".column1) -> Values Scan on "*VALUES*_1" (cost=0.00..0.04 rows=3 width=8) -> Hash (cost=0.03..0.03 rows=2 width=8) -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=8)
Subqueries in the following context occasionally can be hinted using the name "ANY_subquery".
IN (SELECT ... {LIMIT | OFFSET ...} ...) = ANY (SELECT ... {LIMIT | OFFSET ...} ...) = SOME (SELECT ... {LIMIT | OFFSET ...} ...)
For these syntaxes, planner internally assigns the name to the subquery when planning joins on tables including it, so join hints are applicable on such joins using the implicit name as the following.
postgres=# /*+HashJoin(a1 ANY_subquery)*/ postgres=# EXPLAIN SELECT * postgres=# FROM pgbench_accounts a1 postgres=# WHERE aid IN (SELECT bid FROM pgbench_accounts a2 LIMIT 10); QUERY PLAN --------------------------------------------------------------------------------------------- Hash Semi Join (cost=0.49..2903.00 rows=1 width=97) Hash Cond: (a1.aid = a2.bid) -> Seq Scan on pgbench_accounts a1 (cost=0.00..2640.00 rows=100000 width=97) -> Hash (cost=0.36..0.36 rows=10 width=4) -> Limit (cost=0.00..0.26 rows=10 width=4) -> Seq Scan on pgbench_accounts a2 (cost=0.00..2640.00 rows=100000 width=4)
pg_hint_plan parameters change the behavior of itself so some parameters doesn't work as expected.
pg_hint_plan stops parsing on any error and uses hints already parsed on the most cases. Followings are the typical errors.