[postgis-users] Problem with Nested Loop in a query with postgis

Regina Obe lr at pcorp.us
Thu Jul 20 21:37:48 PDT 2023


Was this upgrading pg10 from 2.5.3 to 3.0.7?

 

I’m more interested in seeing this run in newer PostgreSQL from 12 on.

There was a lot of stuff that changed in PostgreSQL 12 that affects planner behavior.

 

Any chance you can recompare your PG 14  3.1.7 vs. 3.1.8? 

 

The change detailed between 3.1.7 and 3.1.8 did not make any sense to us as nothing in the change log between those two versions would account for the issue you are describing here.

 

Or was that a typo and the issue was between 3.0.7 and 3.1.8?  Planner changes between 3.0.7 and 3.1.8 is more likely.

 

As mentioned, it would really help if when you state your versions, you provide output of

 

SELECT postgis_full_version(), version();

 

Even micro differences between PostgreSQL versions have on rare occasions affected planner behavior too.

 

Thanks,

Regina

 

 

From: postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] On Behalf Of Anderson Mallmann
Sent: Thursday, July 20, 2023 5:13 PM
To: PostGIS Users Discussion <postgis-users at lists.osgeo.org>
Subject: Re: [postgis-users] Problem with Nested Loop in a query with postgis

 

I restored a backup of this database in pg10 2.5.3 and updated postgis to 3.0.7, at first the query used the Merge Semi Join and the performance was fine, so I force to use Nested Loop and perforce was dropped.
So i think our problem was that postgres changed the plan for this query and st_dwithin(geography, geography) in loop didn't perform well, but I still don't understand why postgres is choosing to run this query with Nested Loop.

I did other tests with a subset of the data and with a similar query using nested loop the performance was not good in 2.5 or 3.0.
I uploaded the subset to this repository https://github.com/aamallmann/postgis-region-db if you want to test.

 

Em seg., 17 de jul. de 2023 às 16:40, Regina Obe <lr at pcorp.us <mailto:lr at pcorp.us> > escreveu:

I looked at our list of items between 3.1.7 and 3.1.8 

 

https://git.osgeo.org/gitea/postgis/postgis/src/branch/stable-3.1/NEWS

 

and can’t find any that would cause what you describe below.

It’s possible whatever was changed we didn’t consider strong enough to put in the news.

 

I haven’t checked the git repo yet to see what changed.

 

Usually these switches in plans are often triggered by changes in costs of the functions or just data and it is possible we adjusted some of the costs on these functions.

 

I was thinking the 10 to 14 switch might have been caused by the function instrumentation changes.

But that wouldn’t explain your experiencing the same issue on 14 3.1.7 vs 14 3.1.8

 

If you still have both running, can you give us the output 

 

SELECT postgis_full_version(), version();

 

So we can confirm we are truly comparing only a difference in PostGIS version and not the libraries used or PostgreSQL minor version.

 

Thanks,

Regina

 

 

From: postgis-users [mailto:postgis-users-bounces at lists.osgeo.org <mailto:postgis-users-bounces at lists.osgeo.org> ] On Behalf Of Anderson Mallmann
Sent: Monday, July 17, 2023 10:23 AM
To: postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org> ; lviecelli199 at gmail.com <mailto:lviecelli199 at gmail.com> ; torres.glauco at gmail.com <mailto:torres.glauco at gmail.com> 
Subject: [postgis-users] Problem with Nested Loop in a query with postgis

 

Hi,

Here at the company I work for, we upgraded a postgres database from version 10 to 14 and postgis 2.5.3 to 3.3.3 and in a specific query we noticed a huge performance decrease between these versions. This query was running less than 1s and now it takes more than 50s.
Now we are trying to figure out what the real problem is and in which version this started to happen. Some of our discoveries so far:

1) The query runs fine until postgis 3.1.7
2) In postgis 3.1.8 the query execution changed from Merge Semi Join to Nested Loop Semi Join.
3) Running this query on postgis >= 3.1.8 disabling nested loop with “set enable_nestloop = false” the query runs as before without any problem.

We don’t know why the plan changed in postgis >= 3.1.8 to use Nested Loop, but even in earlier versions if any query like this uses nested loop, the performance drops.

Query:
SELECT pols.pol_id, pols.type, kml, json_build_object('type', 'Feature', 'geometry', st_asgeojson(pols.geom)::jsonb) as geoJson, pols.status pol_status, pols.owner_id pol_owner, pols.details, pols.creation_date as pol_creation_date, pols.country as country, pols.state as state, pols.city as city, pols.label as label FROM polygons pols WHERE pols.owner_id is null AND pols.status = 'ACTIVE' AND pols.type in ('REGION') AND pols.details->>'state' in (SELECT s.state FROM states s WHERE ST_DWITHIN(Geography(ST_Transform(s.geom,4326)), ST_GeographyFromText('POINT(-43.113826 -22.9022)'),'15000')) AND (ST_DWithin(ST_GeographyFromText('POINT(-43.113826 -22.9022)'), Geography(pols.geom),'15000')) AND (ST_MaxDistance(ST_GeomFromText('POINT(-43.113826 -22.9022)', 4326), pols.geom) * 111195) <= '15000'

Here are the explains for this query:

pg 14 - postgis 3.1.7
 Merge Semi Join  (cost=15.83..16.04 rows=1 width=263) (actual time=529.041..586.241 rows=416 loops=1)
   Output: pols.pol_id, pols.type, pols.kml, json_build_object('type', 'Feature', 'geometry', (st_asgeojson(pols.geom, 9, 8))::jsonb), pols.status, pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state, pols.city, pols.label
   Merge Cond: (((pols.details ->> 'state'::text)) = (s.state)::text)
   Buffers: shared hit=3193
   ->  Sort  (cost=4.49..4.50 rows=1 width=4656) (actual time=244.333..244.367 rows=416 loops=1)
         Output: pols.pol_id, pols.type, pols.kml, pols.geom, pols.status, pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state, pols.city, pols.label, ((pols.details ->> 'state'::text))
         Sort Key: ((pols.details ->> 'state'::text))
         Sort Method: quicksort  Memory: 1314kB
         Buffers: shared hit=2131
         ->  Index Scan using polygons_geography_state_idx on region.polygons pols  (cost=0.28..4.48 rows=1 width=4656) (actual time=10.207..243.079 rows=416 loops=1)
               Output: pols.pol_id, pols.type, pols.kml, pols.geom, pols.status, pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state, pols.city, pols.label, (pols.details ->> 'state'::text)
               Index Cond: (geography(pols.geom) && '0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography)
               Filter: (('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography && _st_expand(geography(pols.geom), '15000'::double precision)) AND _st_dwithin('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, geography(pols.geom), '15000'::double precision, true) AND ((_st_maxdistance('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geometry, st_convexhull(pols.geom)) * '111195'::double precision) <= '15000'::double precision))
               Rows Removed by Filter: 395
               Buffers: shared hit=2125
   ->  Sort  (cost=11.34..11.36 rows=9 width=3) (actual time=284.570..284.570 rows=1 loops=1)
         Output: s.state
         Sort Key: s.state
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=1062
         ->  Seq Scan on region.states s  (cost=0.00..11.19 rows=9 width=3) (actual time=180.759..284.564 rows=1 loops=1)
               Output: s.state
               Filter: st_dwithin(geography(st_transform(s.geom, 4326)), '0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, '15000'::double precision, true)
               Rows Removed by Filter: 26
               Buffers: shared hit=1062
 Query Identifier: -6178717669455780324
 Planning:
   Buffers: shared hit=481
 Planning Time: 31.139 ms
 Execution Time: 586.869 ms

-----------

pg 14 - postgis 3.1.8
 Nested Loop Semi Join  (cost=0.67..111.50 rows=1 width=263) (actual time=179.622..59918.477 rows=416 loops=1)
   Output: pols.pol_id, pols.type, pols.kml, json_build_object('type', 'Feature', 'geometry', (st_asgeojson(pols.geom, 9, 8))::jsonb), pols.status, pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state, pols.city, pols.label
   Join Filter: ((pols.details ->> 'state'::text) = (s.state)::text)
   Buffers: shared hit=72116 read=1075
   ->  Index Scan using polygons_geography_state_idx on region.polygons pols  (cost=0.41..55.77 rows=1 width=4656) (actual time=9.477..221.758 rows=416 loops=1)
         Output: pols.pol_id, pols.type, pols.kml, pols.geom, pols.status, pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state, pols.city, pols.label, pols.last_updated
         Index Cond: (geography(pols.geom) && _st_expand('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, '15000'::double precision))
         Filter: (st_dwithin('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, geography(pols.geom), '15000'::double precision, true) AND ((_st_maxdistance('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geometry, st_convexhull(pols.geom)) * '111195'::double precision) <= '15000'::double precision))
         Rows Removed by Filter: 395
         Buffers: shared hit=269 read=692
   ->  Index Scan using states_geography_idx on region.states s  (cost=0.26..54.37 rows=1 width=3) (actual time=142.676..142.676 rows=1 loops=416)
         Output: s.state_code, s.state, s.country, s.geom
         Index Cond: (geography(st_transform(s.geom, 4326)) && _st_expand('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, '15000'::double precision))
         Filter: st_dwithin(geography(st_transform(s.geom, 4326)), '0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, '15000'::double precision, true)
         Rows Removed by Filter: 1
         Buffers: shared hit=70988 read=171
 Query Identifier: -6178717669455780324
 Planning:
   Buffers: shared hit=401 read=45
 Planning Time: 30.729 ms
 Execution Time: 59919.343 ms

Regards,
Anderson M.

_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org> 
https://lists.osgeo.org/mailman/listinfo/postgis-users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20230721/0e673cf1/attachment.htm>


More information about the postgis-users mailing list