[postgis-users]
Performance tuning PostGIS and understanding the explain statement
Benjamin Wragg
bwragg@tpg.com.au
Fri Jun 13 00:25:56 2003
This is a multi-part message in MIME format.
---------------------- multipart/alternative attachment
Hi list,
Before I start, please excuse what might be a trivial question. It's
regarding the explain statement. I've never really understood how to
read it correctly. I know that it shows the query execution plan, but
how am I meant to understand the output? For example could someone give
me some pointers as to what the execution plan is on following queries?
EXPLAIN
explain SELECT feature.id,feature.name
FROM feature,region
WHERE region.id=28
AND (feature.the_geom && region.the_geom AND distance(feature.the_geom,
region.the_geom)=0);
NOTICE: QUERY PLAN:
Nested Loop (cost=0.00..9.04 rows=44 width=80)
-> Index Scan using region_pkey on region (cost=0.00..3.01 rows=1
width=32)
-> Index Scan using feature_the_geom_idx on feature (cost=0.00..6.01
rows=1 width=48)
EXPLAIN ANALYZE
explain analyze SELECT feature.id,feature.name
FROM feature,region
WHERE region.id=28
AND (feature.the_geom && region.the_geom AND distance(feature.the_geom,
region.the_geom)=0);
NOTICE: QUERY PLAN:
Nested Loop (cost=0.00..9.04 rows=44 width=80) (actual
time=114.00..30210.00 rows=4992 loops=1)
-> Index Scan using region_pkey on region (cost=0.00..3.01 rows=1
width=32) (actual time=0.00..0.00 rows=1 loops=1)
-> Index Scan using feature_the_geom_idx on feature (cost=0.00..6.01
rows=1 width=48) (actual time=105.00..21179.00 rows=5133 loops=1)
Total runtime: 30238.00 msec
Is this saying that it uses a loop and each loop it gets the region and
compares it to a feature? Is the time killing in the comparison or the
features?
If I change the query slightly by removing some brackets and adding an
order by I get the following results:
EXPLAIN
explain SELECT feature.id,feature.name
FROM feature,region
WHERE feature.the_geom && region.the_geom AND distance(feature.the_geom,
region.the_geom)=0
AND region.id=28
ORDER BY feature.name;
NOTICE: QUERY PLAN:
Sort (cost=10.25..10.25 rows=44 width=80)
-> Nested Loop (cost=0.00..9.04 rows=44 width=80)
-> Index Scan using region_pkey on region (cost=0.00..3.01
rows=1 width=32)
-> Index Scan using feature_the_geom_idx on feature
(cost=0.00..6.01 rows=1 width=48)
EXPLAIN ANALYZE
explain analyze SELECT feature.id,feature.name
FROM feature,region
WHERE feature.the_geom && region.the_geom AND distance(feature.the_geom,
region.the_geom)=0
AND region.id=28
ORDER BY feature.name;
NOTICE: QUERY PLAN:
Sort (cost=10.25..10.25 rows=44 width=80) (actual
time=30720.00..30721.00 rows=4992 loops=1)
-> Nested Loop (cost=0.00..9.04 rows=44 width=80) (actual
time=113.00..30678.00 rows=4992 loops=1)
-> Index Scan using region_pkey on region (cost=0.00..3.01
rows=1 width=32) (actual time=0.00..0.00 rows=1 loops=1)
-> Index Scan using feature_the_geom_idx on feature
(cost=0.00..6.01 rows=1 width=48) (actual time=105.00..21584.00
rows=5133 loops=1)
Total runtime: 30725.00 msec
What is the order of things here? Is it first of all sorting all the
features by there name then looping through each feature comparing it to
the region?
Thanks,
Benjamin
---------------------- multipart/alternative attachment
An HTML attachment was scrubbed...
URL: http://offsite.refractions.net/pipermail/postgis-users/attachments/20030613/b5078b1a/attachment.htm
---------------------- multipart/alternative attachment--