[postgis-users] Performace problems with 250.000.000 points

Brent Wood pcreso at pcreso.com
Wed Jul 25 14:07:17 PDT 2007


--- "Boehm, Andreas" <boehm at rif.fuedo.de> wrote:

> Hi Mark, Patricio, Brent and Paul (and all the others of course)
> 
> thanks for your answers.
> 
> --- Mark wrote: 
> > SELECT * FROM trees WHERE geom && (SELECT geom FROM 
> > land_parcel WHERE lp_id = 123456);
> There's quite no difference. Please refer to the explain analyses at the
> end of this mail.
> 
> --- Patricio wrote:
> > Aplicate case box3d
> Should I realy add bounding boxes for points? Could it be faster testing
> the box than testing the point? Or does it just blow up the trees table?
> 
> --- Brent wrote:
> > speed up by tiling your trees into (for example) 10000 "boxes"
> I have not implemented it yet. But isn't it the way the geometry index
> is for? The index should also reduce the trees to be tested.

As I understand it, the geometry index is essentially a bounding box of the
feature, so has limited advantages on point features compared with line &
polygon ones. But I could be wrong ablout this :-) Tiling the trees allows an
integer index to quickly restrict the tree geom index search to only a few tens
of thousands of trees.

I have got orders of magnitude performance boosts from this approach, both via
tiling & table partitioning. I believe there have been indexing performance
enhancements to very recent versions, which I have not tested. My datasets max
out at a few hundred million records, so I can't guarantee they'll scale up to
yours, but it seems likely.


Cheers,

  Brent

> 
> --- Paul wrote:
> > use
> > shared_buffers = 1024MB
> > effective_cache_size = 1024MB
> > random_page_cost = 3
> It sounds silly, but these parameters even slow it down. With this
> configuration Posgres uses much more memory. Starting your query you can
> see the mem usage raising. But at the end the results aren't faster than
> the original settings.
> I changed the testing a bit. Now I read 100 land_parcels uniformly
> distributed over the database. For each land_parcel I additionally read
> the next 10 land_parcels. These neighbours return very quickly, off
> course.
> 
> Thanks
> Andreas
> 
> ---
> 
> P.S. Here are the explain analyses:
> 
> EXPLAIN ANALYZE 
> SELECT * FROM land_parcel WHERE lp_id = 123456;
> 
> Index Scan using land_parcel_pkey on land_parcel  (cost=0.00..9.25
> rows=1 width=342) (actual time=0.016..0.017 rows=1 loops=1)
>   Index Cond: (lp_id = 123456)
> Total runtime: 0.045 ms
> 
> ---
> 
> EXPLAIN ANALYSE 
> SELECT x(geom), y(geom)
> FROM trees 
> WHERE geom && (SELECT geom FROM land_parcel WHERE lp_id = 123456);
> 
> Bitmap Heap Scan on trees  (cost=108.73..5063.01 rows=1250 width=21)
> (actual time=716.319..717.002 rows=18 loops=1)
>   Filter: (geom && $0)
>   InitPlan
>     ->  Index Scan using land_parcel_pkey on land_parcel
> (cost=0.00..9.25 rows=1 width=317) (actual time=0.096..0.098 rows=1
> loops=1)
>           Index Cond: (lp_id = 123456)
>   ->  Bitmap Index Scan on trees_index  (cost=0.00..99.16 rows=1250
> width=0) (actual time=716.176..716.176 rows=18 loops=1)
>         Index Cond: (geom && $0)
> Total runtime: 717.082 ms
> 
> ---
> 
> EXPLAIN ANALYZE 
> select x(t.geom), y(t.geom) 
> from land_parcel as l, trees as t 
> where (l.lp_id = 123456) 
> and (l.geom && t.geom) and Contains(l.geom, t.geom)
> 
> Nested Loop  (cost=99.47..5072.48 rows=19 width=21) (actual
> time=722.931..723.639 rows=16 loops=1)
>   Join Filter: contains(l.geom, t.geom)
>   ->  Index Scan using land_parcel_pkey on land_parcel l
> (cost=0.00..9.25 rows=1 width=317) (actual time=0.205..0.208 rows=1
> loops=1)
>         Index Cond: (lp_id = 123456)
>   ->  Bitmap Heap Scan on trees t  (cost=99.47..5044.38 rows=1250
> width=21) (actual time=722.541..723.197 rows=18 loops=1)
>         Filter: (l.geom && t.geom)
>         ->  Bitmap Index Scan on trees_index  (cost=0.00..99.16
> rows=1250 width=0) (actual time=722.417..722.417 rows=18 loops=1)
>               Index Cond: (l.geom && t.geom)
> Total runtime: 723.759 ms
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 




More information about the postgis-users mailing list