[postgis-users] Insert performance issues

Jonathan W. Lowe jlowe at giswebsite.com
Thu Oct 23 08:32:13 PDT 2003


Chris,

Appreciated your clear summary of index utilization.  A lingering
question is how the picture changes when Mapserver is involved.  For
example, if an assessor's parcel PostGIS layer in a Mapserver map file
contains 100,000 parcels, and the current zoom level only includes 50
parcels, will Mapserver automatically apply a filter for bounding box
overlap (the && operator) to the original SQL statement listed in the
map file?

If so, where will it put the statement, and how will it construct the
statement?  For instance, what would happen to this SQL statement (in
Mapserver DATA parameter format) when the view was zoomed in close, as
described above?  The query links home listings for the past two weeks
to their parcel geometry using a common key (listing.uid and
parcel.apn).

  DATA "the_geom from (select Centroid(p.the_geom) as the_geom, p.oid as
oid from parcel p, listing l where l.list_date <= ((current_date -
interval '1 days') - interval '15 days') and l.uid = p.apn) as
listed_homes using unique oid using srid=26943"

What I'm wondering is if Mapserver will force use of an overlap (&&)
that uses the index on parcel.the_geom, but prevents use of another
index on parcel.apn.  Or, does the SQL within the parenthesis run
independently of the surrounding "the_geom from (...) as listed_homes
using unique oid using srid=26943" ?

I ask in order to simulate the SQL for zoomed in Mapserver views for
EXPLAIN tests on the PostgreSQL command line.

Thanks,
Jonathan

On Mon, 2003-10-20 at 11:11, chodgson at refractions.net wrote:
> Only one index can be used for each table that is involved in a query. I 
> believe this is true of all databases, except perhaps in unusual circumstances. 
> Obviously, a different index may be used for the same table, for a different 
> query. 
> 
> If you have a query that makes use of two fields to identify records 
> from a table, then it may be advantageous to build a single index on both 
> fields. For example, if you are running this query:
> 
> select foo from the_table where owner_id = 1 and type_id = 5;
> 
> Then you will want an index on at least one of owner_id OR type_id, OR both. If 
> you have two separate indexes on (owner_id) and (type_id) then the database will
> use the one that it thinks will reduce the result set more dramatically; if 
> there are 1000 different owner_ids and only 5 different type_ids, it will use 
> the index on owner_id, and vice versa. So, if you already know that this is the 
> case, only build the one index (assuming you don't need an index on the other 
> field for another query) However, if you know that the "cardinality" (the 
> number of unique entries) of both owner_id and type_id are relatively large, 
> then one index on _both_ columns is appropriate. ie.
> 
> create index the_table_owner_type_idx on the_table( owner_id, type_id );
> 
> The order of the two columns in the index definition DOES matter. This is 
> especially true if you are doing greater than/less than comparisons on your 
> indexed columns, or if you want to make use of the same index in other queries. 
> For example, the query:
> 
> Select foo from the_table where owner_id = 1;
> 
> can make use of the (owner_id, type_id) index as if it were just an (owner_id) 
> index. However, it can't use a (type_id, owner_id) index at all.
> 
> There always seems to be a lot of confusion surrounding the use of indexes. I 
> highly recommend looking into the "explain" syntax, and how to read its output. 
> Everything you need to know about how your indexes are (or are not) being used 
> can be found using explain.
> 
> Chris
> 
> 
> Quoting "Obe, Regina     DND\\MIS" <robe.dnd at ci.boston.ma.us>:
> 
> > That seems like a fair number.  I would try getting rid of them except for
> > gist and slowly add them back based on query performance.  It will really
> > slow down your insert performance having so many indexed.  Use Explain for
> > that.
> > 
> > I'm not quite sure how Postgresql handles queries - but from experience with
> > other databases -  a system can use only one or 2 at any given time.
> > 
> > You should do an analysis of the most common queries you will be using -
> > particularly the WHERE CLAUSE.
> > 
> > If you find you need all of them, I would consider dropping them before
> > insert and adding them back after the insert is completed.
> > 
> > -----Original Message-----
> > From: Nicolas RIBOT [mailto:nri at scot.cnes.fr]
> > Sent: Monday, October 20, 2003 11:44 AM
> > To: PostGIS Users Discussion
> > Subject: RE: [postgis-users] Insert performance issues
> > 
> > 
> > > You said you have all the fields indexed involved in the
> > > query.  Exactly how
> > > many fields do you have indexed?
> > >
> > 
> > 7. The table Structure is the following:
> > 
> > create table clcv1 (
> >     gid         serial primary key      ,
> >     AREA        float8                  ,
> >     PERIMETER   float8                  ,
> >     CLCV1_      int4                    ,
> >     CLCV1_ID    int4                    ,
> >     ID_CLC_INI  int4                    ,
> >     CLC_V0      int4                    ,
> >     CLC_V1      int4                    ,
> >     MARQUEUR    int4                    ,
> >     BLOC        varchar(4)              ,
> >     EDIT_ST     varchar(5)              ,
> >     MODIF_ST    varchar (6)             ,
> >     ID_BLOC     int4                    ,
> >     ISMODIF     int4          default 0
> > );
> > CREATE INDEX clcv1_spi   ON clcv1    USING GIST (the_geom
> > GIST_GEOMETRY_OPS);
> > CREATE index clcv1_clcv1__idx on clcv1 (clcv1_);
> > CREATE index clcv1_clc_v1_idx on clcv1 (clc_v1);
> > CREATE index clcv1_marqueur_idx on clcv1 (marqueur);
> > CREATE index clcv1_bloc_idx on clcv1 (bloc);
> > CREATE index clcv1_edit_st_idx on clcv1 (edit_st);
> > CREATE index clcv1_modif_st_idx on clcv1 (modif_st);
> > 
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-users
> > 
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-users
> > 
> 
> 
> 
> 
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 





More information about the postgis-users mailing list