[postgis-devel] COSTS

Raúl Marín Rodríguez rmrodriguez at carto.com
Mon Mar 11 08:16:03 PDT 2019


> What led you to feel those costs were “too high”.?

In sort, benchmarks. The longer explanation below:

I followed a simple process to detect these cases. I have a benchmark
that runs 70-80 cases (different queries, different nº of geometries,
different geometry types) so I run it with the default config and then
disabling parallelism, and then checked which cases were faster when
parallelism was disabled.

Once I had those cases, I checked the plans and verified that they
were using parallel workers too soon. In both cases the geometries
were single points (simplify is a no-op and intersects a fast pip) so
reducing the cost made sense and removed the parallelism for those
plans. This change didn't have any impact in the benchmarks with lines
and polygons but ideally we'd need to test with an even bigger amount
of cases.

The queries and plans are:

* https://gist.github.com/Algunenano/ff13511295d0548f4ea6f705dc700d36

In this case the geometries are all points, and the amount of point
inside the bbox are 5104. Sequential plan takes 12ms, the parallel one
takes 33ms.

* https://gist.github.com/Algunenano/d8767f178f7bc3826f1fc8ddfd291e29

In this case, having a high cost ST_Intersects makes that part of the
plan parallel and the query takes 48 ms vs 27 without parallelism.
The geometries left after the filter are 4623 points.




On Mon, Mar 11, 2019 at 3:32 PM Paul Ramsey <pramsey at cleverelephant.ca> wrote:
>
> Yeah, I think we’re going to be on the horns of a dilemma… we practically have to trick the planner into doing what we want…
> One thing I remain unable to trick the planner into doing is a spatial join. Takes a cost of a million on ST_Intersects() to get a spatial join, on a pretty large could relations. Other stuff works, but it still takes quite high costs, like to get ST_Area to kick a parallel plan in takes a cost of 100.
>
> What led you to feel those costs were “too high”.?
>
> P
>
> > On Mar 11, 2019, at 3:45 AM, Raúl Marín Rodríguez <rmrodriguez at carto.com> wrote:
> >
> > Hi,
> >
> > Recently I've been doing some tests with CARTO's costs in Postgis and
> > found that some of the costs we had were too high in some cases (bit
> > point datasets):
> > - ST_Simplify: I reduced it from 512 to 64.
> > - ST_Intersects: I reduced it from 512 to 128.
> >
> > They are now setup with cost _COST_HIGH (10000) which seems really
> > excesive, but I'll wait until I can test with PG12 + trunk to test it.
> >
> > On Fri, Mar 8, 2019 at 10:54 PM Paul Ramsey <pramsey at cleverelephant.ca> wrote:
> >>
> >> So, FYI for those working against PG12, I have added in costing, and upped the costs for functions in PG12 substantially. So, let’s see if (a) something breaks around planning and/or (b) parallelism improves with these new costs?
> >>
> >> https://trac.osgeo.org/postgis/ticket/4344
> >>
> >> The knobs for moving costs around globally are in
> >>
> >> https://github.com/postgis/postgis/blob/239f2286989135b9a3156ce7789f45f09fcf0d3b/postgis/sqldefines.h.in#L24-L41
> >>
> >> P.
> >> _______________________________________________
> >> postgis-devel mailing list
> >> postgis-devel at lists.osgeo.org
> >> https://lists.osgeo.org/mailman/listinfo/postgis-devel
> >
> >
> >
> > --
> > Raúl Marín Rodríguez
> > carto.com
> > _______________________________________________
> > postgis-devel mailing list
> > postgis-devel at lists.osgeo.org
> > https://lists.osgeo.org/mailman/listinfo/postgis-devel
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel



--
Raúl Marín Rodríguez
carto.com


More information about the postgis-devel mailing list