[postgis-devel] Issues about the GSoC project

Mon Jul 26 14:01:41 PDT 2021

On Mon, Jul 26, 2021 at 10:44:42PM +0200, Raúl Marín wrote:
> Hi,
> 
> As Paul is mentioning, it appears that you are assuming that having lots of
> shared buffer hits is good but it isn't. A shared buffer hit meant that you
> needed to read a page and it happened to be cached in memory; but indexes
> are useful because they avoid the need of reading unnecessary pages (in
> memory or in disk), so an ideal sort would be one that put the data you are
> looking in the same pages (so all the rows in the page are useful) and the
> ideal index would be one that knew which pages as fast as possible (also
> reading as few index pages as possible).

I'm not following along closely, but I suggest to look at whether the index is
clustered or not.

SELECT correlation FROM pg_stats WHERE attname=.. AND tablename=..

If correlation is low, an index scan may touch many pages of the table, even if
it returns only a fraction of its tuples.  In addition to reading the heap more
randomly than sequentually.

-- 
Justin