[postgis-users] OT: Generic postgresql Q about LARGE number of tables

Stephen Woodbridge woodbri at swoodbridge.com
Sun Nov 28 13:51:23 PST 2010


Thank you for all your ideas and suggestions. This has been very helpful 
and also applies to GIS tables because I have a similar problem there 
where I have a large set of shapefiles and need to work on them as a 
layer, like streets or POI, etc but also need to keep them separated as 
shapefiles.

Leo, I have a question below ...

On 11/28/2010 12:13 PM, Paragon Corporation wrote:
>
>> As I noted above, I am not sure if 8000+ tables itself is a problem. My
> sense is that that should actually work fine. It is 8000+ *inherited*
> tables that are a problem. This sense was concurred with on Pg list by
> some of the Pg gurus.
>
>
> I'll both disagree and agree with Puneet here.  The cost of the inherited
> tables approach is really in the planning.   After 500 tables in a hierarchy
> trying to do a query from the top bottom becomes
> unusuable as the cost of query planning becomes really high. Though some of
> this has imporoved in 9.0 and 9.1.
>
> However -- there is really not much of a difference between 8000 tables
> non-inherited verses 8000 inherited if you do a self-join against two child
> tables.  There is no  inheritance planning going on there.
>
> The inheritance thing we generally stratify though given your dataset, you
> may not want to stratisfy of may not stratisfy by date range.
>
> You might actually want to split your data by symbol ranges rather than
> individual symbols
>
> So for example put all the A-Cs in one table
> Ds-Fs in another
>
> And so on.
>
> Then you are not talking about 8000 tables but say 10 manageable inherited
> tables.  Then your check constraints would be by symbol ranges instead of
> individual symbols.
>
> So you still get a fairly speedy self-join
>
> E.g if you want to query MSFT you could do
>
> SELECT M.stock, J.stock
> FROM m_symbols As M INNER JOIN o_symbols As O ON M.ticker_date =
> S.ticker_date
> WHERE M.stock = 'MSFT'  AND O.stock = 'ORCL'
>
>
> (note that m_symbols goes staight to MSFT table, so you save a few cycles by
> going straight to the table instead of having the planner figure out the
> m_symbols table contains Microsoft and O contains Oracle)

This seems to imply either:

1. there are 26 tables one for each letter of the alphabet or
2. that there is some way to alias a table so m_symbols, n_symbols and 
o_symbols all point to the mno_symbols table.

If the tables are split into ABC, DEF, GHI, JKL, MNO, PQR, STU, VWX, YZ, 
then o_symbols and m_symbols would both need to point to the MNO table.

So I'm assuming that in that example you were thinking of one table per 
letter which probably works fine also and might be easier to make 
assumptions on the table name based on the symbol.

It is likely that I will encapsulate most standard queries in stored 
procedures where you pass it the symbols and it generates the SQL for 
the queries so I can hide the complexity of query generation.

Thanks,
   -Steve W

> Leo
> http://www.postgis.us
>
>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users




More information about the postgis-users mailing list