[postgis-users] Query to select dangling line segments

Kevin Neufeld kneufeld.ca at gmail.com
Tue Nov 30 19:32:45 PST 2010


Glad you got it working.

Just a little note of warning ... I noticed you are using "UNION", not 
"UNION ALL".  The difference is that the later will perform a simple 
concatenation of the two results sets.  The former is kind of like a set 
union, returning results that are in the first or second query, removing 
duplicate rows.  In your 8 second query, not a big deal, but in your 3 
second query, you've include a geometry object.  In determining record 
equivalence to filter duplicate rows, it is the = operator that is 
invoked against two geometry objects, which goes back to comparing 
bounding box equivalence, not geometry equivalence.

i.e.
SELECT 'POINT(0 0)'::geometry
UNION
SELECT 'POINT(0 0.0000001)'::geometry;
                   geometry
--------------------------------------------
  010100000000000000000000000000000000000000
(1 row)


SELECT 'POINT(0 0)'::geometry
UNION ALL
SELECT 'POINT(0 0.0000001)'::geometry;
                   geometry
--------------------------------------------
  010100000000000000000000000000000000000000
  0101000000000000000000000048AFBC9AF2D77A3E
(2 rows)

Not a big deal, but I just wanted to make sure you knew what was 
happening with your query.
UNION ALL should also be faster as it's not trying to merge two results 
sets together.

Cheers,
Kevin


On 11/30/2010 2:34 PM, Brian Stempin wrote:
>
> So, I decided to run with the following query:
>
> SELECT osm_id
> FROM "OSMData".osm_mn_data_highway_20101129_101234 t1
> WHERE osm_id NOT IN (
>
>    -- list all ids where the startpoint intersects something.
>    SELECT t1.osm_id
>    FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,
>    "OSMData".osm_mn_data_highway_20101129_101234 t2
>    WHERE t1.osm_id <> t2.osm_id
>    AND ST_Intersects(ST_StartPoint(t1.way), t2.way)
>
> )
> UNION
> SELECT osm_id
> FROM "OSMData".osm_mn_data_highway_20101129_101234 t1
> WHERE osm_id NOT IN (
>
>    -- list all ids where the startpoint intersects something.
>    SELECT t1.osm_id
>    FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,
>    "OSMData".osm_mn_data_highway_20101129_101234 t2
>    WHERE t1.osm_id <> t2.osm_id
>    AND ST_Intersects(ST_EndPoint(t1.way), t2.way)
>
> )
>
> In my dataset, this takes ~ 8 seconds to run.  Being a bit of a 
> performance junkee, I modified it to the following:
>
> SELECT osm_id, ST_StartPoint(way)
> FROM "OSMData".osm_mn_data_highway_20101129_101234 t1
> WHERE osm_id NOT IN (
>
>    -- list all ids where the startpoint intersects something.
>    SELECT t1.osm_id
>    FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,
>    "OSMData".osm_mn_data_highway_20101129_101234 t2
>    WHERE t1.osm_id <> t2.osm_id
>    AND t2.way ~ t1.way
>    AND ST_Intersects(ST_StartPoint(t1.way), t2.way)
>
> )
> UNION
> SELECT osm_id, ST_EndPoint(way)
> FROM "OSMData".osm_mn_data_highway_20101129_101234 t1
> WHERE osm_id NOT IN (
>
>    -- list all ids where the startpoint intersects something.
>    SELECT t1.osm_id
>    FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,
>    "OSMData".osm_mn_data_highway_20101129_101234 t2
>    WHERE t1.osm_id <> t2.osm_id
>    AND t2.way ~ t1.way
>    AND ST_Intersects(ST_EndPoint(t1.way), t2.way)
>
> )
>
> Note the addition of the "t2.way ~ t1.way" bit.  "A ~ B" means "A's 
> bounding box contains B's bounding box."  Since in my case B is a 
> point, I eliminate a lot of comparisons by only looking at shapes 
> who's bounding box B is contained in.  This reduced my query time from 
> 8.x seconds to 3.4x seconds.
>
> Thanks a ton, Kevin!
>
> Brian
> PS -- anyone else spot any other improvements that I can make?  I plan 
> on writing about this in a blog article as part of a larger piece 
> later tonight/tomorrow.
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20101130/f375198e/attachment.html>


More information about the postgis-users mailing list