[postgis-commits] svn - r2639 - in trunk/extras/tiger_geocoder: . orig tables
postgis-commits at postgis.refractions.net
Tue Jul 3 14:05:05 PDT 2007
Author: snowman
Date: 2007-07-03 14:05:03 -0700 (Tue, 03 Jul 2007)
New Revision: 2639
Added:
trunk/extras/tiger_geocoder/INSTALL
trunk/extras/tiger_geocoder/README
trunk/extras/tiger_geocoder/orig/
trunk/extras/tiger_geocoder/orig/tiger_geocoder.sql
trunk/extras/tiger_geocoder/tables/
trunk/extras/tiger_geocoder/tables/lookup_tables.sql
Removed:
trunk/extras/tiger_geocoder/tiger_geocoder.sql
Log:
- Minor reorg, add in other parts of the initial load
Added: trunk/extras/tiger_geocoder/INSTALL
===================================================================
--- trunk/extras/tiger_geocoder/INSTALL 2007-07-03 20:51:31 UTC (rev 2638)
+++ trunk/extras/tiger_geocoder/INSTALL 2007-07-03 21:05:03 UTC (rev 2639)
@@ -0,0 +1,34 @@
+TIGER Geocoder
+
+ 2004/10/28
+
+ A plpgsql based geocoder written for TIGER census data.
+
+Installation instructions:
+
+- If the database being used is new, ensure the following scripts have
+ been loaded:
+
+ /opt/pgsql74/share/postgis.sql
+ /opt/pgsql74/share/contrib/spatial_ref_sys.sql
+
+- Load the fuzzy string matching functions. These must first be compiled,
+ and may be found in the contrib directory of the postgres source directory.
+
+ psql [-p (port)] (database) < /usr/local/src/postgresql-7.4.5/contrib/fuzzystrmatch/fuzzystrmatch.sql
+
+- Ensure that the TIGER data is loaded into the target database.
+
+- Load the lookup tables. This creates the lookup tables and loads predefined
+ data into them. If the standardized TIGER data is not being used, this
+ script will need to be altered to reflect the actual data. Since the place
+ and countysub lookup tables are generated to reflect the data in use, the
+ database must be populated before this script is run. Indexes are also
+ created in this script.
+
+ psql [-p (port)] (database) < lookup_tables.sql
+
+- Load the function scripts. This script loads the geocode function, and all
+ support modules and functions required by it.
+
+ psql [-p (port)] (database) < tiger_geocoder.sql
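
Once the steps in the INSTALL file above have been run, a short psql session along these lines may help confirm that each piece loaded. This is only a sketch: postgis_version() and soundex() are provided by PostGIS and fuzzystrmatch respectively, the table names come from lookup_tables.sql, and the geocode() call assumes the signature given in the README (refcursor geocode(refcursor, 'address string')). The cursor name 'result' and the sample address are arbitrary choices for illustration.

```sql
-- PostGIS functions are present
SELECT postgis_version();

-- fuzzystrmatch functions are present
SELECT soundex('Highland');

-- lookup tables exist and were populated
SELECT count(*) FROM direction_lookup;
SELECT count(*) FROM street_type_lookup;

-- geocode() opens the named cursor; it must be fetched in the same transaction
BEGIN;
SELECT geocode('result', '2554 E Highland Dr Seattle WA');
FETCH ALL IN result;
COMMIT;
```

If any statement fails with an "does not exist" error, the corresponding script was probably not loaded into this database.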
Added: trunk/extras/tiger_geocoder/README
===================================================================
--- trunk/extras/tiger_geocoder/README 2007-07-03 20:51:31 UTC (rev 2638)
+++ trunk/extras/tiger_geocoder/README 2007-07-03 21:05:03 UTC (rev 2639)
@@ -0,0 +1,349 @@
+TIGER Geocoder
+
+ 2004/10/28
+
+ A plpgsql based geocoder written for TIGER census data.
+
+Design:
+
+There are two components to the geocoder, the address normalizer and the
+address geocoder. These two components are described separately below.
+
+The goal of this project is to build a fully functional geocoder that can
+process an arbitrary address string and, using normalized TIGER census data,
+produce a point geometry reflecting the location of the given address.
+
+- The geocoder should be simple for anyone familiar with PostGIS to install
+ and use.
+- It should be robust enough to function properly despite formatting and
+ spelling errors.
+- It should be extensible enough to be used with future data updates, or
+ alternate data sources with a minimum of coding changes.
+
+Installation:
+
+ Refer to the INSTALL file for installation instructions.
+
+Usage:
+
+ refcursor geocode(refcursor, 'address string');
+
+Notes:
+
+- The assumed format for the address is the US Postal Service standard:
+ () indicates a field required by the geocoder; [] indicates an optional field.
+
+ (address) [dirPrefix] (streetName) [streetType] [dirSuffix]
+ [internalAddress] [location] [state] [zipCode]
+
+
+
+Address Normalizer:
+
+The goal of the address normalizer is to provide a robust function to break a
+given address string down into the components of an address. While the
+normalizer is built specifically for the normalized US TIGER Census data, it
+has been designed to be reasonably extensible to other data sets and localities.
+
+Usage:
+
+ normalize_address('address string');
+
+Support functions:
+
+ location_extract_countysub_exact('partial address string', 'state abbreviation')
+ location_extract_countysub_fuzzy('partial address string', 'state abbreviation')
+ location_extract_place_exact('partial address string', 'state abbreviation')
+ location_extract_place_fuzzy('partial address string', 'state abbreviation')
+ cull_null('string')
+ count_words('string')
+ get_last_words('string')
+ state_extract('partial address string')
+ levenshtein_ignore_case('string', 'string')
+
+Notes:
+
+- A set of lookup tables, listed below, is used to provide street type,
+ secondary unit and direction abbreviation standards for a given set
+ of data. These are provided with the geocoder, but will need to be
+ customized for the data used.
+
+ direction_lookup
+ secondary_unit_lookup
+ street_type_lookup
+
+- Additional lookup tables are required to perform matching for state
+ and location extraction. The state lookup is derived from the
+ US Postal Service standards, while the place and county subdivision
+ lookups are generated from the dataset. The place and countysub tables
+ are created by the lookup_tables.sql script described in the INSTALL file.
+
+ state_lookup
+ place_lookup
+ countysub_lookup
+
+- The use of lookup tables is intended to provide a versatile way of applying
+ the normalizer to data sets and localities other than the US Census TIGER
+ data. However, because poorly formatted or incomplete address strings
+ require matching-based extraction, assumptions are made about the data
+ available, most notably the division of location into place and county
+ subdivision. For data sets without exactly two logical divisions in location
+ precision, code changes will be required.
+
+- The normalizer will perform better the more information is provided.
+
+- The process for normalization is roughly as follows:
+
+    Extract the address from the beginning.
+    Extract the zipCode from the end.
+    Extract the state, using a fuzzy search if exact matching fails.
+    Attempt to extract the location by parsing the punctuation
+      of the address.
+    Find and remove any internal address.
+    If internal address was found:
+      Set location as everything between internal address and state.
+    Extract the street type from the string.
+    If multiple potential street types are found:
+      If internal address was found:
+        Extract the last street type that precedes the internal address.
+      Else:
+        Extract the last street type.
+    If street type was found:
+      If a word beginning with a number follows the street type:
+        This indicates the street type is part of the street name,
+          eg. 'State Hwy 92a'.
+        Set street type to NULL.
+      Else if location not yet found:
+        Set location as everything between street type and state.
+      Extract direction prefix from start of street name.
+      If internal address was found:
+        Extract direction suffix from end of street name.
+      Else:
+        Extract direction suffix from start of location.
+      Set street name as everything that is not the address, direction
+        prefix or suffix, internal address, location, state or
+        zip code.
+    Else:
+      If internal address was found:
+        Extract direction prefix from beginning of string.
+        Extract direction suffix before internal address.
+        Set street name as everything that is not the address, direction
+          prefix or suffix, internal address, location, state or
+          zip code.
+      Else:
+        Extract direction suffix.
+        If direction suffix is found:
+          Set location as everything between direction suffix and state,
+            zip or end of string as appropriate.
+          Extract direction prefix from beginning of string.
+          Set street name as everything that is not the address, direction
+            prefix or suffix, internal address, location, state or
+            zip code.
+        Else:
+          Attempt to determine the location via exact comparison against
+            the places lookup.
+          Attempt to determine the location via exact comparison against
+            the countysub lookup.
+          Attempt to determine the location via fuzzy comparison against
+            the places lookup.
+          Attempt to determine the location via fuzzy comparison against
+            the countysub lookup.
+          Extract direction prefix.
+          Set street name as everything that is not the address, direction
+            prefix or suffix, internal address, location, state or
+            zip code.
+
+
+
+Address Geocoder:
+
+The goal of the address geocoder is to provide a robust means of searching
+the database for a match to whatever data the user provides. To accomplish
+this, the geocoder uses a series of checks and fallthrough cases. Starting with
+the most specific combination of parameters, the algorithm works outwards
+towards the most vague combination, until valid results are found. As a
+result, the more accurate the information provided, the faster the
+algorithm will return.
+
+Usage:
+
+ geocode(cursor, 'address string');
+
+Support functions:
+
+ geocode_address(cursor, address, 'dirPrefix', 'streetName', 'streetType',
+ 'dirSuffix', 'location', 'state', zipCode)
+ geocode_address_zip(cursor, address, 'dirPrefix', 'streetName',
+ 'streetType', 'dirSuffix', zipCode)
+ geocode_address_countysub_exact(cursor, address, 'dirPrefix', 'streetName',
+ 'streetType', 'dirSuffix', 'location', 'state')
+ geocode_address_countysub_fuzzy(cursor, address, 'dirPrefix', 'streetName',
+ 'streetType', 'dirSuffix', 'location', 'state')
+ geocode_address_place_exact(cursor, address, 'dirPrefix', 'streetName',
+ 'streetType', 'dirSuffix', 'location', 'state')
+ geocode_address_place_fuzzy(cursor, address, 'dirPrefix', 'streetName',
+ 'streetType', 'dirSuffix', 'location', 'state')
+ rate_attributes('dirPrefixA', 'dirPrefixB', 'streetNameA', 'streetNameB',
+ 'streetTypeA', 'streetTypeB', 'dirSuffixA', 'dirSuffixB')
+ rate_attributes('dirPrefixA', 'dirPrefixB', 'streetNameA', 'streetNameB',
+ 'streetTypeA', 'streetTypeB', 'dirSuffixA', 'dirSuffixB',
+ 'locationA', 'locationB')
+ location_extract_countysub_exact('partial address string', 'state abbreviation')
+ location_extract_countysub_fuzzy('partial address string', 'state abbreviation')
+ location_extract_place_exact('partial address string', 'state abbreviation')
+ location_extract_place_fuzzy('partial address string', 'state abbreviation')
+ cull_null('string')
+ count_words('string')
+ get_last_words('string')
+ state_extract('partial address string')
+ levenshtein_ignore_case('string', 'string')
+ interpolate_from_address(given address, from address L, to address L,
+ from address R, to address R, street segment)
+ interpolate_from_address(given address, 'from address L', 'to address L',
+ 'from address R', 'to address R', street segment)
+ includes_address(given address, from address L, to address L,
+ from address R, to address R)
+ includes_address(given address, 'from address L', 'to address L',
+ 'from address R', 'to address R')
+
+Notes:
+
+- The geocoder is quite dependent on the address normalizer. The direction
+ prefix and suffix, streetType and state are all expected to be standard
+ abbreviations that will match exactly to the database.
+
+- Either a zip code or a location must be provided. If neither is given, no
+ exception is thrown, but the result is null. If the zip code or location,
+ combined with the other information provided, cannot be matched against
+ the database, the result is also null.
+
+- The process is as follows:
+
+    If a zipCode is provided:
+      Check if the zipCode, streetName and optionally state match any roads.
+      If they do:
+        Check if the given address fits any of the roads.
+        If it does:
+          Return the matching road segment information, rating and
+            interpolated geographic point.
+    If location exactly matches a place:
+      Check if the place, streetName and optionally state match any roads.
+      If they do:
+        Check if the given address fits any of the roads.
+        If it does:
+          Return the matching road segment information, rating and
+            interpolated geographic point.
+    If location exactly matches a countySubdivision:
+      Check if the countySubdivision, streetName and optionally state
+        match any roads.
+      If they do:
+        Check if the given address fits any of the roads.
+        If it does:
+          Return the matching road segment information, rating and
+            interpolated geographic point.
+    If location approximately matches a place:
+      Check if the place, streetName and optionally state match any roads.
+      If they do:
+        Check if the given address fits any of the roads.
+        If it does:
+          Return the matching road segment information, rating and
+            interpolated geographic point.
+    If location approximately matches a countySubdivision:
+      Check if the countySubdivision, streetName and optionally state
+        match any roads.
+      If they do:
+        Check if the given address fits any of the roads.
+        If it does:
+          Return the matching road segment information, rating and
+            interpolated geographic point.
+
+
+Current Issues / Known Failures:
+
+- If a location starts with a direction, eg. East Seattle, and no suffix
+ direction is given, the direction from the location will be interpreted
+ as the street's suffix direction.
+
+ '18196 68th Ave East Seattle Washington'
+ address = 18196
+ dirPrefix = NULL
+ streetName = '68th'
+ streetType = 'Ave'
+ dirSuffix = 'E'
+ location = 'Seattle'
+ state = 'WA'
+ zip = NULL
+
+- The last possible street type in the string is interpreted as the street type
+ to allow street names to contain type words. As a result, any location
+ containing a street type will have the type interpreted as the street type.
+
+ '29645 7th Street SW Federal Way 98023'
+ address = 29645
+ dirPrefix = NULL
+ streetName = 7th Street SW Federal
+ streetType = Way
+ dirSuffix = NULL
+ location = NULL
+ state = NULL
+ zip = 98023
+
+- While some state misspellings will be picked up by the fuzzy searches,
+ misspelled or non-standard abbreviations may not be picked up, due to
+ the length (soundex uses an initial character plus three codeable
+ characters).
+
+ '2554 E Highland Dr Seatel Wash'
+ address = 2554
+ dirPrefix = 'E'
+ streetName = 'Highland'
+ streetType = 'Dr'
+ dirSuffix = NULL
+ location = 'Seatel Wash'
+ state = NULL
+ zip = NULL
+
+- If neither a location nor a zip code is found by the normalizer, no search
+ is performed.
+
+- If the address string contains no street type, direction suffix or
+ location, the street name is generally misclassified as the
+ location.
+
+ '98 E Main Washington 98012'
+ address = 98
+ dirPrefix = 'E'
+ streetName = NULL
+ streetType = NULL
+ dirSuffix = NULL
+ location = 'Main'
+ state = 'WA'
+ zip = 98012
+
+- If no street type is given and the street name contains a type word, then the
+ type in the street name is interpreted as the street type.
+
+ '1348 SW Orchard Seattle wa 98106'
+ 1348::SW:Orch::Seattle:WA:98106
+ address = 1348
+ dirPrefix = NULL
+ streetName = SW
+ streetType = Orch
+ dirSuffix = NULL
+ location = Seattle
+ state = WA
+ zip = 98106
+
+- Misspellings of words are only handled so far as their soundex values match.
+
+ 'Hiland' will not be matched with 'Highland'
+ soundex('Hiland') = 'H453'
+ soundex('Highland') = 'H245'
+
+- Missing words in location or street name are not handled.
+
+ 'Redmond Fall' will not be matched with 'Redmond Fall City'
+
+- Unacceptable failure cases:
+ The street name is parsed out as 'West Central Park'
+ '500 South West Central Park Ave Chicago Illinois 60624'
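
The soundex limitation listed among the known failures in the README above can be reproduced directly in psql. The sketch below assumes only that the fuzzystrmatch contrib functions named in the INSTALL file are loaded; the soundex values in the comments are the ones quoted in the README, and levenshtein() is shown purely for comparison, not as a description of how the geocoder ranks candidates.

```sql
-- The two soundex codes differ, so a soundex-only comparison misses the match,
-- even though the edit distance between the two spellings is small.
SELECT soundex('Hiland')                 AS misspelled_code,  -- 'H453' per the README
       soundex('Highland')               AS intended_code,    -- 'H245' per the README
       soundex('Hiland') = soundex('Highland') AS soundex_match,
       levenshtein('Hiland', 'Highland') AS edit_distance;
```

The codes diverge because the 'g' in 'Highland' is codeable and takes up one of the three positions after the initial letter.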
Copied: trunk/extras/tiger_geocoder/orig/tiger_geocoder.sql (from rev 2638, trunk/extras/tiger_geocoder/tiger_geocoder.sql)
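
The lookup tables created by tables/lookup_tables.sql in the diff that follows map upper-case spellings to a standard abbreviation. As a rough illustration of how such a table can be consulted (the exact queries used inside the normalizer are not shown in this message, so this is an assumption about usage rather than a copy of it), a candidate word can be upper-cased and matched against the name column:

```sql
-- Standard abbreviation for a street type word ('AVENUE' maps to 'Ave')
SELECT abbrev
  FROM street_type_lookup
 WHERE name = upper('Avenue');

-- Standard abbreviation for a direction phrase ('SOUTH WEST' maps to 'SW')
SELECT abbrev
  FROM direction_lookup
 WHERE name = upper('south west');
```

Because every accepted spelling is stored as its own row, adapting the normalizer to another data set is mostly a matter of editing these rows rather than the functions.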
Added: trunk/extras/tiger_geocoder/tables/lookup_tables.sql
===================================================================
--- trunk/extras/tiger_geocoder/tables/lookup_tables.sql 2007-07-03 20:51:31 UTC (rev 2638)
+++ trunk/extras/tiger_geocoder/tables/lookup_tables.sql 2007-07-03 21:05:03 UTC (rev 2639)
@@ -0,0 +1,730 @@
+-- Create direction lookup table
+BEGIN;
+CREATE TABLE direction_lookup (name VARCHAR(20), abbrev VARCHAR(3));
+INSERT INTO direction_lookup (name, abbrev) VALUES('WEST', 'W');
+INSERT INTO direction_lookup (name, abbrev) VALUES('W', 'W');
+INSERT INTO direction_lookup (name, abbrev) VALUES('SW', 'SW');
+INSERT INTO direction_lookup (name, abbrev) VALUES('SOUTH-WEST', 'SW');
+INSERT INTO direction_lookup (name, abbrev) VALUES('SOUTHWEST', 'SW');
+INSERT INTO direction_lookup (name, abbrev) VALUES('SOUTH-EAST', 'SE');
+INSERT INTO direction_lookup (name, abbrev) VALUES('SOUTHEAST', 'SE');
+INSERT INTO direction_lookup (name, abbrev) VALUES('SOUTH_WEST', 'SW');
+INSERT INTO direction_lookup (name, abbrev) VALUES('SOUTH_EAST', 'SE');
+INSERT INTO direction_lookup (name, abbrev) VALUES('SOUTH', 'S');
+INSERT INTO direction_lookup (name, abbrev) VALUES('SOUTH WEST', 'SW');
+INSERT INTO direction_lookup (name, abbrev) VALUES('SOUTH EAST', 'SE');
+INSERT INTO direction_lookup (name, abbrev) VALUES('SE', 'SE');
+INSERT INTO direction_lookup (name, abbrev) VALUES('S', 'S');
+INSERT INTO direction_lookup (name, abbrev) VALUES('NW', 'NW');
+INSERT INTO direction_lookup (name, abbrev) VALUES('NORTH-WEST', 'NW');
+INSERT INTO direction_lookup (name, abbrev) VALUES('NORTHWEST', 'NW');
+INSERT INTO direction_lookup (name, abbrev) VALUES('NORTH-EAST', 'NE');
+INSERT INTO direction_lookup (name, abbrev) VALUES('NORTHEAST', 'NE');
+INSERT INTO direction_lookup (name, abbrev) VALUES('NORTH_WEST', 'NW');
+INSERT INTO direction_lookup (name, abbrev) VALUES('NORTH_EAST', 'NE');
+INSERT INTO direction_lookup (name, abbrev) VALUES('NORTH', 'N');
+INSERT INTO direction_lookup (name, abbrev) VALUES('NORTH WEST', 'NW');
+INSERT INTO direction_lookup (name, abbrev) VALUES('NORTH EAST', 'NE');
+INSERT INTO direction_lookup (name, abbrev) VALUES('NE', 'NE');
+INSERT INTO direction_lookup (name, abbrev) VALUES('N', 'N');
+INSERT INTO direction_lookup (name, abbrev) VALUES('EAST', 'E');
+INSERT INTO direction_lookup (name, abbrev) VALUES('E', 'E');
+COMMIT;
+
+
+
+-- Create secondary unit lookup table
+BEGIN;
+CREATE TABLE secondary_unit_lookup (name VARCHAR(20), abbrev VARCHAR(5));
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('APARTMENT', 'APT');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('APT', 'APT');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('BASEMENT', 'BSMT');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('BSMT', 'BSMT');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('BUILDING', 'BLDG');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('BLDG', 'BLDG');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('DEPARTMENT', 'DEPT');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('DEPT', 'DEPT');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('FLOOR', 'FL');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('FL', 'FL');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('FRONT', 'FRNT');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('FRNT', 'FRNT');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('HANGAR', 'HNGR');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('HNGR', 'HNGR');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('LOBBY', 'LBBY');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('LBBY', 'LBBY');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('LOT', 'LOT');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('LOWER', 'LOWR');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('LOWR', 'LOWR');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('OFFICE', 'OFC');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('OFC', 'OFC');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('PENTHOUSE', 'PH');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('PH', 'PH');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('PIER', 'PIER');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('REAR', 'REAR');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('ROOM', 'RM');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('RM', 'RM');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('SIDE', 'SIDE');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('SLIP', 'SLIP');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('SPACE', 'SPC');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('SPC', 'SPC');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('STOP', 'STOP');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('SUITE', 'STE');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('STE', 'STE');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('TRAILER', 'TRLR');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('TRLR', 'TRLR');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('UNIT', 'UNIT');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('UPPER', 'UPPR');
+INSERT INTO secondary_unit_lookup (name, abbrev) VALUES ('UPPR', 'UPPR');
+COMMIT;
+
+
+
+-- Create state lookup table
+BEGIN;
+CREATE TABLE state_lookup (name VARCHAR(40), abbrev VARCHAR(3));
+INSERT INTO state_lookup (name, abbrev) VALUES ('Alabama', 'AL');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Alaska', 'AK');
+INSERT INTO state_lookup (name, abbrev) VALUES ('American Samoa', 'AS');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Arizona', 'AZ');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Arkansas', 'AR');
+INSERT INTO state_lookup (name, abbrev) VALUES ('California', 'CA');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Colorado', 'CO');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Connecticut', 'CT');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Delaware', 'DE');
+INSERT INTO state_lookup (name, abbrev) VALUES ('District of Columbia', 'DC');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Federated States of Micronesia', 'FM');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Florida', 'FL');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Georgia', 'GA');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Guam', 'GU');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Hawaii', 'HI');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Idaho', 'ID');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Illinois', 'IL');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Indiana', 'IN');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Iowa', 'IA');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Kansas', 'KS');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Kentucky', 'KY');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Louisiana', 'LA');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Maine', 'ME');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Marshall Islands', 'MH');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Maryland', 'MD');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Massachusetts', 'MA');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Michigan', 'MI');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Minnesota', 'MN');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Mississippi', 'MS');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Missouri', 'MO');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Montana', 'MT');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Nebraska', 'NE');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Nevada', 'NV');
+INSERT INTO state_lookup (name, abbrev) VALUES ('New Hampshire', 'NH');
+INSERT INTO state_lookup (name, abbrev) VALUES ('New Jersey', 'NJ');
+INSERT INTO state_lookup (name, abbrev) VALUES ('New Mexico', 'NM');
+INSERT INTO state_lookup (name, abbrev) VALUES ('New York', 'NY');
+INSERT INTO state_lookup (name, abbrev) VALUES ('North Carolina', 'NC');
+INSERT INTO state_lookup (name, abbrev) VALUES ('North Dakota', 'ND');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Northern Mariana Islands', 'MP');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Ohio', 'OH');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Oklahoma', 'OK');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Oregon', 'OR');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Palau', 'PW');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Pennsylvania', 'PA');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Puerto Rico', 'PR');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Rhode Island', 'RI');
+INSERT INTO state_lookup (name, abbrev) VALUES ('South Carolina', 'SC');
+INSERT INTO state_lookup (name, abbrev) VALUES ('South Dakota', 'SD');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Tennessee', 'TN');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Texas', 'TX');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Utah', 'UT');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Vermont', 'VT');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Virgin Islands', 'VI');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Virginia', 'VA');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Washington', 'WA');
+INSERT INTO state_lookup (name, abbrev) VALUES ('West Virginia', 'WV');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Wisconsin', 'WI');
+INSERT INTO state_lookup (name, abbrev) VALUES ('Wyoming', 'WY');
+COMMIT;
+
+
+-- Create street type lookup table
+BEGIN;
+CREATE TABLE street_type_lookup (name VARCHAR(20), abbrev VARCHAR(4));
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ALLEE', 'Aly');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ALLEY', 'Aly');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ALLY', 'Aly');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ALY', 'Aly');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ANEX', 'Anx');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ANNEX', 'Anx');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ANNX', 'Anx');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ANX', 'Anx');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ARC', 'Arc');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ARCADE', 'Arc');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('AV', 'Ave');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('AVE', 'Ave');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('AVEN', 'Ave');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('AVENU', 'Ave');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('AVENUE', 'Ave');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('AVN', 'Ave');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('AVNUE', 'Ave');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BAYOO', 'Byu');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BAYOU', 'Byu');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BCH', 'Bch');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BEACH', 'Bch');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BEND', 'Bnd');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BND', 'Bnd');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BLF', 'Blf');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BLUF', 'Blf');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BLUFF', 'Blf');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BLUFFS', 'Blfs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BOT', 'Btm');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BOTTM', 'Btm');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BOTTOM', 'Btm');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BTM', 'Btm');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BLVD', 'Blvd');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BOUL', 'Blvd');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BOULEVARD', 'Blvd');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BOULV', 'Blvd');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BR', 'Br');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BRANCH', 'Br');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BRNCH', 'Br');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BRDGE', 'Brg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BRG', 'Brg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BRIDGE', 'Brg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BRK', 'Brk');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BROOK', 'Brk');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BROOKS', 'Brks');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BURG', 'Bg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BURGS', 'Bgs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BYP', 'Byp');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BYPA', 'Byp');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BYPAS', 'Byp');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BYPASS', 'Byp');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BYPS', 'Byp');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CAMP', 'Cp');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CMP', 'Cp');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CP', 'Cp');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CANYN', 'Cyn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CANYON', 'Cyn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CNYN', 'Cyn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CYN', 'Cyn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CAPE', 'Cpe');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CPE', 'Cpe');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CAUSEWAY', 'Cswy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CAUSWAY', 'Cswy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CSWY', 'Cswy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CEN', 'Ctr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CENT', 'Ctr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CENTER', 'Ctr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CENTR', 'Ctr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CENTRE', 'Ctr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CNTER', 'Ctr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CNTR', 'Ctr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CTR', 'Ctr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CENTERS', 'Ctrs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CIR', 'Cir');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CIRC', 'Cir');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CIRCL', 'Cir');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CIRCLE', 'Cir');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRCL', 'Cir');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRCLE', 'Cir');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CIRCLES', 'Cirs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CLF', 'Clf');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CLIFF', 'Clf');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CLFS', 'Clfs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CLIFFS', 'Clfs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CLB', 'Clb');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CLUB', 'Clb');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('COMMON', 'Cmn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('COR', 'Cor');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CORNER', 'Cor');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CORNERS', 'Cors');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CORS', 'Cors');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('COURSE', 'Crse');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRSE', 'Crse');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('COURT', 'Ct');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRT', 'Ct');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CT', 'Ct');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('COURTS', 'Cts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('COVE', 'Cv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CV', 'Cv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('COVES', 'Cvs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CK', 'Crk');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CR', 'Crk');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CREEK', 'Crk');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRK', 'Crk');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRECENT', 'Cres');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRES', 'Cres');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRESCENT', 'Cres');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRESENT', 'Cres');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRSCNT', 'Cres');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRSENT', 'Cres');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRSNT', 'Cres');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CREST', 'Crst');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CROSSING', 'Xing');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRSSING', 'Xing');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRSSNG', 'Xing');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('XING', 'Xing');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CROSSROAD', 'Xrd');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CURVE', 'Curv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('DALE', 'Dl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('DL', 'Dl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('DAM', 'Dm');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('DM', 'Dm');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('DIV', 'Dv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('DIVIDE', 'Dv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('DV', 'Dv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('DVD', 'Dv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('DR', 'Dr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('DRIV', 'Dr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('DRIVE', 'Dr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('DRV', 'Dr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('DRIVES', 'Drs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('EST', 'Est');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ESTATE', 'Est');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ESTATES', 'Ests');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ESTS', 'Ests');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('EXP', 'Expy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('EXPR', 'Expy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('EXPRESS', 'Expy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('EXPRESSWAY', 'Expy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('EXPW', 'Expy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('EXPY', 'Expy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('EXT', 'Ext');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('EXTENSION', 'Ext');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('EXTN', 'Ext');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('EXTNSN', 'Ext');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('EXTENSIONS', 'Exts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('EXTS', 'Exts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FALL', 'Fall');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FALLS', 'Fls');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FLS', 'Fls');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FERRY', 'Fry');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FRRY', 'Fry');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FRY', 'Fry');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FIELD', 'Fld');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FLD', 'Fld');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FIELDS', 'Flds');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FLDS', 'Flds');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FLAT', 'Flt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FLT', 'Flt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FLATS', 'Flts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FLTS', 'Flts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FORD', 'Frd');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FRD', 'Frd');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FORDS', 'Frds');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FOREST', 'Frst');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FORESTS', 'Frst');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FRST', 'Frst');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FORG', 'Frg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FORGE', 'Frg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FRG', 'Frg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FORGES', 'Frgs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FORK', 'Frk');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FRK', 'Frk');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FORKS', 'Frks');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FRKS', 'Frks');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FORT', 'Ft');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FRT', 'Ft');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FT', 'Ft');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FREEWAY', 'Fwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FREEWY', 'Fwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FRWAY', 'Fwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FRWY', 'Fwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FWY', 'Fwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GARDEN', 'Gdn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GARDN', 'Gdn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GDN', 'Gdn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GRDEN', 'Gdn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GRDN', 'Gdn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GARDENS', 'Gdns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GDNS', 'Gdns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GRDNS', 'Gdns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GATEWAY', 'Gtwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GATEWY', 'Gtwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GATWAY', 'Gtwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GTWAY', 'Gtwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GTWY', 'Gtwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GLEN', 'Gln');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GLN', 'Gln');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GLENS', 'Glns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GREEN', 'Grn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GRN', 'Grn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GREENS', 'Grns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GROV', 'Grv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GROVE', 'Grv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GRV', 'Grv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GROVES', 'Grvs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HARB', 'Hbr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HARBOR', 'Hbr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HARBR', 'Hbr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HBR', 'Hbr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HRBOR', 'Hbr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HARBORS', 'Hbrs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HAVEN', 'Hvn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HAVN', 'Hvn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HVN', 'Hvn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HEIGHT', 'Hts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HEIGHTS', 'Hts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HGTS', 'Hts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HT', 'Hts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HTS', 'Hts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HIGHWAY', 'Hwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HIGHWY', 'Hwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HIWAY', 'Hwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HIWY', 'Hwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HWAY', 'Hwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HWY', 'Hwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HILL', 'Hl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HL', 'Hl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HILLS', 'Hls');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HLS', 'Hls');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HLLW', 'Holw');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HOLLOW', 'Holw');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HOLLOWS', 'Holw');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HOLW', 'Holw');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HOLWS', 'Holw');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('INLET', 'Inlt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('INLT', 'Inlt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('IS', 'Is');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ISLAND', 'Is');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ISLND', 'Is');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ISLANDS', 'Iss');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ISLNDS', 'Iss');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ISS', 'Iss');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ISLE', 'Isle');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ISLES', 'Isle');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('JCT', 'Jct');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('JCTION', 'Jct');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('JCTN', 'Jct');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('JUNCTION', 'Jct');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('JUNCTN', 'Jct');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('JUNCTON', 'Jct');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('JCTNS', 'Jcts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('JCTS', 'Jcts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('JUNCTIONS', 'Jcts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('KEY', 'Ky');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('KY', 'Ky');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('KEYS', 'Kys');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('KYS', 'Kys');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('KNL', 'Knl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('KNOL', 'Knl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('KNOLL', 'Knl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('KNLS', 'Knls');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('KNOLLS', 'Knls');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LAKE', 'Lk');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LK', 'Lk');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LAKES', 'Lks');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LKS', 'Lks');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LAND', 'Land');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LANDING', 'Lndg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LNDG', 'Lndg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LNDNG', 'Lndg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LA', 'Ln');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LANE', 'Ln');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LANES', 'Ln');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LN', 'Ln');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LGT', 'Lgt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LIGHT', 'Lgt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LIGHTS', 'Lgts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LF', 'Lf');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LOAF', 'Lf');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LCK', 'Lck');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LOCK', 'Lck');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LCKS', 'Lcks');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LOCKS', 'Lcks');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LDG', 'Ldg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LDGE', 'Ldg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LODG', 'Ldg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LODGE', 'Ldg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LOOP', 'Loop');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LOOPS', 'Loop');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MALL', 'Mall');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MANOR', 'Mnr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MNR', 'Mnr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MANORS', 'Mnrs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MNRS', 'Mnrs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MDW', 'Mdw');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MEADOW', 'Mdw');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MDWS', 'Mdws');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MEADOWS', 'Mdws');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MEDOWS', 'Mdws');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MEWS', 'Mews');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MILL', 'Ml');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ML', 'Ml');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MILLS', 'Mls');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MLS', 'Mls');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MISSION', 'Msn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MISSN', 'Msn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MSN', 'Msn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MSSN', 'Msn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MOTORWAY', 'Mtwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MNT', 'Mt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MOUNT', 'Mt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MT', 'Mt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MNTAIN', 'Mtn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MNTN', 'Mtn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MOUNTAIN', 'Mtn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MOUNTIN', 'Mtn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MTIN', 'Mtn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MTN', 'Mtn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MNTNS', 'Mtns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MOUNTAINS', 'Mtns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('NCK', 'Nck');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('NECK', 'Nck');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ORCH', 'Orch');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ORCHARD', 'Orch');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ORCHRD', 'Orch');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('OVAL', 'Oval');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('OVL', 'Oval');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('OVERPASS', 'Opas');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PARK', 'Park');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PK', 'Park');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PRK', 'Park');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PARKS', 'Park');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PARKWAY', 'Pkwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PARKWY', 'Pkwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PKWAY', 'Pkwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PKWY', 'Pkwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PKY', 'Pkwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PARKWAYS', 'Pkwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PKWYS', 'Pkwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PASS', 'Pass');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PASSAGE', 'Psge');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PATH', 'Path');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PATHS', 'Path');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PIKE', 'Pike');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PIKES', 'Pike');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PINE', 'Pne');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PINES', 'Pnes');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PNES', 'Pnes');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PL', 'Pl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PLACE', 'Pl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PLAIN', 'Pln');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PLN', 'Pln');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PLAINES', 'Plns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PLAINS', 'Plns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PLNS', 'Plns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PLAZA', 'Plz');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PLZ', 'Plz');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PLZA', 'Plz');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('POINT', 'Pt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PT', 'Pt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('POINTS', 'Pts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PTS', 'Pts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PORT', 'Prt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PRT', 'Prt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PORTS', 'Prts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PRTS', 'Prts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PR', 'Pr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PRAIRIE', 'Pr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PRARIE', 'Pr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PRR', 'Pr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RAD', 'Radl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RADIAL', 'Radl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RADIEL', 'Radl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RADL', 'Radl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RAMP', 'Ramp');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RANCH', 'Rnch');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RANCHES', 'Rnch');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RNCH', 'Rnch');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RNCHS', 'Rnch');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RAPID', 'Rpd');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RPD', 'Rpd');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RAPIDS', 'Rpds');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RPDS', 'Rpds');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('REST', 'Rst');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RST', 'Rst');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RDG', 'Rdg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RDGE', 'Rdg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RIDGE', 'Rdg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RDGS', 'Rdgs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RIDGES', 'Rdgs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RIV', 'Riv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RIVER', 'Riv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RIVR', 'Riv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RVR', 'Riv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RD', 'Rd');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ROAD', 'Rd');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RDS', 'Rds');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ROADS', 'Rds');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ROUTE', 'Rte');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ROW', 'Row');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RUE', 'Rue');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RUN', 'Run');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SHL', 'Shl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SHOAL', 'Shl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SHLS', 'Shls');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SHOALS', 'Shls');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SHOAR', 'Shr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SHORE', 'Shr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SHR', 'Shr');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SHOARS', 'Shrs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SHORES', 'Shrs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SHRS', 'Shrs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SKYWAY', 'Skwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SPG', 'Spg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SPNG', 'Spg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SPRING', 'Spg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SPRNG', 'Spg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SPGS', 'Spgs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SPNGS', 'Spgs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SPRINGS', 'Spgs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SPRNGS', 'Spgs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SPUR', 'Spur');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SPURS', 'Spur');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SQ', 'Sq');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SQR', 'Sq');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SQRE', 'Sq');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SQU', 'Sq');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SQUARE', 'Sq');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SQRS', 'Sqs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SQUARES', 'Sqs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STA', 'Sta');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STATION', 'Sta');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STATN', 'Sta');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STN', 'Sta');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STRA', 'Stra');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STRAV', 'Stra');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STRAVE', 'Stra');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STRAVEN', 'Stra');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STRAVENUE', 'Stra');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STRAVN', 'Stra');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STRVN', 'Stra');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STRVNUE', 'Stra');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STREAM', 'Strm');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STREME', 'Strm');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STRM', 'Strm');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ST', 'St');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STR', 'St');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STREET', 'St');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STRT', 'St');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STREETS', 'Sts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SMT', 'Smt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SUMIT', 'Smt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SUMITT', 'Smt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SUMMIT', 'Smt');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TER', 'Ter');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TERR', 'Ter');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TERRACE', 'Ter');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('THROUGHWAY', 'Trwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRACE', 'Trce');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRACES', 'Trce');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRCE', 'Trce');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRACK', 'Trak');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRACKS', 'Trak');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRAK', 'Trak');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRK', 'Trak');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRKS', 'Trak');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRAFFICWAY', 'Trfy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRFY', 'Trfy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TR', 'Trl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRAIL', 'Trl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRAILS', 'Trl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRL', 'Trl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRLS', 'Trl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TUNEL', 'Tunl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TUNL', 'Tunl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TUNLS', 'Tunl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TUNNEL', 'Tunl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TUNNELS', 'Tunl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TUNNL', 'Tunl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TPK', 'Tpke');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TPKE', 'Tpke');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRNPK', 'Tpke');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRPK', 'Tpke');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TURNPIKE', 'Tpke');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TURNPK', 'Tpke');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('UNDERPASS', 'Upas');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('UN', 'Un');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('UNION', 'Un');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('UNIONS', 'Uns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VALLEY', 'Vly');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VALLY', 'Vly');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VLLY', 'Vly');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VLY', 'Vly');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VALLEYS', 'Vlys');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VLYS', 'Vlys');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VDCT', 'Via');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VIA', 'Via');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VIADCT', 'Via');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VIADUCT', 'Via');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VIEW', 'Vw');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VW', 'Vw');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VIEWS', 'Vws');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VWS', 'Vws');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VILL', 'Vlg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VILLAG', 'Vlg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VILLAGE', 'Vlg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VILLG', 'Vlg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VILLIAGE', 'Vlg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VLG', 'Vlg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VILLAGES', 'Vlgs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VLGS', 'Vlgs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VILLE', 'Vl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VL', 'Vl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VIS', 'Vis');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VIST', 'Vis');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VISTA', 'Vis');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VST', 'Vis');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('VSTA', 'Vis');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('WALK', 'Walk');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('WALKS', 'Walk');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('WALL', 'Wall');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('WAY', 'Way');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('WY', 'Way');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('WAYS', 'Ways');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('WELL', 'Wl');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('WELLS', 'Wls');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('WLS', 'Wls');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BYU', 'Byu');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BLFS', 'Blfs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BRKS', 'Brks');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BG', 'Bg');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('BGS', 'Bgs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CTRS', 'Ctrs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CIRS', 'Cirs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CMN', 'Cmn');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CTS', 'Cts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CVS', 'Cvs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CRST', 'Crst');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('XRD', 'Xrd');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('CURV', 'Curv');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('DRS', 'Drs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FRDS', 'Frds');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('FRGS', 'Frgs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GLNS', 'Glns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GRNS', 'Grns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('GRVS', 'Grvs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('HBRS', 'Hbrs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('LGTS', 'Lgts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MTWY', 'Mtwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('MTNS', 'Mtns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ORCH', 'Orch');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('ORCH', 'Orch');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('OPAS', 'Opas');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PSGE', 'Psge');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('PNE', 'Pne');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('RTE', 'Rte');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SKWY', 'Skwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('SQS', 'Sqs');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('STS', 'Sts');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('TRWY', 'Trwy');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('UPAS', 'Upas');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('UNS', 'Uns');
+INSERT INTO street_type_lookup (name, abbrev) VALUES ('WL', 'Wl');
+COMMIT;
+
+-- Create place and countysub lookup tables
+SELECT name, state INTO TABLE place_lookup FROM gazetteer_places group by name, state;
+SELECT name, state INTO TABLE countysub_lookup FROM gazetteer_county_subdivisions group by name, state;
+
+-- Create indexes
+create index tiger_geocode_roads_zip_soundex_idx on tiger_geocode_roads (soundex(fename), zip, state);
+create index tiger_geocode_roads_place_soundex_idx on tiger_geocode_roads (soundex(fename), place, state);
+create index tiger_geocode_roads_cousub_soundex_idx on tiger_geocode_roads (soundex(fename), cousub, state);
+create index tiger_geocode_roads_place_more_soundex_idx on tiger_geocode_roads (soundex(fename), soundex(place), state);
+create index tiger_geocode_roads_cousub_more_soundex_idx on tiger_geocode_roads (soundex(fename), soundex(cousub), state);
+create index tiger_geocode_roads_state_soundex_idx on tiger_geocode_roads (soundex(fename), state);
+create index tiger_geocode_join_id_idx on tiger_geocode_join (id);
+create index roads_local_tlid_idx on roads_local (tlid);
+create index place_lookup_idx on place_lookup (state);
+create index countysub_lookup_idx on countysub_lookup (state);
+
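Note on the indexes above: the six tiger_geocode_roads indexes are
expression indexes whose leading key is soundex(fename). A lookup can
generally use one of them only when its WHERE clause applies soundex() to
fename in the same way, which is the form the geocode_address_* functions
use. The following is a minimal sketch of a query shaped to match
tiger_geocode_roads_zip_soundex_idx, not part of the commit. It assumes
soundex() is provided by the fuzzystrmatch contrib module, that zip is
stored as an integer, and that state holds the full state name. The literal
values are made up for illustration, and whether the planner actually
chooses the index depends on the data and its statistics.

```sql
-- Hypothetical lookup; the street name, zip and state literals are
-- illustrative only and do not come from the commit.
-- soundex(fename) mirrors the leading index expression, and zip and
-- state mirror the remaining index columns.
EXPLAIN
SELECT id, fename, zip, state
  FROM tiger_geocode_roads
 WHERE soundex(fename) = soundex('Main')
   AND zip = 94103
   AND state = 'California';
```

Running the statement without EXPLAIN returns the candidate road rows that
the address-range checks would then narrow down.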
Deleted: trunk/extras/tiger_geocoder/tiger_geocoder.sql
===================================================================
--- trunk/extras/tiger_geocoder/tiger_geocoder.sql 2007-07-03 20:51:31 UTC (rev 2638)
+++ trunk/extras/tiger_geocoder/tiger_geocoder.sql 2007-07-03 21:05:03 UTC (rev 2639)
@@ -1,2657 +0,0 @@
--- Runs the soundex function on the last word in the string provided.
--- Words are allowed to be separated by space, comma, period, new-line,
--- tab or form feed.
-CREATE OR REPLACE FUNCTION end_soundex(VARCHAR) RETURNS VARCHAR
-AS '
-DECLARE
- tempString VARCHAR;
-BEGIN
- tempString := substring($1, ''[ ,\.\n\t\f]([a-zA-Z0-9]*)$'');
- IF tempString IS NOT NULL THEN
- tempString := soundex(tempString);
- ELSE
- tempString := soundex($1);
- END IF;
- return tempString;
-END;
-' LANGUAGE plpgsql;
-
--- Returns the value passed, or an empty string if null.
--- This is used to concatenate values that may be null.
-CREATE OR REPLACE FUNCTION cull_null(VARCHAR) RETURNS VARCHAR
-AS '
-BEGIN
- IF $1 IS NULL THEN
- return '''';
- ELSE
- return $1;
- END IF;
-END;
-' LANGUAGE plpgsql;
-
--- Determine the number of words in a string. Words are allowed to
--- be separated only by spaces, but multiple spaces between
--- words are allowed.
-CREATE OR REPLACE FUNCTION count_words(VARCHAR) RETURNS INTEGER
-AS '
-DECLARE
- tempString VARCHAR;
- tempInt INTEGER;
- count INTEGER := 1;
- lastSpace BOOLEAN := FALSE;
-BEGIN
- IF $1 IS NULL THEN
- return -1;
- END IF;
- tempInt := length($1);
- IF tempInt = 0 THEN
- return 0;
- END IF;
- FOR i IN 1..tempInt LOOP
- tempString := substring($1 from i for 1);
- IF tempString = '' '' THEN
- IF NOT lastSpace THEN
- count := count + 1;
- END IF;
- lastSpace := TRUE;
- ELSE
- lastSpace := FALSE;
- END IF;
- END LOOP;
- return count;
-END;
-' LANGUAGE plpgsql;
-
-
-
-CREATE OR REPLACE FUNCTION geocode(VARCHAR) RETURNS REFCURSOR
-AS '
-BEGIN
- return geocode(NULL, $1);
-END;
-' LANGUAGE plpgsql;
-
-CREATE OR REPLACE FUNCTION geocode(REFCURSOR, VARCHAR) RETURNS REFCURSOR
-AS '
-DECLARE
- result REFCURSOR;
- input VARCHAR;
- parsed VARCHAR;
- addressString VARCHAR;
- address INTEGER;
- directionPrefix VARCHAR;
- streetName VARCHAR;
- streetType VARCHAR;
- directionSuffix VARCHAR;
- location VARCHAR;
- state VARCHAR;
- zipCodeString VARCHAR;
- zipCode INTEGER;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''geocode()'';
- END IF;
- -- Check inputs.
- IF $1 IS NOT NULL THEN
- result := $1;
- END IF;
- IF $2 IS NULL THEN
- -- The address string is mandatory.
- RAISE EXCEPTION ''geocode() - No address string provided.'';
- ELSE
- input := $2;
- END IF;
-
- -- Pass the input string into the address normalizer
- parsed := normalize_address(input);
- IF parsed IS NULL THEN
- RAISE EXCEPTION ''geocode() - address string failed to parse.'';
- END IF;
-
- addressString := split_part(parsed, '':'', 1);
- directionPrefix := split_part(parsed, '':'', 2);
- streetName := split_part(parsed, '':'', 3);
- streetType := split_part(parsed, '':'', 4);
- directionSuffix := split_part(parsed, '':'', 5);
- location := split_part(parsed, '':'', 6);
- state := split_part(parsed, '':'', 7);
- zipCodeString := split_part(parsed, '':'', 8);
-
- -- Empty strings must be converted to nulls;
- IF addressString = '''' THEN
- addressString := NULL;
- END IF;
- IF directionPrefix = '''' THEN
- directionPrefix := NULL;
- END IF;
- IF streetName = '''' THEN
- streetName := NULL;
- END IF;
- IF streetType = '''' THEN
- streetType := NULL;
- END IF;
- IF directionSuffix = '''' THEN
- directionSuffix := NULL;
- END IF;
- IF location = '''' THEN
- location := NULL;
- END IF;
- IF state = '''' THEN
- state := NULL;
- END IF;
- IF zipCodeString = '''' THEN
- zipCodeString := NULL;
- END IF;
-
- -- address and zipCode must be integers
- IF addressString IS NOT NULL THEN
- address := to_number(addressString, ''99999999999'');
- END IF;
- IF zipCodeString IS NOT NULL THEN
- zipCode := to_number(zipCodeString, ''99999'');
- END IF;
-
- IF verbose THEN
- RAISE NOTICE ''geocode() - address %'', address;
- RAISE NOTICE ''geocode() - directionPrefix %'', directionPrefix;
- RAISE NOTICE ''geocode() - streetName "%"'', streetName;
- RAISE NOTICE ''geocode() - streetType %'', streetType;
- RAISE NOTICE ''geocode() - directionSuffix %'', directionSuffix;
- RAISE NOTICE ''geocode() - location "%"'', location;
- RAISE NOTICE ''geocode() - state %'', state;
- RAISE NOTICE ''geocode() - zipCode %'', zipCode;
- END IF;
- -- This is where any validation above the geocode_address functions would go.
-
- -- Call geocode_address
- result := geocode_address(result, address, directionPrefix, streetName,
- streetType, directionSuffix, location, state, zipCode);
- RETURN result;
-END;
-' LANGUAGE plpgsql;
-
-
-
--- geocode_address(cursor, address, directionPrefix, streetName,
--- streetTypeAbbreviation, directionSuffix, location, stateAbbreviation,
--- zipCode)
-CREATE OR REPLACE FUNCTION geocode_address(refcursor, INTEGER, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, INTEGER) RETURNS REFCURSOR
-AS '
-DECLARE
- result REFCURSOR;
- address INTEGER;
- directionPrefix VARCHAR;
- streetName VARCHAR;
- streetTypeAbbrev VARCHAR;
- directionSuffix VARCHAR;
- location VARCHAR;
- stateAbbrev VARCHAR;
- state VARCHAR;
- zipCode INTEGER;
- tempString VARCHAR;
- tempInt VARCHAR;
- locationPlaceExact BOOLEAN := FALSE;
- locationPlaceFuzzy BOOLEAN := FALSE;
- locationCountySubExact BOOLEAN := FALSE;
- locationCountySubFuzzy BOOLEAN := FALSE;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''geocode_address()'';
- END IF;
- -- The first step is to determine what weve been given, and if its enough.
- IF $1 IS NOT NULL THEN
- -- A cursor was provided, so use it. If none is given, an unnamed one is used.
- result := $1;
- END IF;
- IF $2 IS NULL THEN
- -- The address is mandatory.
- -- Without it, wed be wandering into strangers homes all the time.
- RAISE EXCEPTION ''geocode_address() - No address provided!'';
- ELSE
- address := $2;
- END IF;
- IF $3 IS NOT NULL THEN
- -- The direction prefix really isnt important.
- -- It will be used for rating if provided.
- directionPrefix := $3;
- END IF;
- IF $4 IS NULL THEN
- -- A street name must be given. Think about it.
- RAISE EXCEPTION ''geocode_address() - No street name provided!'';
- ELSE
- streetName := $4;
- END IF;
- IF $5 IS NOT NULL THEN
- -- A street type will be used for rating if provided, but isnt required.
- streetTypeAbbrev := $5;
- END IF;
- IF $6 IS NOT NULL THEN
- -- Same as direction prefix, only later.
- directionSuffix := $6;
- END IF;
- IF $7 IS NOT NULL THEN
- -- Location is not needed iff a zip is given. The check occurs after
- -- the geocode_address_zip call.
- location := $7;
- END IF;
- IF $8 IS NULL THEN
- -- State abbreviation is mandatory. It is also assumed to be valid.
- ELSE
- stateAbbrev := $8;
- END IF;
- IF $9 IS NOT NULL THEN
- -- Zip code is optional, but nice.
- zipCode := $9;
- END IF;
-
- -- The geocoding tables store the state name rather than the abbreviation.
- -- We can validate the abbreviation while retrieving the name.
- IF stateAbbrev IS NOT NULL THEN
- SELECT INTO state name FROM state_lookup
- WHERE state_lookup.abbrev = stateAbbrev;
- IF state IS NULL THEN
- END IF;
- END IF;
-
- IF zipCode IS NOT NULL THEN
- IF verbose THEN
- RAISE NOTICE ''geocode_address() - calling geocode_address_zip()'';
- END IF;
- -- If the zip code is given, it is the most useful way to narrow the
- -- search. We will try it first, and if no results match, we will move
- -- on to a location search. There is no fuzzy searching on zip codes.
- result := geocode_address_zip(result, address, directionPrefix, streetName,
- streetTypeAbbrev, directionSuffix, zipCode);
- IF result IS NOT NULL THEN
- RETURN result;
- ELSE
- result := $1;
- END IF;
- END IF;
- -- From this point on, the location becomes mandatory.
- IF location IS NOT NULL THEN
- -- location may be useful, it may not. The first step is to determine if
- -- there are any potential matches in the place and countysub fields.
- -- This is done against the lookup tables, and will save us time on much
- -- larger queries if they dont match.
- IF verbose THEN
- RAISE NOTICE ''geocode_address() - calling location_extract_place_*()'';
- END IF;
- tempString := location_extract_place_exact(location, stateAbbrev);
- IF tempString IS NOT NULL THEN
- locationPlaceExact := TRUE;
- ELSE
- locationPlaceExact := FALSE;
- END IF;
- tempString := location_extract_place_fuzzy(location, stateAbbrev);
- IF tempString IS NOT NULL THEN
- locationPlaceFuzzy := true;
- ELSE
- locationPlaceFuzzy := false;
- END IF;
- IF verbose THEN
- RAISE NOTICE ''geocode_address() - calling location_extract_countysub_*()'';
- END IF;
- tempString := location_extract_countysub_exact(location, stateAbbrev);
- IF tempString IS NOT NULL THEN
- locationCountySubExact := TRUE;
- ELSE
- locationCountySubExact := FALSE;
- END IF;
- tempString := location_extract_countysub_fuzzy(location, stateAbbrev);
- IF tempString IS NOT NULL THEN
- locationCountySubFuzzy := true;
- ELSE
- locationCountySubFuzzy := false;
- END IF;
- END IF;
- IF locationPlaceExact THEN
- IF verbose THEN
- RAISE NOTICE ''geocode_address() - calling geocode_address_place_exact()'';
- END IF;
- result := geocode_address_place_exact(result, address, directionPrefix,
- streetName, streetTypeAbbrev, directionSuffix, location, state);
- IF result IS NOT NULL THEN
- RETURN result;
- ELSE
- result := $1;
- END IF;
- END IF;
- IF locationCountySubExact THEN
- IF verbose THEN
- RAISE NOTICE ''geocode_address() - calling geocode_address_countysub_exact()'';
- END IF;
- result := geocode_address_countysub_exact(result, address, directionPrefix,
- streetName, streetTypeAbbrev, directionSuffix, location, state);
- IF result IS NOT NULL THEN
- RETURN result;
- ELSE
- result := $1;
- END IF;
- END IF;
- IF locationPlaceFuzzy THEN
- IF verbose THEN
- RAISE NOTICE ''geocode_address() - calling geocode_address_place_fuzzy()'';
- END IF;
- result := geocode_address_place_fuzzy(result, address, directionPrefix,
- streetName, streetTypeAbbrev, directionSuffix, location, state);
- IF result IS NOT NULL THEN
- RETURN result;
- ELSE
- result := $1;
- END IF;
- END IF;
- IF locationCountySubFuzzy THEN
- IF verbose THEN
- RAISE NOTICE ''geocode_address() - calling geocode_address_countysub_fuzzy()'';
- END IF;
- result := geocode_address_countysub_fuzzy(result, address, directionPrefix,
- streetName, streetTypeAbbrev, directionSuffix, location, state);
- IF result IS NOT NULL THEN
- RETURN result;
- ELSE
- result := $1;
- END IF;
- END IF;
- IF state IS NOT NULL THEN
- IF verbose THEN
- RAISE NOTICE ''geocode_address() - calling geocode_address_state()'';
- END IF;
- result := geocode_address_state(result, address, directionPrefix,
- streetName, streetTypeAbbrev, directionSuffix, state);
- IF result IS NOT NULL THEN
- RETURN result;
- ELSE
- result := $1;
- END IF;
- END IF;
- RETURN NULL;
-END;
-' LANGUAGE plpgsql;
-
-
-
-CREATE OR REPLACE FUNCTION geocode_address_countysub_exact(REFCURSOR, INTEGER, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR) RETURNS REFCURSOR
-AS '
-DECLARE
- result REFCURSOR;
- address INTEGER;
- directionPrefix VARCHAR;
- streetName VARCHAR;
- streetTypeAbbrev VARCHAR;
- directionSuffix VARCHAR;
- state VARCHAR;
- location VARCHAR;
- tempString VARCHAR;
- tempInt INTEGER;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''geocode_address_countysub_exact()'';
- END IF;
- -- The first step is to determine what weve been given, and if its enough.
- IF $1 IS NOT NULL THEN
- -- A cursor was provided, so use it. If none is given, an unnamed one is used.
- result := $1;
- END IF;
- IF $2 IS NULL THEN
- -- The address is mandatory.
- -- Without it, wed be wandering into strangers homes all the time.
- RAISE EXCEPTION ''geocode_address_countysub_exact() - No address provided!'';
- ELSE
- address := $2;
- END IF;
- IF $3 IS NOT NULL THEN
- -- The direction prefix really isnt important.
- -- It will be used for rating if provided.
- directionPrefix := $3;
- END IF;
- IF $4 IS NULL THEN
- -- A street name must be given. Think about it.
- RAISE EXCEPTION ''geocode_address_countysub_exact() - No street name provided!'';
- ELSE
- streetName := $4;
- END IF;
- IF $5 IS NOT NULL THEN
- -- A street type will be used for rating if provided, but isnt required.
- streetTypeAbbrev := $5;
- END IF;
- IF $6 IS NOT NULL THEN
- -- Same as direction prefix, only later.
- directionSuffix := $6;
- END IF;
- IF $7 IS NULL THEN
- -- location is mandatory. This is the location geocoder after all.
- RAISE EXCEPTION ''geocode_address_countysub_exact() - No location provided!'';
- ELSE
- location := $7;
- END IF;
- IF $8 IS NOT NULL THEN
- state := $8;
- END IF;
-
- -- Check to see if the road name can be matched.
- IF state IS NOT NULL THEN
- SELECT INTO tempInt count(*) FROM tiger_geocode_roads
- WHERE location = tiger_geocode_roads.cousub
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state;
- ELSE
- SELECT INTO tempInt count(*) FROM tiger_geocode_roads
- WHERE location = tiger_geocode_roads.cousub
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename);
- END IF;
- IF verbose THEN
- RAISE NOTICE ''geocode_address_countysub_exact() - % potential matches.'', tempInt;
- END IF;
- IF tempInt = 0 THEN
- RETURN NULL;
- ELSE
- -- The road name matches, now we check to see if the addresses match
- IF state IS NOT NULL THEN
- SELECT INTO tempInt count(*)
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs) as rating
- FROM tiger_geocode_roads
- WHERE location = tiger_geocode_roads.cousub
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid;
- ELSE
- SELECT INTO tempInt count(*)
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs) as rating
- FROM tiger_geocode_roads
- WHERE location = tiger_geocode_roads.cousub
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid;
- END IF;
- IF verbose THEN
- RAISE NOTICE ''geocode_address_countysub_exact() - % address matches.'', tempInt;
- END IF;
- IF tempInt = 0 THEN
- return NULL;
- ELSE
- IF state IS NOT NULL THEN
- OPEN result FOR
- SELECT *, interpolate_from_address(address, roads_local.fraddl,
- roads_local.toaddl, roads_local.fraddr, roads_local.toaddr,
- roads_local.geom) as address_geom
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs, location,
- tiger_geocode_roads.cousub) as rating
- FROM tiger_geocode_roads
- WHERE location = tiger_geocode_roads.cousub
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid
- ORDER BY subquery.rating;
- return result;
- ELSE
- OPEN result FOR
- SELECT *, interpolate_from_address(address, roads_local.fraddl,
- roads_local.toaddl, roads_local.fraddr, roads_local.toaddr,
- roads_local.geom) as address_geom
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs, location,
- tiger_geocode_roads.cousub) as rating
- FROM tiger_geocode_roads
- WHERE location = tiger_geocode_roads.cousub
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid
- ORDER BY subquery.rating;
- RETURN result;
- END IF;
- END IF;
- END IF;
-END;
-' LANGUAGE plpgsql;
-
-
-CREATE OR REPLACE FUNCTION geocode_address_countysub_fuzzy(REFCURSOR, INTEGER, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR) RETURNS REFCURSOR
-AS '
-DECLARE
- result REFCURSOR;
- address INTEGER;
- directionPrefix VARCHAR;
- streetName VARCHAR;
- streetTypeAbbrev VARCHAR;
- directionSuffix VARCHAR;
- state VARCHAR;
- location VARCHAR;
- tempString VARCHAR;
- tempInt INTEGER;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''geocode_address_countysub_fuzzy()'';
- END IF;
- -- The first step is to determine what weve been given, and if its enough.
- IF $1 IS NOT NULL THEN
- -- A cursor was provided, so use it. If none is given, an unnamed one is used.
- result := $1;
- END IF;
- IF $2 IS NULL THEN
- -- The address is mandatory.
- -- Without it, wed be wandering into strangers homes all the time.
- RAISE EXCEPTION ''geocode_address_countysub_fuzzy() - No address provided!'';
- ELSE
- address := $2;
- END IF;
- IF $3 IS NOT NULL THEN
- -- The direction prefix really isnt important.
- -- It will be used for rating if provided.
- directionPrefix := $3;
- END IF;
- IF $4 IS NULL THEN
- -- A street name must be given. Think about it.
- RAISE EXCEPTION ''geocode_address_countysub_fuzzy() - No street name provided!'';
- ELSE
- streetName := $4;
- END IF;
- IF $5 IS NOT NULL THEN
- -- A street type will be used for rating if provided, but isnt required.
- streetTypeAbbrev := $5;
- END IF;
- IF $6 IS NOT NULL THEN
- -- Same as direction prefix, only later.
- directionSuffix := $6;
- END IF;
- IF $7 IS NULL THEN
- -- location is mandatory. This is the location geocoder after all.
- RAISE EXCEPTION ''geocode_address_countysub_fuzzy() - No location provided!'';
- ELSE
- location := $7;
- END IF;
- IF $8 IS NOT NULL THEN
- state := $8;
- END IF;
-
- -- Check to see if the road name can be matched.
- IF state IS NOT NULL THEN
- SELECT INTO tempInt count(*) FROM tiger_geocode_roads
- WHERE soundex(location) = soundex(tiger_geocode_roads.cousub)
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state;
- ELSE
- SELECT INTO tempInt count(*) FROM tiger_geocode_roads
- WHERE soundex(location) = soundex(tiger_geocode_roads.cousub)
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename);
- END IF;
- IF verbose THEN
- RAISE NOTICE ''geocode_address_countysub_fuzzy() - % potential matches.'', tempInt;
- END IF;
- IF tempInt = 0 THEN
- RETURN NULL;
- ELSE
- -- The road name matches, now we check to see if the addresses match
- IF state IS NOT NULL THEN
- SELECT INTO tempInt count(*)
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs) as rating
- FROM tiger_geocode_roads
- WHERE soundex(location) = soundex(tiger_geocode_roads.cousub)
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid;
- ELSE
- SELECT INTO tempInt count(*)
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs) as rating
- FROM tiger_geocode_roads
- WHERE soundex(location) = soundex(tiger_geocode_roads.cousub)
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid;
- END IF;
- IF verbose THEN
- RAISE NOTICE ''geocode_address_countysub_fuzzy() - % address matches.'', tempInt;
- END IF;
- IF tempInt = 0 THEN
- return NULL;
- ELSE
- IF state IS NOT NULL THEN
- OPEN result FOR
- SELECT *, interpolate_from_address(address, roads_local.fraddl,
- roads_local.toaddl, roads_local.fraddr, roads_local.toaddr,
- roads_local.geom) as address_geom
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs, location,
- tiger_geocode_roads.cousub) as rating
- FROM tiger_geocode_roads
- WHERE soundex(location) = soundex(tiger_geocode_roads.cousub)
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid
- ORDER BY subquery.rating;
- return result;
- ELSE
- OPEN result FOR
- SELECT *, interpolate_from_address(address, roads_local.fraddl,
- roads_local.toaddl, roads_local.fraddr, roads_local.toaddr,
- roads_local.geom) as address_geom
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs, location,
- tiger_geocode_roads.cousub) as rating
- FROM tiger_geocode_roads
- WHERE soundex(location) = soundex(tiger_geocode_roads.cousub)
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid
- ORDER BY subquery.rating;
- RETURN result;
- END IF;
- END IF;
- END IF;
-END;
-' LANGUAGE plpgsql;
-
-
-CREATE OR REPLACE FUNCTION geocode_address_place_exact(REFCURSOR, INTEGER, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR) RETURNS REFCURSOR
-AS '
-DECLARE
- result REFCURSOR;
- address INTEGER;
- directionPrefix VARCHAR;
- streetName VARCHAR;
- streetTypeAbbrev VARCHAR;
- directionSuffix VARCHAR;
- state VARCHAR;
- location VARCHAR;
- tempString VARCHAR;
- tempInt INTEGER;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''geocode_address_place_exact()'';
- END IF;
- -- The first step is to determine what weve been given, and if its enough.
- IF $1 IS NOT NULL THEN
- -- A cursor was provided, so use it. If none is given, an unnamed one is used.
- result := $1;
- END IF;
- IF $2 IS NULL THEN
- -- The address is mandatory.
- -- Without it, wed be wandering into strangers homes all the time.
- RAISE EXCEPTION ''geocode_address_place_exact() - No address provided!'';
- ELSE
- address := $2;
- END IF;
- IF $3 IS NOT NULL THEN
- -- The direction prefix really isnt important.
- -- It will be used for rating if provided.
- directionPrefix := $3;
- END IF;
- IF $4 IS NULL THEN
- -- A street name must be given. Think about it.
- RAISE EXCEPTION ''geocode_address_place_exact() - No street name provided!'';
- ELSE
- streetName := $4;
- END IF;
- IF $5 IS NOT NULL THEN
- -- A street type will be used for rating if provided, but isnt required.
- streetTypeAbbrev := $5;
- END IF;
- IF $6 IS NOT NULL THEN
- -- Same as direction prefix, only later.
- directionSuffix := $6;
- END IF;
- IF $7 IS NULL THEN
- -- location is mandatory. This is the location geocoder after all.
- RAISE EXCEPTION ''geocode_address_place_exact() - No location provided!'';
- ELSE
- location := $7;
- END IF;
- IF $8 IS NOT NULL THEN
- state := $8;
- END IF;
-
- -- Check to see if the road name can be matched.
- IF state IS NOT NULL THEN
- SELECT INTO tempInt count(*) FROM tiger_geocode_roads
- WHERE location = tiger_geocode_roads.place
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state;
- ELSE
- SELECT INTO tempInt count(*) FROM tiger_geocode_roads
- WHERE location = tiger_geocode_roads.place
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename);
- END IF;
- IF verbose THEN
- RAISE NOTICE ''geocode_address_place_exact() - % potential matches.'', tempInt;
- END IF;
- IF tempInt = 0 THEN
- RETURN NULL;
- ELSE
- -- The road name matches, now we check to see if the addresses match
- IF state IS NOT NULL THEN
- SELECT INTO tempInt count(*)
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs) as rating
- FROM tiger_geocode_roads
- WHERE location = tiger_geocode_roads.place
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid;
- ELSE
- SELECT INTO tempInt count(*)
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs) as rating
- FROM tiger_geocode_roads
- WHERE location = tiger_geocode_roads.place
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid;
- END IF;
- IF verbose THEN
- RAISE NOTICE ''geocode_address_place_exact() - % address matches.'', tempInt;
- END IF;
- IF tempInt = 0 THEN
- return NULL;
- ELSE
- IF state IS NOT NULL THEN
- OPEN result FOR
- SELECT *, interpolate_from_address(address, roads_local.fraddl,
- roads_local.toaddl, roads_local.fraddr, roads_local.toaddr,
- roads_local.geom) as address_geom
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs, location,
- tiger_geocode_roads.place) as rating
- FROM tiger_geocode_roads
- WHERE location = tiger_geocode_roads.place
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid
- ORDER BY subquery.rating;
- return result;
- ELSE
- OPEN result FOR
- SELECT *, interpolate_from_address(address, roads_local.fraddl,
- roads_local.toaddl, roads_local.fraddr, roads_local.toaddr,
- roads_local.geom) as address_geom
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs, location,
- tiger_geocode_roads.place) as rating
- FROM tiger_geocode_roads
- WHERE location = tiger_geocode_roads.place
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid
- ORDER BY subquery.rating;
- RETURN result;
- END IF;
- END IF;
- END IF;
-END;
-' LANGUAGE plpgsql;
-
-
-
-CREATE OR REPLACE FUNCTION geocode_address_place_fuzzy(REFCURSOR, INTEGER, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR) RETURNS REFCURSOR
-AS '
-DECLARE
- result REFCURSOR;
- address INTEGER;
- directionPrefix VARCHAR;
- streetName VARCHAR;
- streetTypeAbbrev VARCHAR;
- directionSuffix VARCHAR;
- state VARCHAR;
- location VARCHAR;
- tempString VARCHAR;
- tempInt INTEGER;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''geocode_address_place_fuzzy()'';
- END IF;
- -- The first step is to determine what weve been given, and if its enough.
- IF $1 IS NOT NULL THEN
- -- If a cursor name was provided, use it; otherwise an unnamed one is used.
- result := $1;
- END IF;
- IF $2 IS NULL THEN
- -- The address is mandatory.
- -- Without it, wed be wandering into strangers homes all the time.
- RAISE EXCEPTION ''geocode_address_place_fuzzy() - No address provided!'';
- ELSE
- address := $2;
- END IF;
- IF $3 IS NOT NULL THEN
- -- The direction prefix really isnt important.
- -- It will be used for rating if provided.
- directionPrefix := $3;
- END IF;
- IF $4 IS NULL THEN
- -- A street name must be given. Think about it.
- RAISE EXCEPTION ''geocode_address_place_fuzzy() - No street name provided!'';
- ELSE
- streetName := $4;
- END IF;
- IF $5 IS NOT NULL THEN
- -- A street type will be used for rating if provided, but isnt required.
- streetTypeAbbrev := $5;
- END IF;
- IF $6 IS NOT NULL THEN
- -- Same as direction prefix, only later.
- directionSuffix := $6;
- END IF;
- IF $7 IS NULL THEN
- -- location is mandatory. This is the location geocoder after all.
- RAISE EXCEPTION ''geocode_address_place_fuzzy() - No location provided!'';
- ELSE
- location := $7;
- END IF;
- IF $8 IS NOT NULL THEN
- state := $8;
- END IF;
-
- -- Check to see if the road name can be matched.
- IF state IS NOT NULL THEN
- SELECT INTO tempInt count(*) FROM tiger_geocode_roads
- WHERE soundex(location) = soundex(tiger_geocode_roads.place)
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state;
- ELSE
- SELECT INTO tempInt count(*) FROM tiger_geocode_roads
- WHERE soundex(location) = soundex(tiger_geocode_roads.place)
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename);
- END IF;
- IF verbose THEN
- RAISE NOTICE ''geocode_address_place_fuzzy() - % potential matches.'', tempInt;
- END IF;
- IF tempInt = 0 THEN
- RETURN NULL;
- ELSE
- -- The road name matches, now we check to see if the addresses match
- IF state IS NOT NULL THEN
- SELECT INTO tempInt count(*)
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs) as rating
- FROM tiger_geocode_roads
- WHERE soundex(location) = soundex(tiger_geocode_roads.place)
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid;
- ELSE
- SELECT INTO tempInt count(*)
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs) as rating
- FROM tiger_geocode_roads
- WHERE soundex(location) = soundex(tiger_geocode_roads.place)
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid;
- END IF;
- IF verbose THEN
- RAISE NOTICE ''geocode_address_place_fuzzy() - % address matches.'', tempInt;
- END IF;
- IF tempInt = 0 THEN
- return NULL;
- ELSE
- IF state IS NOT NULL THEN
- OPEN result FOR
- SELECT *, interpolate_from_address(address, roads_local.fraddl,
- roads_local.toaddl, roads_local.fraddr, roads_local.toaddr,
- roads_local.geom) as address_geom
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs, location,
- tiger_geocode_roads.place) as rating
- FROM tiger_geocode_roads
- WHERE soundex(location) = soundex(tiger_geocode_roads.place)
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid
- ORDER BY subquery.rating;
- return result;
- ELSE
- OPEN result FOR
- SELECT *, interpolate_from_address(address, roads_local.fraddl,
- roads_local.toaddl, roads_local.fraddr, roads_local.toaddr,
- roads_local.geom) as address_geom
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs, location,
- tiger_geocode_roads.place) as rating
- FROM tiger_geocode_roads
- WHERE soundex(location) = soundex(tiger_geocode_roads.place)
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid
- ORDER BY subquery.rating;
- RETURN result;
- END IF;
- END IF;
- END IF;
-END;
-' LANGUAGE plpgsql;
-
-
-
-CREATE OR REPLACE FUNCTION geocode_address_state(REFCURSOR, INTEGER, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR) RETURNS REFCURSOR
-AS '
-DECLARE
- result REFCURSOR;
- address INTEGER;
- directionPrefix VARCHAR;
- streetName VARCHAR;
- streetTypeAbbrev VARCHAR;
- directionSuffix VARCHAR;
- state VARCHAR;
- tempString VARCHAR;
- tempInt INTEGER;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''geocode_address_state()'';
- END IF;
- -- The first step is to determine what weve been given, and if its enough.
- IF $1 IS NOT NULL THEN
- -- If a cursor name was provided, use it; otherwise an unnamed one is used.
- result := $1;
- END IF;
- IF $2 IS NULL THEN
- -- The address is mandatory.
- -- Without it, wed be wandering into strangers homes all the time.
- RAISE EXCEPTION ''geocode_address_state() - No address provided!'';
- ELSE
- address := $2;
- END IF;
- IF $3 IS NOT NULL THEN
- -- The direction prefix really isnt important.
- -- It will be used for rating if provided.
- directionPrefix := $3;
- END IF;
- IF $4 IS NULL THEN
- -- A street name must be given. Think about it.
- RAISE EXCEPTION ''geocode_address_state() - No street name provided!'';
- ELSE
- streetName := $4;
- END IF;
- IF $5 IS NOT NULL THEN
- -- A street type will be used for rating if provided, but isnt required.
- streetTypeAbbrev := $5;
- END IF;
- IF $6 IS NOT NULL THEN
- -- Same as direction prefix, only later.
- directionSuffix := $6;
- END IF;
- IF $7 IS NOT NULL THEN
- state := $7;
- ELSE
- -- It is unreasonable to do a country wide search. State is already
- -- pretty sketchy. No state, no search.
- RAISE EXCEPTION ''geocode_address_state() - No state name provided!'';
- END IF;
-
- -- Check to see if the road name can be matched.
- SELECT INTO tempInt count(*) FROM tiger_geocode_roads
- WHERE soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state;
- IF verbose THEN
- RAISE NOTICE ''geocode_address_state() - % potential matches.'', tempInt;
- END IF;
- IF tempInt = 0 THEN
- RETURN NULL;
- ELSE
- -- The road name matches, now we check to see if the addresses match
- SELECT INTO tempInt count(*)
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs) as rating
- FROM tiger_geocode_roads
- WHERE soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid;
- IF verbose THEN
- RAISE NOTICE ''geocode_address_state() - % address matches.'', tempInt;
- END IF;
- IF tempInt = 0 THEN
- return NULL;
- ELSE
- OPEN result FOR
- SELECT *, interpolate_from_address(address, roads_local.fraddl,
- roads_local.toaddl, roads_local.fraddr, roads_local.toaddr,
- roads_local.geom) as address_geom
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs) as rating
- FROM tiger_geocode_roads
- WHERE soundex(streetName) = soundex(tiger_geocode_roads.fename)
- AND state = tiger_geocode_roads.state
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid
- ORDER BY subquery.rating;
- return result;
- END IF;
- END IF;
-END;
-' LANGUAGE plpgsql;
-
-
-
-CREATE OR REPLACE FUNCTION geocode_address_zip(REFCURSOR, INTEGER, VARCHAR, VARCHAR, VARCHAR, VARCHAR, INTEGER) RETURNS REFCURSOR
-AS '
-DECLARE
- result REFCURSOR;
- address INTEGER;
- directionPrefix VARCHAR;
- streetName VARCHAR;
- streetTypeAbbrev VARCHAR;
- directionSuffix VARCHAR;
- zipCode INTEGER;
- tempString VARCHAR;
- tempInt INTEGER;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''geocode_address_zip()'';
- END IF;
- -- The first step is to determine what weve been given, and if its enough.
- IF $1 IS NOT NULL THEN
- -- If a cursor name was provided, use it; otherwise an unnamed one is used.
- result := $1;
- END IF;
- IF $2 IS NULL THEN
- -- The address is mandatory.
- -- Without it, wed be wandering into strangers homes all the time.
- RAISE EXCEPTION ''geocode_address_zip() - No address provided!'';
- ELSE
- address := $2;
- END IF;
- IF $3 IS NOT NULL THEN
- -- The direction prefix really isnt important.
- -- It will be used for rating if provided.
- directionPrefix := $3;
- END IF;
- IF $4 IS NULL THEN
- -- A street name must be given. Think about it.
- RAISE EXCEPTION ''geocode_address_zip() - No street name provided!'';
- ELSE
- streetName := $4;
- END IF;
- IF $5 IS NOT NULL THEN
- -- A street type will be used for rating if provided, but isnt required.
- streetTypeAbbrev := $5;
- END IF;
- IF $6 IS NOT NULL THEN
- -- Same as direction prefix, only later.
- directionSuffix := $6;
- END IF;
- IF $7 IS NULL THEN
- -- Zip code is not optional.
- RAISE EXCEPTION ''geocode_address_zip() - No zip provided!'';
- ELSE
- zipCode := $7;
- END IF;
-
- -- Check to see if the road name can be matched.
- SELECT INTO tempInt count(*) FROM tiger_geocode_roads
- WHERE zipCode = tiger_geocode_roads.zip
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename);
- IF tempInt = 0 THEN
- return NULL;
- ELSE
- -- The road name matches, now we check to see if the addresses match
- SELECT INTO tempInt count(*)
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs) as rating
- FROM tiger_geocode_roads
- WHERE zipCode = tiger_geocode_roads.zip
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid;
- IF tempInt = 0 THEN
- return NULL;
- ELSE
- OPEN result FOR
- SELECT *, interpolate_from_address(address, roads_local.fraddl,
- roads_local.toaddl, roads_local.fraddr, roads_local.toaddr,
- roads_local.geom) as address_geom
- FROM (
- SELECT *, rate_attributes(directionPrefix, tiger_geocode_roads.fedirp,
- streetName, tiger_geocode_roads.fename, streetTypeAbbrev,
- tiger_geocode_roads.fetype, directionSuffix,
- tiger_geocode_roads.fedirs) as rating
- FROM tiger_geocode_roads
- WHERE zipCode = tiger_geocode_roads.zip
- AND soundex(streetName) = soundex(tiger_geocode_roads.fename)
- ) AS subquery, tiger_geocode_join, roads_local
- WHERE includes_address(address, roads_local.fraddl, roads_local.toaddl,
- roads_local.fraddr, roads_local.toaddr)
- AND subquery.id = tiger_geocode_join.id
- AND tiger_geocode_join.tlid = roads_local.tlid
- ORDER BY subquery.rating;
- return result;
- END IF;
- END IF;
-END;
-' LANGUAGE plpgsql;
-
-
-
--- Returns a string consisting of the last N words. Words are allowed
--- to be separated only by spaces, but multiple spaces between
--- words are allowed. Words must be alphanumeric.
--- If more words are requested than exist, the full input string is
--- returned.
-CREATE OR REPLACE FUNCTION get_last_words(VARCHAR, INTEGER) RETURNS VARCHAR
-AS '
-DECLARE
- inputString VARCHAR;
- tempString VARCHAR;
- count INTEGER;
- result VARCHAR := '''';
-BEGIN
- IF $1 IS NULL THEN
- return NULL;
- ELSE
- inputString := $1;
- END IF;
- IF $2 IS NULL THEN
- RAISE EXCEPTION ''get_last_words() - word count is null!'';
- ELSE
- count := $2;
- END IF;
- FOR i IN 1..count LOOP
- tempString := substring(inputString from ''((?: )+[a-zA-Z0-9_]*)'' || result || ''$'');
- IF tempString IS NULL THEN
- return inputString;
- END IF;
- result := tempString || result;
- END LOOP;
- result := trim(both from result);
- return result;
-END;
-' LANGUAGE plpgsql;
-
-
-
--- This function converts the string addresses to integers and passes them
--- to the other includes_address function.
-CREATE OR REPLACE FUNCTION includes_address(INTEGER, VARCHAR, VARCHAR, VARCHAR, VARCHAR) RETURNS BOOLEAN
-AS '
-DECLARE
- given_address INTEGER;
- addr1 INTEGER;
- addr2 INTEGER;
- addr3 INTEGER;
- addr4 INTEGER;
- result BOOLEAN;
-BEGIN
- given_address = $1;
- addr1 = to_number($2, ''999999'');
- addr2 = to_number($3, ''999999'');
- addr3 = to_number($4, ''999999'');
- addr4 = to_number($5, ''999999'');
- result = includes_address(given_address, addr1, addr2, addr3, addr4);
- RETURN result;
-END
-' LANGUAGE plpgsql;
-
-
-
--- This function requires the addresses to be grouped, such that the second and
--- third arguments are from one side of the street, and the fourth and fifth
--- from the other.
-CREATE OR REPLACE FUNCTION includes_address(INTEGER, INTEGER, INTEGER, INTEGER, INTEGER) RETURNS BOOLEAN
-AS '
-DECLARE
- given_address INTEGER;
- addr1 INTEGER;
- addr2 INTEGER;
- addr3 INTEGER;
- addr4 INTEGER;
- lmaxaddr INTEGER := -1;
- rmaxaddr INTEGER := -1;
- lminaddr INTEGER := -1;
- rminaddr INTEGER := -1;
- maxaddr INTEGER := -1;
- minaddr INTEGER := -1;
-BEGIN
- IF $1 IS NULL THEN
- RAISE EXCEPTION ''includes_address() - local address is NULL!'';
- ELSE
- given_address := $1;
- END IF;
-
- IF $2 IS NOT NULL THEN
- addr1 := $2;
- maxaddr := addr1;
- minaddr := addr1;
- lmaxaddr := addr1;
- lminaddr := addr1;
- END IF;
-
- IF $3 IS NOT NULL THEN
- addr2 := $3;
- IF addr2 < minaddr OR minaddr = -1 THEN
- minaddr := addr2;
- END IF;
- IF addr2 > maxaddr OR maxaddr = -1 THEN
- maxaddr := addr2;
- END IF;
- IF addr2 > lmaxaddr OR lmaxaddr = -1 THEN
- lmaxaddr := addr2;
- END IF;
- IF addr2 < lminaddr OR lminaddr = -1 THEN
- lminaddr := addr2;
- END IF;
- END IF;
-
- IF $4 IS NOT NULL THEN
- addr3 := $4;
- IF addr3 < minaddr OR minaddr = -1 THEN
- minaddr := addr3;
- END IF;
- IF addr3 > maxaddr OR maxaddr = -1 THEN
- maxaddr := addr3;
- END IF;
- rmaxaddr := addr3;
- rminaddr := addr3;
- END IF;
-
- IF $5 IS NOT NULL THEN
- addr4 := $5;
- IF addr4 < minaddr OR minaddr = -1 THEN
- minaddr := addr4;
- END IF;
- IF addr4 > maxaddr OR maxaddr = -1 THEN
- maxaddr := addr4;
- END IF;
- IF addr4 > rmaxaddr OR rmaxaddr = -1 THEN
- rmaxaddr := addr4;
- END IF;
- IF addr4 < rminaddr OR rminaddr = -1 THEN
- rminaddr := addr4;
- END IF;
- END IF;
-
- IF minaddr = -1 OR maxaddr = -1 THEN
- -- No addresses were non-null, return FALSE (arbitrary)
- RETURN FALSE;
- ELSIF given_address >= minaddr AND given_address <= maxaddr THEN
- -- The address is within the given range
- IF given_address >= lminaddr AND given_address <= lmaxaddr THEN
- -- This checks to see if the address is on this side of the
- -- road, ie if the address is even, the street range must be even
- IF (given_address % 2) = (lminaddr % 2)
- OR (given_address % 2) = (lmaxaddr % 2) THEN
- RETURN TRUE;
- END IF;
- END IF;
- IF given_address >= rminaddr AND given_address <= rmaxaddr THEN
- -- See above
- IF (given_address % 2) = (rminaddr % 2)
- OR (given_address % 2) = (rmaxaddr % 2) THEN
- RETURN TRUE;
- END IF;
- END IF;
- END IF;
- -- The address is not within the range
- RETURN FALSE;
-END;
-
-' LANGUAGE plpgsql;
-
-
-
--- This function converts string addresses to integers and passes them to
--- the other interpolate_from_address function.
-CREATE OR REPLACE FUNCTION interpolate_from_address(INTEGER, VARCHAR, VARCHAR, VARCHAR, VARCHAR, GEOMETRY) RETURNS GEOMETRY
-AS '
-DECLARE
- given_address INTEGER;
- addr1 INTEGER;
- addr2 INTEGER;
- addr3 INTEGER;
- addr4 INTEGER;
- road GEOMETRY;
- result GEOMETRY;
-BEGIN
- given_address := $1;
- addr1 := to_number($2, ''999999'');
- addr2 := to_number($3, ''999999'');
- addr3 := to_number($4, ''999999'');
- addr4 := to_number($5, ''999999'');
- road := $6;
- result = interpolate_from_address(given_address, addr1, addr2, addr3, addr4, road);
- RETURN result;
-END
-' LANGUAGE plpgsql;
-
--- interpolate_from_address(local_address, from_address_l, to_address_l, from_address_r, to_address_r, local_road)
--- This function returns a point along the given geometry (must be linestring)
--- corresponding to the given address. If the given address is not within
--- the address range of the road, null is returned.
--- This function requires that the address be grouped, such that the second and
--- third arguments are from one side of the street, while the fourth and
--- fifth are from the other.
-CREATE OR REPLACE FUNCTION interpolate_from_address(INTEGER, INTEGER, INTEGER, INTEGER, INTEGER, GEOMETRY) RETURNS GEOMETRY
-AS '
-DECLARE
- given_address INTEGER;
- lmaxaddr INTEGER := -1;
- rmaxaddr INTEGER := -1;
- lminaddr INTEGER := -1;
- rminaddr INTEGER := -1;
- lfrgreater BOOLEAN;
- rfrgreater BOOLEAN;
- frgreater BOOLEAN;
- addrwidth INTEGER;
- part DOUBLE PRECISION;
- road GEOMETRY;
- result GEOMETRY;
-BEGIN
- IF $1 IS NULL THEN
- RAISE EXCEPTION ''interpolate_from_address() - local address is NULL!'';
- ELSE
- given_address := $1;
- END IF;
-
- IF $6 IS NULL THEN
- RAISE EXCEPTION ''interpolate_from_address() - local road is NULL!'';
- ELSE
- IF geometrytype($6) = ''LINESTRING'' THEN
- road := $6;
- ELSIF geometrytype($6) = ''MULTILINESTRING'' THEN
- road := geometryn($6,1);
- ELSE
- RAISE EXCEPTION ''interpolate_from_address() - local road is not a line!'';
- END IF;
- END IF;
-
- IF $2 IS NOT NULL THEN
- lfrgreater := TRUE;
- lmaxaddr := $2;
- lminaddr := $2;
- END IF;
-
- IF $3 IS NOT NULL THEN
- IF $3 > lmaxaddr OR lmaxaddr = -1 THEN
- lmaxaddr := $3;
- lfrgreater := FALSE;
- END IF;
- IF $3 < lminaddr OR lminaddr = -1 THEN
- lminaddr := $3;
- END IF;
- END IF;
-
- IF $4 IS NOT NULL THEN
- rmaxaddr := $4;
- rminaddr := $4;
- rfrgreater := TRUE;
- END IF;
-
- IF $5 IS NOT NULL THEN
- IF $5 > rmaxaddr OR rmaxaddr = -1 THEN
- rmaxaddr := $5;
- rfrgreater := FALSE;
- END IF;
- IF $5 < rminaddr OR rminaddr = -1 THEN
- rminaddr := $5;
- END IF;
- END IF;
-
- IF given_address >= lminaddr AND given_address <= lmaxaddr THEN
- IF (given_address % 2) = (lminaddr % 2)
- OR (given_address % 2) = (lmaxaddr % 2) THEN
- addrwidth := lmaxaddr - lminaddr;
- part := (given_address - lminaddr) / trunc(addrwidth, 1);
- frgreater := lfrgreater;
- END IF;
- END IF;
- IF given_address >= rminaddr AND given_address <= rmaxaddr THEN
- IF (given_address % 2) = (rminaddr % 2)
- OR (given_address % 2) = (rmaxaddr % 2) THEN
- addrwidth := rmaxaddr - rminaddr;
- part := (given_address - rminaddr) / trunc(addrwidth, 1);
- frgreater := rfrgreater;
- END IF;
- END IF;
-
- -- Neither side of the street matched the address range and parity.
- IF part IS NULL THEN
- RETURN null;
- END IF;
-
- IF frgreater THEN
- part := 1 - part;
- END IF;
-
- result = line_interpolate_point(road, part);
- RETURN result;
-END;
-' LANGUAGE plpgsql;
-
-
--- This function determines the levenshtein distance irrespective of case.
-CREATE OR REPLACE FUNCTION levenshtein_ignore_case(VARCHAR, VARCHAR) RETURNS INTEGER
-AS '
-DECLARE
- result INTEGER;
-BEGIN
- result := levenshtein(upper($1), upper($2));
- RETURN result;
-END
-' LANGUAGE plpgsql;
-
--- This function takes two arguments. The first is the "given string" and
--- must not be null. The second argument is the "compare string" and may
--- or may not be null. If the second string is null or empty, the value
--- returned is 3, otherwise it is the levenshtein distance between the two.
-CREATE OR REPLACE FUNCTION nullable_levenshtein(VARCHAR, VARCHAR) RETURNS INTEGER
-AS '
-DECLARE
- given_string VARCHAR;
- result INTEGER := 3;
-BEGIN
- IF $1 IS NULL THEN
- RAISE EXCEPTION ''nullable_levenshtein - given string is NULL!'';
- ELSE
- given_string := $1;
- END IF;
-
- IF $2 IS NOT NULL AND $2 != '''' THEN
- result := levenshtein_ignore_case(given_string, $2);
- END IF;
-
- RETURN result;
-END
-' LANGUAGE plpgsql;
-
-
-
--- location_extract(streetAddressString, stateAbbreviation)
--- This function extracts a location name from the end of the given string.
--- The first attempt is to find an exact match against the place_lookup
--- table, then against the countysub_lookup table. If both fail, a
--- word-by-word soundex match is tried against place_lookup and then
--- against countysub_lookup. If multiple fuzzy candidates are found, the
--- one with the smallest levenshtein distance from the given string is
--- assumed to be the correct one.
---
--- The section of the given string corresponding to the location found is
--- returned, rather than the string found from the tables. All the searching
--- is done largely to determine the length (words) of the location, to allow
--- the intended street name to be correctly identified.
-CREATE OR REPLACE FUNCTION location_extract(VARCHAR, VARCHAR) RETURNS VARCHAR
-AS '
-DECLARE
- fullStreet VARCHAR;
- stateAbbrev VARCHAR;
- location VARCHAR;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''location_extract()'';
- END IF;
- IF $1 IS NULL THEN
- RAISE EXCEPTION ''location_extract() - No input given!'';
- ELSE
- fullStreet := $1;
- END IF;
- IF $2 IS NOT NULL THEN
- stateAbbrev := $2;
- END IF;
- location := location_extract_place_exact(fullStreet, stateAbbrev);
- IF location IS NULL THEN
- location := location_extract_countysub_exact(fullStreet, stateAbbrev);
- IF location IS NULL THEN
- location := location_extract_place_fuzzy(fullStreet, stateAbbrev);
- IF location IS NULL THEN
- location := location_extract_countysub_fuzzy(fullStreet, stateAbbrev);
- END IF;
- END IF;
- END IF;
- return location;
-END;
-' LANGUAGE plpgsql;
-
-
-
--- location_extract_countysub_exact(string, stateAbbrev)
--- This function checks the countysub_lookup table to find an exact
--- (case-insensitive) match to the location described at the end of the
--- given string. No fuzzy matching is attempted here. The location as
--- found in the given string is returned, or null if nothing matches.
-CREATE OR REPLACE FUNCTION location_extract_countysub_exact(VARCHAR, VARCHAR) RETURNS VARCHAR
-AS '
-DECLARE
- fullStreet VARCHAR;
- ws VARCHAR;
- tempString VARCHAR;
- location VARCHAR;
- tempInt INTEGER;
- word_count INTEGER;
- stateAbbrev VARCHAR;
- rec RECORD;
- test BOOLEAN;
- result VARCHAR;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''location_extract_countysub_exact()'';
- END IF;
- IF $1 IS NULL THEN
- RAISE EXCEPTION ''location_extract_countysub_exact() - No input given!'';
- ELSE
- fullStreet := $1;
- END IF;
- IF $2 IS NOT NULL THEN
- stateAbbrev := $2;
- END IF;
- ws := ''[ ,\.\n\f\t]'';
- -- No hope of determining the location from place. Try countysub.
- IF stateAbbrev IS NOT NULL THEN
- SELECT INTO tempInt count(*) FROM countysub_lookup
- WHERE countysub_lookup.state = stateAbbrev
- AND texticregexeq(fullStreet, ''(?i)'' || name || ''$'');
- ELSE
- SELECT INTO tempInt count(*) FROM countysub_lookup
- WHERE texticregexeq(fullStreet, ''(?i)'' || name || ''$'');
- END IF;
- IF tempInt > 0 THEN
- IF stateAbbrev IS NOT NULL THEN
- FOR rec IN SELECT substring(fullStreet, ''(?i)(''
- || name || '')$'') AS value, name FROM countysub_lookup
- WHERE countysub_lookup.state = stateAbbrev
- AND texticregexeq(fullStreet, ''(?i)'' || ws || name ||
- ''$'') ORDER BY length(name) DESC LOOP
- -- Only the first result is needed.
- location := rec.value;
- EXIT;
- END LOOP;
- ELSE
- FOR rec IN SELECT substring(fullStreet, ''(?i)(''
- || name || '')$'') AS value, name FROM countysub_lookup
- WHERE texticregexeq(fullStreet, ''(?i)'' || ws || name ||
- ''$'') ORDER BY length(name) DESC LOOP
- -- again, only the first is needed.
- location := rec.value;
- EXIT;
- END LOOP;
- END IF;
- END IF;
- RETURN location;
-END;
-' LANGUAGE plpgsql;
-
-
--- location_extract_countysub_fuzzy(string, stateAbbrev)
--- This function checks the countysub_lookup table to find a fuzzy
--- (word-by-word soundex) match to the location described at the end of
--- the given string. Among the candidates, the one with the smallest
--- levenshtein distance wins. The location as found in the given string
--- is returned, or null if nothing matches.
-CREATE OR REPLACE FUNCTION location_extract_countysub_fuzzy(VARCHAR, VARCHAR) RETURNS VARCHAR
-AS '
-DECLARE
- fullStreet VARCHAR;
- ws VARCHAR;
- tempString VARCHAR;
- location VARCHAR;
- tempInt INTEGER;
- word_count INTEGER;
- stateAbbrev VARCHAR;
- rec RECORD;
- test BOOLEAN;
- result VARCHAR;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''location_extract_countysub_fuzzy()'';
- END IF;
- IF $1 IS NULL THEN
- RAISE EXCEPTION ''location_extract_countysub_fuzzy() - No input given!'';
- ELSE
- fullStreet := $1;
- END IF;
- IF $2 IS NOT NULL THEN
- stateAbbrev := $2;
- END IF;
- ws := ''[ ,\.\n\f\t]'';
-
- -- Fuzzy matching.
- tempString := substring(fullStreet, ''(?i)'' || ws ||
- ''([a-zA-Z0-9]+)$'');
- IF tempString IS NULL THEN
- tempString := fullStreet;
- END IF;
- IF stateAbbrev IS NOT NULL THEN
- SELECT INTO tempInt count(*) FROM countysub_lookup
- WHERE countysub_lookup.state = stateAbbrev
- AND soundex(tempString) = end_soundex(name);
- ELSE
- SELECT INTO tempInt count(*) FROM countysub_lookup
- WHERE soundex(tempString) = end_soundex(name);
- END IF;
- IF tempInt > 0 THEN
- tempInt := 50;
- -- Some potentials were found. Begin a word-by-word soundex on each.
- IF stateAbbrev IS NOT NULL THEN
- FOR rec IN SELECT name FROM countysub_lookup
- WHERE countysub_lookup.state = stateAbbrev
- AND soundex(tempString) = end_soundex(name) LOOP
- word_count := count_words(rec.name);
- test := TRUE;
- tempString := get_last_words(fullStreet, word_count);
- FOR i IN 1..word_count LOOP
- IF soundex(split_part(tempString, '' '', i)) !=
- soundex(split_part(rec.name, '' '', i)) THEN
- test := FALSE;
- END IF;
- END LOOP;
- IF test THEN
- -- The soundex matched, determine if the distance is better.
- IF levenshtein_ignore_case(rec.name, tempString) < tempInt THEN
- location := tempString;
- tempInt := levenshtein_ignore_case(rec.name, tempString);
- END IF;
- END IF;
- END LOOP;
- ELSE
- FOR rec IN SELECT name FROM countysub_lookup
- WHERE soundex(tempString) = end_soundex(name) LOOP
- word_count := count_words(rec.name);
- test := TRUE;
- tempString := get_last_words(fullStreet, word_count);
- FOR i IN 1..word_count LOOP
- IF soundex(split_part(tempString, '' '', i)) !=
- soundex(split_part(rec.name, '' '', i)) THEN
- test := FALSE;
- END IF;
- END LOOP;
- IF test THEN
- -- The soundex matched, determine if the distance is better.
- IF levenshtein_ignore_case(rec.name, tempString) < tempInt THEN
- location := tempString;
- tempInt := levenshtein_ignore_case(rec.name, tempString);
- END IF;
- END IF;
- END LOOP;
- END IF;
- END IF; -- If no fuzzy matches were found, leave location null.
- RETURN location;
-END;
-' LANGUAGE plpgsql;
-
-
-
--- location_extract_place_exact(string, stateAbbrev)
--- This function checks the place_lookup table to find an exact
--- (case-insensitive) match to the location described at the end of the
--- given string. No fuzzy matching is attempted here. The location as
--- found in the given string is returned, or null if nothing matches.
-CREATE OR REPLACE FUNCTION location_extract_place_exact(VARCHAR, VARCHAR) RETURNS VARCHAR
-AS '
-DECLARE
- fullStreet VARCHAR;
- ws VARCHAR;
- tempString VARCHAR;
- location VARCHAR;
- tempInt INTEGER;
- word_count INTEGER;
- stateAbbrev VARCHAR;
- rec RECORD;
- test BOOLEAN;
- result VARCHAR;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''location_extract_place_exact()'';
- END IF;
- IF $1 IS NULL THEN
- RAISE EXCEPTION ''location_extract_place_exact() - No input given!'';
- ELSE
- fullStreet := $1;
- END IF;
- IF verbose THEN
- RAISE NOTICE ''location_extract_place_exact() - input: "%"'', fullStreet;
- END IF;
- IF $2 IS NOT NULL THEN
- stateAbbrev := $2;
- END IF;
- ws := ''[ ,\.\n\f\t]'';
- -- Try for an exact match against places
- IF stateAbbrev IS NOT NULL THEN
- SELECT INTO tempInt count(*) FROM place_lookup
- WHERE place_lookup.state = stateAbbrev
- AND texticregexeq(fullStreet, ''(?i)'' || name || ''$'');
- ELSE
- SELECT INTO tempInt count(*) FROM place_lookup
- WHERE texticregexeq(fullStreet, ''(?i)'' || name || ''$'');
- END IF;
- IF verbose THEN
- RAISE NOTICE ''location_extract_place_exact() - Exact Matches %'', tempInt;
- END IF;
- IF tempInt > 0 THEN
- -- Some matches were found. Look for the last one in the string.
- IF stateAbbrev IS NOT NULL THEN
- FOR rec IN SELECT substring(fullStreet, ''(?i)(''
- || name || '')$'') AS value, name FROM place_lookup
- WHERE place_lookup.state = stateAbbrev
- AND texticregexeq(fullStreet, ''(?i)''
- || name || ''$'') ORDER BY length(name) DESC LOOP
- -- Since the regex is end of string, only the longest (first) result
- -- is useful.
- location := rec.value;
- EXIT;
- END LOOP;
- ELSE
- FOR rec IN SELECT substring(fullStreet, ''(?i)(''
- || name || '')$'') AS value, name FROM place_lookup
- WHERE texticregexeq(fullStreet, ''(?i)''
- || name || ''$'') ORDER BY length(name) DESC LOOP
- -- Since the regex is end of string, only the longest (first) result
- -- is useful.
- location := rec.value;
- EXIT;
- END LOOP;
- END IF;
- END IF;
- RETURN location;
-END;
-' LANGUAGE plpgsql;
-
-
-
--- location_extract_place_fuzzy(string, stateAbbrev)
--- This function checks the place_lookup table to find a fuzzy
--- (word-by-word soundex) match to the location described at the end of
--- the given string. Among the candidates, the one with the smallest
--- levenshtein distance wins. The location as found in the given string
--- is returned, or null if nothing matches.
-CREATE OR REPLACE FUNCTION location_extract_place_fuzzy(VARCHAR, VARCHAR) RETURNS VARCHAR
-AS '
-DECLARE
- fullStreet VARCHAR;
- ws VARCHAR;
- tempString VARCHAR;
- location VARCHAR;
- tempInt INTEGER;
- word_count INTEGER;
- stateAbbrev VARCHAR;
- rec RECORD;
- test BOOLEAN;
- result VARCHAR;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''location_extract_place_fuzzy()'';
- END IF;
- IF $1 IS NULL THEN
- RAISE EXCEPTION ''location_extract_place_fuzzy() - No input given!'';
- ELSE
- fullStreet := $1;
- END IF;
- IF verbose THEN
- RAISE NOTICE ''location_extract_place_fuzzy() - input: "%"'', fullStreet;
- END IF;
- IF $2 IS NOT NULL THEN
- stateAbbrev := $2;
- END IF;
- ws := ''[ ,\.\n\f\t]'';
-
- tempString := substring(fullStreet, ''(?i)'' || ws
- || ''([a-zA-Z0-9]+)$'');
- IF tempString IS NULL THEN
- tempString := fullStreet;
- END IF;
- IF stateAbbrev IS NOT NULL THEN
- SELECT into tempInt count(*) FROM place_lookup
- WHERE place_lookup.state = stateAbbrev
- AND soundex(tempString) = end_soundex(name);
- ELSE
- SELECT into tempInt count(*) FROM place_lookup
- WHERE soundex(tempString) = end_soundex(name);
- END IF;
- IF verbose THEN
- RAISE NOTICE ''location_extract_place_fuzzy() - Fuzzy matches %'', tempInt;
- END IF;
- IF tempInt > 0 THEN
- -- Some potentials were found. Begin a word-by-word soundex on each.
- tempInt := 50;
- IF stateAbbrev IS NOT NULL THEN
- FOR rec IN SELECT name FROM place_lookup
- WHERE place_lookup.state = stateAbbrev
- AND soundex(tempString) = end_soundex(name) LOOP
- IF verbose THEN
- RAISE NOTICE ''location_extract_place_fuzzy() - Fuzzy: "%"'', rec.name;
- END IF;
- word_count := count_words(rec.name);
- test := TRUE;
- tempString := get_last_words(fullStreet, word_count);
- FOR i IN 1..word_count LOOP
- IF soundex(split_part(tempString, '' '', i)) !=
- soundex(split_part(rec.name, '' '', i)) THEN
- IF verbose THEN
- RAISE NOTICE ''location_extract_place_fuzzy() - No Match.'';
- END IF;
- test := FALSE;
- END IF;
- END LOOP;
- IF test THEN
- -- The soundex matched, determine if the distance is better.
- IF levenshtein_ignore_case(rec.name, tempString) < tempInt THEN
- location := tempString;
- tempInt := levenshtein_ignore_case(rec.name, tempString);
- END IF;
- END IF;
- END LOOP;
- ELSE
- FOR rec IN SELECT name FROM place_lookup
- WHERE soundex(tempString) = end_soundex(name) LOOP
- word_count := count_words(rec.name);
- test := TRUE;
- tempString := get_last_words(fullStreet, word_count);
- FOR i IN 1..word_count LOOP
- IF soundex(split_part(tempString, '' '', i)) !=
- soundex(split_part(rec.name, '' '', i)) THEN
- test := FALSE;
- END IF;
- END LOOP;
- IF test THEN
- -- The soundex matched, determine if the distance is better.
- IF levenshtein_ignore_case(rec.name, tempString) < tempInt THEN
- location := tempString;
- tempInt := levenshtein_ignore_case(rec.name, tempString);
- END IF;
- END IF;
- END LOOP;
- END IF;
- END IF;
- RETURN location;
-END;
-' LANGUAGE plpgsql;
-
-
-
--- normalize_address(addressString)
--- This takes an address string and parses it into address (internal/street),
--- street name, type, direction prefix and suffix, location, state and
--- zip code, depending on what can be found in the string.
---
--- The US postal address standard is used:
--- <Street Number> <Direction Prefix> <Street Name> <Street Type>
--- <Direction Suffix> <Internal Address> <Location> <State> <Zip Code>
---
--- State is assumed to be included in the string, and MUST be matchable to
--- something in the state_lookup table. Fuzzy matching is used if no direct
--- match is found.
---
--- Two formats of zip code are acceptable: five digit, and five + 4.
---
--- The internal addressing indicators are looked up from the
--- secondary_unit_lookup table. A following identifier is accepted
--- but it must start with a digit.
---
--- The location is parsed from the string using other indicators, such
--- as street type, direction suffix or internal address, if available.
--- If these are not, the location is extracted using comparisons against
--- the place_lookup table, then the countysub_lookup table to determine
--- what, in the original string, is intended to be the location. In both
--- cases, an exact match is first pursued, then a word-by-word fuzzy match.
--- The result is not the name of the location from the tables, but the
--- section of the given string that corresponds to the name from the tables.
---
--- Zip codes and street names are not validated.
---
--- Direction indicators are extracted by comparison with the direction_lookup
--- table.
---
--- Street addresses are assumed to be a single word, starting with a number.
--- Address is mandatory; if no address is given, and the street is numbered,
--- the resulting address will be the street name, and the street name
--- will be an empty string.
---
--- In some cases, the street type is part of the street name.
--- eg State Hwy 22a. As long as the word following the type starts with a
--- number (this is usually the case) this will be caught. Some street names
--- include a type name, and have a street type that differs. This will be
--- handled properly, so long as both are given. If the street type is
--- omitted, the type included in the street name will be parsed as the street type.
---
--- The output is currently a colon separated list of values:
--- InternalAddress:StreetAddress:DirectionPrefix:StreetName:StreetType:
--- DirectionSuffix:Location:State:ZipCode
--- This returns each element as entered. It's mainly meant for debugging.
--- There is also another option that returns:
--- StreetAddress:DirectionPrefixAbbreviation:StreetName:StreetTypeAbbreviation:
--- DirectionSuffixAbbreviation:Location:StateAbbreviation:ZipCode
--- This is more standardized and better for use with a geocoder.
-CREATE OR REPLACE FUNCTION normalize_address(VARCHAR) RETURNS VARCHAR
-AS '
-DECLARE
- rawInput VARCHAR;
- address VARCHAR;
- preDir VARCHAR;
- preDirAbbrev VARCHAR;
- postDir VARCHAR;
- postDirAbbrev VARCHAR;
- fullStreet VARCHAR;
- reducedStreet VARCHAR;
- streetName VARCHAR;
- streetType VARCHAR;
- streetTypeAbbrev VARCHAR;
- internal VARCHAR;
- location VARCHAR;
- state VARCHAR;
- stateAbbrev VARCHAR;
- tempString VARCHAR;
- tempInt INTEGER;
- result VARCHAR;
- zip VARCHAR;
- test BOOLEAN;
- working REFCURSOR;
- rec RECORD;
- ws VARCHAR;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''normalize_address()'';
- END IF;
- IF $1 IS NULL THEN
- RAISE EXCEPTION ''normalize_address() - address string is null!'';
- ELSE
- rawInput := $1;
- END IF;
- ws := ''[ ,\.\t\n\f\r]'';
-
- -- Assume that the address begins with a digit, and extract it from
- -- the input string.
- address := substring(rawInput from ''^([0-9].*?)[ ,/.]'');
-
- -- There are two formats for zip code, the normal 5 digit, and
- -- the nine digit zip-4. It may also not exist.
- zip := substring(rawInput from ws || ''([0-9]{5})$'');
- IF zip IS NULL THEN
- zip := substring(rawInput from ws || ''([0-9]{5})-[0-9]{4}$'');
- END IF;
-
- IF zip IS NOT NULL THEN
- fullStreet := substring(rawInput from ''(.*)''
- || ws || ''+'' || cull_null(zip) || ''[- ]?([0-9]{4})?$'');
- ELSE
- fullStreet := rawInput;
- END IF;
- IF verbose THEN
- RAISE NOTICE ''normalize_address() - after zip extract "%"'', fullStreet;
- END IF;
- tempString := state_extract(fullStreet);
- IF tempString IS NOT NULL THEN
- state := split_part(tempString, '':'', 1);
- stateAbbrev := split_part(tempString, '':'', 2);
- END IF;
-
- -- The easiest case is if the address is comma delimited. There are some
- -- likely cases:
- -- street level, location, state
- -- street level, location state
- -- street level, location
- -- street level, internal address, location, state
- -- street level, internal address, location state
- -- street level, internal address location state
- -- street level, internal address, location
- -- street level, internal address location
- -- The first three are useful.
- tempString := substring(fullStreet, ''(?i),'' || ws || ''+(.*)(,?'' || ws ||
- ''+'' || cull_null(state) || ''|$)'');
- IF tempString IS NOT NULL THEN
- location := tempString;
- IF address IS NOT NULL THEN
- fullStreet := substring(fullStreet, ''(?i)'' || address || ws ||
- ''+(.*),'' || ws || ''+'' || location);
- ELSE
- fullStreet := substring(fullStreet, ''(?i)(.*),'' || ws || ''+'' ||
- location);
- END IF;
- IF verbose THEN
- RAISE NOTICE ''normalize_address() - Parsed by punctuation.'';
- RAISE NOTICE ''normalize_address() - Location "%"'', location;
- RAISE NOTICE ''normalize_address() - FullStreet "%"'', fullStreet;
- END IF;
- END IF;
-
- -- Pull out the full street information, defined as everything between the
- -- address and the state. This includes the location.
- -- This doesnt need to be done if location has already been found.
- IF location IS NULL THEN
- IF address IS NOT NULL THEN
- IF state IS NOT NULL THEN
- fullStreet := substring(fullStreet, ''(?i)'' || address ||
- ws || ''+(.*?)'' || ws || ''+'' || state);
- ELSE
- fullStreet := substring(fullStreet, ''(?i)'' || address ||
- ws || ''+(.*?)'');
- END IF;
- ELSE
- IF state IS NOT NULL THEN
- fullStreet := substring(fullStreet, ''(?i)(.*?)'' || ws ||
- ''+'' || state);
- ELSE
- fullStreet := substring(fullStreet, ''(?i)(.*?)'');
- END IF;
- END IF;
- END IF;
- IF verbose THEN
- RAISE NOTICE ''normalize_address() - after addy extract "%"'', fullStreet;
- END IF;
-
- -- Determine if any internal address is included, such as apartment
- -- or suite number.
- SELECT INTO tempInt count(*) FROM secondary_unit_lookup
- WHERE texticregexeq(fullStreet, ''(?i)'' || ws || name || ''(''
- || ws || ''|$)'');
- IF tempInt = 1 THEN
- SELECT INTO internal substring(fullStreet, ''(?i)'' || ws || ''(''
- || name || ws || ''*#?'' || ws
- || ''*(?:[0-9][0-9a-zA-Z\-]*)?'' || '')(?:'' || ws || ''|$)'')
- FROM secondary_unit_lookup
- WHERE texticregexeq(fullStreet, ''(?i)'' || ws || name || ''(''
- || ws || ''|$)'');
- ELSIF tempInt > 1 THEN
- -- In the event of multiple matches to a secondary unit designation, we
- -- will assume that the last one is the true one.
- tempInt := 0;
- FOR rec in SELECT trim(substring(fullStreet, ''(?i)'' || ws || ''(''
- || name || ''(?:'' || ws || ''*#?'' || ws
- || ''*(?:[0-9][0-9a-zA-Z\-]*)?)'' || ws || ''?|$)'')) as value
- FROM secondary_unit_lookup
- WHERE texticregexeq(fullStreet, ''(?i)'' || ws || name || ''(''
- || ws || ''|$)'') LOOP
- IF tempInt < position(rec.value in fullStreet) THEN
- tempInt := position(rec.value in fullStreet);
- internal := rec.value;
- END IF;
- END LOOP;
- END IF;
-
- IF verbose THEN
- RAISE NOTICE ''normalize_address() - internal: "%"'', internal;
- END IF;
-
- IF location IS NULL THEN
- -- If the internal address is given, the location is everything after it.
- location := substring(fullStreet, internal || ws || ''+(.*)$'');
- END IF;
-
- -- Pull potential street types from the full street information
- SELECT INTO tempInt count(*) FROM street_type_lookup
- WHERE texticregexeq(fullStreet, ''(?i)'' || ws || ''('' || name
- || '')(?:'' || ws || ''|$)'');
- IF tempInt = 1 THEN
- SELECT INTO rec abbrev, substring(fullStreet, ''(?i)'' || ws || ''(''
- || name || '')(?:'' || ws || ''|$)'') AS given FROM street_type_lookup
- WHERE texticregexeq(fullStreet, ''(?i)'' || ws || ''('' || name
- || '')(?:'' || ws || ''|$)'');
- streetType := rec.given;
- streetTypeAbbrev := rec.abbrev;
- ELSIF tempInt > 1 THEN
- tempInt := 0;
- FOR rec IN SELECT abbrev, substring(fullStreet, ''(?i)'' || ws || ''(''
- || name || '')(?:'' || ws || ''|$)'') AS given FROM street_type_lookup
- WHERE texticregexeq(fullStreet, ''(?i)'' || ws || ''('' || name
- || '')(?:'' || ws || ''|$)'') LOOP
- -- If we have found an internal address, make sure the type
- -- precedes it.
- IF internal IS NOT NULL THEN
- IF position(rec.given IN fullStreet) <
- position(internal IN fullStreet) THEN
- IF tempInt < position(rec.given IN fullStreet) THEN
- streetType := rec.given;
- streetTypeAbbrev := rec.abbrev;
- tempInt := position(rec.given IN fullStreet);
- END IF;
- END IF;
- ELSIF tempInt < position(rec.given IN fullStreet) THEN
- streetType := rec.given;
- streetTypeAbbrev := rec.abbrev;
- tempInt := position(rec.given IN fullStreet);
- END IF;
- END LOOP;
- END IF;
- IF verbose THEN
- RAISE NOTICE ''normalize_address() - street Type: "%"'', streetType;
- END IF;
-
- -- There is a little more processing required now. If the word after the
- -- street type begins with a number, the street type should be considered
- -- part of the name, as well as the next word. eg, State Route 225a. If
- -- the next word starts with a char, then everything after the street type
- -- will be considered location. If there is no street type, then Im sad.
- IF streetType IS NOT NULL THEN
- tempString := substring(fullStreet, streetType || ws ||
- ''+([0-9][^ ,\.\t\r\n\f]*?)'' || ws);
- IF tempString IS NOT NULL THEN
- IF location IS NULL THEN
- location := substring(fullStreet, streetType || ws || ''+''
- || tempString || ws || ''+(.*)$'');
- END IF;
- reducedStreet := substring(fullStreet, ''(.*)'' || ws || ''+''
- || location || ''$'');
- streetType := NULL;
- streetTypeAbbrev := NULL;
- ELSE
- IF location IS NULL THEN
- location := substring(fullStreet, streetType || ws || ''+(.*)$'');
- END IF;
- reducedStreet := substring(fullStreet, ''^(.*)'' || ws || ''+''
- || streetType);
- END IF;
-
- -- The pre direction should be at the beginning of the fullStreet string.
- -- The post direction should be at the beginning of the location string
- -- if there is no internal address
- SELECT INTO tempString substring(reducedStreet, ''(?i)(^'' || name
- || '')'' || ws) FROM direction_lookup WHERE
- texticregexeq(reducedStreet, ''(?i)(^'' || name || '')'' || ws)
- ORDER BY length(name) DESC;
- IF tempString IS NOT NULL THEN
- preDir := tempString;
- SELECT INTO preDirAbbrev abbrev FROM direction_lookup
- where texticregexeq(reducedStreet, ''(?i)(^'' || name || '')'' || ws)
- ORDER BY length(name) DESC;
- streetName := substring(reducedStreet, ''^'' || preDir || ws || ''(.*)'');
- ELSE
- streetName := reducedStreet;
- END IF;
-
- IF texticregexeq(location, ''(?i)'' || internal || ''$'') THEN
- -- If the internal address is at the end of the location, then no
- -- location was given. We still need to look for post direction.
- SELECT INTO rec abbrev,
- substring(location, ''(?i)^('' || name || '')'' || ws) as value
- FROM direction_lookup WHERE texticregexeq(location, ''(?i)^''
- || name || ws) ORDER BY length(name) desc;
- IF rec.value IS NOT NULL THEN
- postDir := rec.value;
- postDirAbbrev := rec.abbrev;
- END IF;
- location := null;
- ELSIF internal IS NULL THEN
- -- If no location is given, the location string will be the post direction
- SELECT INTO tempInt count(*) FROM direction_lookup WHERE
- upper(location) = upper(name);
- IF tempInt != 0 THEN
- postDir := location;
- SELECT INTO postDirAbbrev abbrev FROM direction_lookup WHERE
- upper(postDir) = upper(name);
- location := NULL;
- ELSE
- -- postDirection is not equal location, but may be contained in it.
- SELECT INTO tempString substring(location, ''(?i)(^'' || name
- || '')'' || ws) FROM direction_lookup WHERE
- texticregexeq(location, ''(?i)(^'' || name || '')'' || ws)
- ORDER BY length(name) desc;
- IF tempString IS NOT NULL THEN
- postDir := tempString;
- SELECT INTO postDirAbbrev abbrev FROM direction_lookup
- where texticregexeq(location, ''(?i)(^'' || name || '')'' || ws)
- ORDER BY length(name) desc;
- location := substring(location, ''^'' || postDir || ws || ''+(.*)'');
- END IF;
- END IF;
- ELSE
- -- internal is not null, but is not at the end of the location string
- -- look for post direction before the internal address
- SELECT INTO tempString substring(fullStreet, ''(?i)'' || streetType
- || ws || ''+('' || name || '')'' || ws || ''+'' || internal)
- FROM direction_lookup WHERE texticregexeq(fullStreet, ''(?i)''
- || ws || name || ws || ''+'' || internal) ORDER BY length(name) desc;
- IF tempString IS NOT NULL THEN
- postDir := tempString;
- SELECT INTO postDirAbbrev abbrev FROM direction_lookup
- WHERE texticregexeq(fullStreet, ''(?i)'' || ws || name || ws);
- END IF;
- END IF;
- ELSE
- -- No street type was found
-
- -- If an internal address was given, then the split becomes easy, and the
- -- street name is everything before it, without directions.
- IF internal IS NOT NULL THEN
- reducedStreet := substring(fullStreet, ''(?i)^(.*?)'' || ws || ''+''
- || internal);
- SELECT INTO tempInt count(*) FROM direction_lookup WHERE
- texticregexeq(reducedStreet, ''(?i)'' || ws || name || ''$'');
- IF tempInt > 0 THEN
- SELECT INTO postDir substring(reducedStreet, ''(?i)'' || ws || ''(''
- || name || '')'' || ''$'') FROM direction_lookup
- WHERE texticregexeq(reducedStreet, ''(?i)'' || ws || name || ''$'');
- SELECT INTO postDirAbbrev abbrev FROM direction_lookup
- WHERE texticregexeq(reducedStreet, ''(?i)'' || ws || name || ''$'');
- END IF;
- SELECT INTO tempString substring(reducedStreet, ''(?i)^('' || name
- || '')'' || ws) FROM direction_lookup WHERE
- texticregexeq(reducedStreet, ''(?i)^('' || name || '')'' || ws)
- ORDER BY length(name) DESC;
- IF tempString IS NOT NULL THEN
- preDir := tempString;
- SELECT INTO preDirAbbrev abbrev FROM direction_lookup WHERE
- texticregexeq(reducedStreet, ''(?i)(^'' || name || '')'' || ws)
- ORDER BY length(name) DESC;
- streetName := substring(reducedStreet, ''(?i)^'' || preDir || ws
- || ''+(.*?)(?:'' || ws || ''+'' || cull_null(postDir) || ''|$)'');
- ELSE
- streetName := substring(reducedStreet, ''(?i)^(.*?)(?:'' || ws
- || ''+'' || cull_null(postDir) || ''|$)'');
- END IF;
- ELSE
-
- -- If a post direction is given, then the location is everything after,
- -- the street name is everything before, less any pre direction.
- SELECT INTO tempInt count(*) FROM direction_lookup
- WHERE texticregexeq(fullStreet, ''(?i)'' || ws || name || ''(?:''
- || ws || ''|$)'');
-
- IF tempInt = 1 THEN
- -- A single postDir candidate was found. This makes it easier.
- SELECT INTO postDir substring(fullStreet, ''(?i)'' || ws || ''(''
- || name || '')(?:'' || ws || ''|$)'') FROM direction_lookup WHERE
- texticregexeq(fullStreet, ''(?i)'' || ws || name || ''(?:''
- || ws || ''|$)'');
- SELECT INTO postDirAbbrev abbrev FROM direction_lookup
- WHERE texticregexeq(fullStreet, ''(?i)'' || ws || name
- || ''(?:'' || ws || ''|$)'');
- IF location IS NULL THEN
- location := substring(fullStreet, ''(?i)'' || ws || postDir
- || ws || ''+(.*?)$'');
- END IF;
- reducedStreet := substring(fullStreet, ''^(.*?)'' || ws || ''+''
- || postDir);
- SELECT INTO tempString substring(reducedStreet, ''(?i)(^'' || name
- || '')'' || ws) FROM direction_lookup WHERE
- texticregexeq(reducedStreet, ''(?i)(^'' || name || '')'' || ws)
- ORDER BY length(name) DESC;
- IF tempString IS NOT NULL THEN
- preDir := tempString;
- SELECT INTO preDirAbbrev abbrev FROM direction_lookup WHERE
- texticregexeq(reducedStreet, ''(?i)(^'' || name || '')'' || ws)
- ORDER BY length(name) DESC;
- streetName := substring(reducedStreet, ''^'' || preDir || ws
- || ''+(.*)'');
- ELSE
- streetName := reducedStreet;
- END IF;
- ELSIF tempInt > 1 THEN
- -- Multiple postDir candidates were found. We need to find the last
- -- incident of a direction, but avoid getting the last word from
- -- a two word direction. eg extracting "East" from "North East"
- -- We do this by sorting by length, and taking the last direction
- -- in the results that is not included in an earlier one.
- -- This wont be a problem if preDir is North East and postDir is
- -- East as the regex requires a space before the direction. Only
- -- the East will return from the preDir.
- tempInt := 0;
- FOR rec IN SELECT abbrev, substring(fullStreet, ''(?i)'' || ws || ''(''
- || name || '')(?:'' || ws || ''|$)'') AS value
- FROM direction_lookup
- WHERE texticregexeq(fullStreet, ''(?i)'' || ws || name
- || ''(?:'' || ws || ''|$)'')
- ORDER BY length(name) desc LOOP
- tempInt := 0;
- IF tempInt < position(rec.value in fullStreet) THEN
- IF postDir IS NULL THEN
- tempInt := position(rec.value in fullStreet);
- postDir := rec.value;
- postDirAbbrev := rec.abbrev;
- ELSIF NOT texticregexeq(postDir, ''(?i)'' || rec.value) THEN
- tempInt := position(rec.value in fullStreet);
- postDir := rec.value;
- postDirAbbrev := rec.abbrev;
- END IF;
- END IF;
- END LOOP;
- IF location IS NULL THEN
- location := substring(fullStreet, ''(?i)'' || ws || postDir || ws
- || ''+(.*?)$'');
- END IF;
- reducedStreet := substring(fullStreet, ''(?i)^(.*?)'' || ws || ''+''
- || postDir);
- SELECT INTO tempString substring(reducedStreet, ''(?i)(^'' || name
- || '')'' || ws) FROM direction_lookup WHERE
- texticregexeq(reducedStreet, ''(?i)(^'' || name || '')'' || ws)
- ORDER BY length(name) DESC;
- IF tempString IS NOT NULL THEN
- preDir := tempString;
- SELECT INTO preDirAbbrev abbrev FROM direction_lookup WHERE
- texticregexeq(reducedStreet, ''(?i)(^'' || name || '')'' || ws)
- ORDER BY length(name) DESC;
- streetName := substring(reducedStreet, ''^'' || preDir || ws
- || ''+(.*)'');
- ELSE
- streetName := reducedStreet;
- END IF;
- ELSE
-
- -- There is no street type, directional suffix or internal address
- -- to allow distinction between street name and location.
- IF location IS NULL THEN
- location := location_extract(fullStreet, stateAbbrev);
- END IF;
- -- Check for a direction prefix.
- SELECT INTO tempString substring(fullStreet, ''(?i)(^'' || name
- || '')'' || ws) FROM direction_lookup WHERE
- texticregexeq(fullStreet, ''(?i)(^'' || name || '')'' || ws)
- ORDER BY length(name) DESC;
- RAISE NOTICE ''DEBUG 1'';
- IF tempString IS NOT NULL THEN
- preDir := tempString;
- SELECT INTO preDirAbbrev abbrev FROM direction_lookup WHERE
- texticregexeq(fullStreet, ''(?i)(^'' || name || '')'' || ws)
- ORDER BY length(name) DESC;
- IF location IS NOT NULL THEN
- -- The location may still be in the fullStreet, or may
- -- have been removed already
- streetName := substring(fullStreet, ''^'' || preDir || ws
- || ''+(.*?)('' || ws || ''+'' || location || ''|$)'');
- RAISE NOTICE ''DEBUG 2.1 "%", "%"'', streetName, fullStreet;
- ELSE
- streetName := substring(fullStreet, ''^'' || preDir || ws
- || ''+(.*?)'' || ws || ''*'');
- END IF;
- ELSE
- IF location IS NOT NULL THEN
- -- The location may still be in the fullStreet, or may
- -- have been removed already
- streetName := substring(fullStreet, ''^(.*?)('' || ws
- || ''+'' || location || ''|$)'');
- RAISE NOTICE ''DEBUG 2.2 "%", "%"'', streetName, fullStreet;
- ELSE
- streetName := fullStreet;
- END IF;
- END IF;
- END IF;
- END IF;
- END IF;
-
-
-
- RAISE NOTICE ''normalize_address() - final internal "%"'', internal;
- RAISE NOTICE ''normalize_address() - prefix_direction "%"'', preDir;
- RAISE NOTICE ''normalize_address() - street_type "%"'', streetType;
- RAISE NOTICE ''normalize_address() - suffix_direction "%"'', postDir;
- RAISE NOTICE ''normalize_address() - state "%"'', state;
-
- -- This is useful for scripted checking. It returns what was entered
- -- for each field, rather than what should be used by the geocoder.
- --result := cull_null(internal) || '':'' || cull_null(address) || '':''
- --|| cull_null(preDir) || '':'' || cull_null(streetName) || '':''
- --|| cull_null(streetType) || '':'' || cull_null(postDir)
- --|| '':'' || cull_null(location) || '':'' || cull_null(state) || '':''
- --|| cull_null(zip);
-
- -- This is the standardized return.
- result := cull_null(address) || '':'' || cull_null(preDirAbbrev) || '':''
- || cull_null(streetName) || '':'' || cull_null(streetTypeAbbrev) || '':''
- || cull_null(postDirAbbrev) || '':'' || cull_null(location) || '':''
- || cull_null(stateAbbrev) || '':'' || cull_null(zip);
- return result;
-END
-' LANGUAGE plpgsql;
-
-
-
--- rate_attributes(dirpA, dirpB, streetNameA, streetNameB, streetTypeA,
--- streetTypeB, dirsA, dirsB, locationA, locationB)
--- Rates the street based on the given attributes. The locations must be
--- non-null. The other eight values are handled by the other rate_attributes
--- function, so its requirements must also be met.
-CREATE OR REPLACE FUNCTION rate_attributes(VARCHAR, VARCHAR, VARCHAR, VARCHAR,
- VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR) RETURNS INTEGER
-AS '
-DECLARE
- result INTEGER := 0;
- locationWeight INTEGER := 14;
-BEGIN
- IF $9 IS NOT NULL AND $10 IS NOT NULL THEN
- result := levenshtein_ignore_case($9, $10);
- ELSE
- RAISE EXCEPTION ''rate_attributes() - Location names cannot be null!'';
- END IF;
- result := result + rate_attributes($1, $2, $3, $4, $5, $6, $7, $8);
- RETURN result;
-END;
-' LANGUAGE plpgsql;
-
--- rate_attributes(dirpA, dirpB, streetNameA, streetNameB, streetTypeA,
--- streetTypeB, dirsA, dirsB)
--- Rates the street based on the given attributes. Only streetNames are
--- required. If any others are null (either A or B) they are treated as
--- empty strings.
-CREATE OR REPLACE FUNCTION rate_attributes(VARCHAR, VARCHAR, VARCHAR, VARCHAR,
- VARCHAR, VARCHAR, VARCHAR, VARCHAR) RETURNS INTEGER
-AS '
-DECLARE
- result INTEGER := 0;
- directionWeight INTEGER := 2;
- nameWeight INTEGER := 10;
- typeWeight INTEGER := 5;
-BEGIN
- result := result + levenshtein_ignore_case(cull_null($1), cull_null($2)) *
- directionWeight;
- IF $3 IS NOT NULL AND $4 IS NOT NULL THEN
- result := result + levenshtein_ignore_case($3, $4) * nameWeight;
- ELSE
- RAISE EXCEPTION ''rate_attributes() - Street names cannot be null!'';
- END IF;
- result := result + levenshtein_ignore_case(cull_null($5), cull_null($6)) *
- typeWeight;
- result := result + levenshtein_ignore_case(cull_null($7), cull_null($8)) *
- directionWeight;
- return result;
-END;
-' LANGUAGE plpgsql;
-
-
-
--- state_extract(addressStringLessZipCode)
--- Extracts the state from end of the given string.
---
--- This function uses the state_lookup table to determine which state
--- the input string is indicating. First, an exact match is pursued,
--- and in the event of failure, a word-by-word fuzzy match is attempted.
---
--- The result is the state as given in the input string, and the approved
--- state abbreviation, separated by a colon.
-CREATE OR REPLACE FUNCTION state_extract(VARCHAR) RETURNS VARCHAR
-AS '
-DECLARE
- tempInt INTEGER;
- tempString VARCHAR;
- rawInput VARCHAR;
- state VARCHAR;
- stateAbbrev VARCHAR;
- result VARCHAR;
- rec RECORD;
- test BOOLEAN;
- ws VARCHAR;
- verbose BOOLEAN := TRUE;
-BEGIN
- IF verbose THEN
- RAISE NOTICE ''state_extract()'';
- END IF;
- IF $1 IS NULL THEN
- RAISE EXCEPTION ''state_extract() - no input'';
- ELSE
- rawInput := $1;
- END IF;
- ws := ''[ ,\.\t\n\f\r]'';
-
- -- Separate out the last word of the state, and use it to compare to
- -- the state lookup table to determine the entire name, as well as the
- -- abbreviation associated with it. The zip code may or may not have
- -- been found.
- tempString := substring(rawInput from ws || ''+([^ ,\.\t\n\f\r0-9]*?)$'');
- SELECT INTO tempInt count(*) FROM (select distinct abbrev from state_lookup
- WHERE upper(abbrev) = upper(tempString)) as blah;
- IF tempInt = 1 THEN
- state := tempString;
- SELECT INTO stateAbbrev abbrev FROM (select distinct abbrev from
- state_lookup WHERE upper(abbrev) = upper(tempString)) as blah;
- ELSE
- SELECT INTO tempInt count(*) FROM state_lookup WHERE upper(name)
- like upper(''%'' || tempString);
- IF tempInt >= 1 THEN
- FOR rec IN SELECT name from state_lookup WHERE upper(name)
- like upper(''%'' || tempString) LOOP
- SELECT INTO test texticregexeq(rawInput, name) FROM state_lookup
- WHERE rec.name = name;
- IF test THEN
- SELECT INTO stateAbbrev abbrev FROM state_lookup
- WHERE rec.name = name;
- state := substring(rawInput, ''(?i)'' || rec.name);
- EXIT;
- END IF;
- END LOOP;
- ELSE
- -- No direct match for state, so perform fuzzy match.
- SELECT INTO tempInt count(*) FROM state_lookup
- WHERE soundex(tempString) = end_soundex(name);
- IF tempInt >= 1 THEN
- FOR rec IN SELECT name, abbrev FROM state_lookup
- WHERE soundex(tempString) = end_soundex(name) LOOP
- tempInt := count_words(rec.name);
- tempString := get_last_words(rawInput, tempInt);
- test := TRUE;
- FOR i IN 1..tempInt LOOP
- IF soundex(split_part(tempString, '' '', i)) !=
- soundex(split_part(rec.name, '' '', i)) THEN
- test := FALSE;
- END IF;
- END LOOP;
- IF test THEN
- state := tempString;
- stateAbbrev := rec.abbrev;
- EXIT;
- END IF;
- END LOOP;
- END IF;
- END IF;
- END IF;
- IF state IS NOT NULL AND stateAbbrev IS NOT NULL THEN
- result := state || '':'' || stateAbbrev;
- END IF;
- return result;
-END;
-' LANGUAGE plpgsql;
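
The header comment for normalize_address() above documents the standardized
return as eight colon separated fields (StreetAddress through ZipCode). As a
rough sketch of how a caller might unpack that string, the query below uses the
built-in split_part() on a literal. The sample value and the column aliases are
assumptions made up for illustration; they are not output from a real run of
the function and are not part of this commit.

```sql
-- Hypothetical sample of the standardized normalize_address() return value.
-- The string is invented for illustration; field 5 (direction suffix) is empty.
SELECT split_part(r, ':', 1) AS street_address,
       split_part(r, ':', 2) AS direction_prefix_abbrev,
       split_part(r, ':', 3) AS street_name,
       split_part(r, ':', 4) AS street_type_abbrev,
       split_part(r, ':', 5) AS direction_suffix_abbrev,
       split_part(r, ':', 6) AS location,
       split_part(r, ':', 7) AS state_abbrev,
       split_part(r, ':', 8) AS zip_code
FROM (SELECT '529:N:Main:St::Boston:MA:02129'::varchar AS r) AS sample;
```

Because the fields are positional, an input whose street name or location
itself contains a colon would shift the later fields, so a caller relying on
this format would probably want to reject or escape such input first.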