[postgis-users] problem with pgsql2shp and special characters/umlaute

Markus Schaber schabi at logix-tt.com
Wed Jul 5 07:08:26 PDT 2006


Hi, Kathrin,

gis at eifelgeist.com wrote:


> I use PostgreSQL 8.1.3 with PostGIS 1.0.4 and GEOS 2.1.4. The encoding
> of the database is SQL-ASCII.
> 
> Is this the right encoding for the database?

Probably not, as ASCII does not contain any "special characters" such as
umlauts.

Apart from the space and some control characters (such as escape and
carriage return), ASCII contains only the following:

  ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @
  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ `
  a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~
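
As a quick check of the list above, here is a minimal Python sketch (Python
is my choice for illustration, it is not part of the original setup). It
prints the printable ASCII range and shows that the German umlauts fall
outside of it. The variable name printable_ascii is made up for this example.

```python
# Printable ASCII without the space covers code points 0x21 to 0x7E.
printable_ascii = [chr(code) for code in range(0x21, 0x7F)]

print(" ".join(printable_ascii))
print(len(printable_ascii), "printable characters")

# Umlauts and the sharp s have code points above 127, so 7-bit ASCII
# cannot represent them.
for ch in "äöüÄÖÜß":
    print(ch, hex(ord(ch)), ch in printable_ascii)
```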

To be fully flexible, you should use UTF-8 as the database encoding.

UTF-8 is an encoding of the Unicode character set, which was designed
for (and is capable of) storing all the weird and not-so-weird characters
you encounter on this planet.

Another advantage is that, with a UTF-8 database, every client can use the
encoding that fits its needs, and PostgreSQL automatically converts
between the client encoding and UTF-8. (This automatic conversion
currently does not work for arbitrary pairs of encodings, but every
encoding supported by PostgreSQL can be converted to and from UTF-8.)
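
To illustrate the idea of converting between a client encoding and UTF-8,
here is a rough Python sketch. It only uses Python's own codecs as an
analogy and does not talk to PostgreSQL at all. The helper transcode() and
the sample name are assumptions made up for this example.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from one encoding to another via Unicode text."""
    return data.decode(source).encode(target)


# What a UTF-8 database would hold for the name "Müller".
stored = "Müller".encode("utf-8")

# Two clients asking for the same value in different encodings.
for_latin1_client = transcode(stored, "utf-8", "latin-1")
for_latin9_client = transcode(stored, "utf-8", "iso8859-15")

print(stored)             # the u-umlaut takes two bytes in UTF-8
print(for_latin1_client)  # and a single byte in Latin-1

# A value sent by a Latin-1 client converts back to the same UTF-8 bytes.
round_trip = transcode(for_latin1_client, "latin-1", "utf-8")
print(round_trip == stored)
```

In a real session the client would simply declare its encoding and let the
server do this step.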

Usually, PostgreSQL reports an error when asked to convert a character
that cannot be represented in the target encoding. With SQL_ASCII,
however, this check is skipped for legacy / compatibility reasons, and
PostgreSQL simply copies all 8-bit characters unchanged. Your example
looks as if shp2pgsql sent the names to the database in UTF-8, and your
psql used Latin-1 or Latin-9 to receive them.
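
The effect can be reproduced outside the database. The following Python
sketch is only a simulation under my assumptions about your setup (UTF-8 on
the loading side, Latin-1 on the reading side); the name "Müller" is a
made-up sample, not taken from your data.

```python
name = "Müller"

# A strict conversion refuses characters the target cannot represent.
try:
    name.encode("ascii")
except UnicodeEncodeError as exc:
    error = str(exc)
    print("conversion refused:", error)

# Simulated pass-through storage: the bytes are kept exactly as sent,
# with no validation and no conversion.
sent_by_loader = name.encode("utf-8")
stored = sent_by_loader

# A client that assumes Latin-1 decodes each byte as one character,
# so the two-byte umlaut turns into two wrong characters.
seen_by_client = stored.decode("latin-1")
print(seen_by_client)
```

If your garbled names show two odd characters wherever an umlaut should
be, that matches this pattern.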

HTH,
Markus


-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org


