[postgis-users] Re: [Plr-general] Tutorial on PLR and PostGIS, more on carriage returns

Paul Ramsey pramsey at refractions.net
Thu Jun 21 14:50:02 PDT 2007


Steve,

You're right and I'm wrong. I was confused by the Unicode code point
numbers, which differ from the actual byte encodings used for UTF8.
Indeed, every byte of a multi-byte sequence falls in 128-255 in the
UTF8 encoding, so a straight byte replacement would work (for UTF8 and
the various one-byte latin code pages, that is).
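
For what it's worth, here is a rough sketch in C of the kind of
replacement being discussed. It is not Joe's actual PL/R change; the
function name replace_cr_with_lf is made up for illustration, and it
assumes the function source is UTF8 or a one-byte latin code page.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not taken from the PL/R source): replace every
 * carriage return (0x0D) in buf with a line feed (0x0A), in place.
 * Returns the number of bytes changed.
 *
 * Assumes buf holds UTF-8 or a single-byte Latin code page. In UTF-8,
 * every byte of a multi-byte sequence has its high bit set (0x80-0xFF),
 * so a byte equal to 0x0D can only ever be a real carriage return.
 */
static size_t
replace_cr_with_lf(char *buf, size_t len)
{
    size_t changed = 0;
    size_t i;

    assert(buf != NULL || len == 0);

    for (i = 0; i < len; i++)
    {
        if (buf[i] == '\r')
        {
            buf[i] = '\n';
            changed++;
        }
    }
    return changed;
}
```

Note that a Windows-style \r\n pair comes out as \n\n under this
approach, i.e. an extra empty line. That should be harmless to the R
parser, but it would shift line numbers in any error messages.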

Paul

On 21-Jun-07, at 10:30 AM, Stephen Woodbridge wrote:

> Hmmmm, I am probably wrong on this but I thought 0x0 - 0x7f are
> standard UTF8 characters with a constant meaning that is the same
> as ascii for those bytes, and that all multi-byte characters had to
> have the high-order bit set to indicate it was part of a multibyte
> sequence.
>
> I was not under the impression that you could have 0x0 - 0x7f as
> a part of a multi-byte sequence. I am not an expert in this area
> and probably just know enough to mislead you ;) but I think it is
> worthwhile getting some additional insight into this. I for one
> would like to see a multi-byte UTF8 sequence with \r embedded in it.
>
> -Steve
>
>
> Paul Ramsey wrote:
>> Danger, Will Robinson.  All values are fair game in bytes 2,3,4 of
>> the UTF encodings, so yes, it's possible you'll wreck multi-byte  
>> characters by doing a simple replacement on the byte array.   
>> Better to use an encoding-aware string replace function (not  
>> knowing C, I don't know what that would be, but there must be some  
>> in the PgSQL code base).
>> P
>> On 21-Jun-07, at 7:03 AM, Joe Conway wrote:
>>> Obe, Regina wrote:
>>>> Joe,
>>>>  Can you take a look at it again?  It was messed up in my
>>>> Firefox too.  I think originally I had it looking right in
>>>> Firefox, but then in IE it didn't look right so I changed it to
>>>> look right in IE, but forgot to check back in Firefox.
>>>> Hopefully this time I have made all browser masters happy.
>>>
>>> http://www.bostongis.com/PrinterFriendly.aspx?content_name=postgresql_plr_tut02
>>> The tutorial looks perfect now in Firefox on Fedora Core 7.
>>>
>>> BTW, I have confirmed on the R-devel list that the R engine is  
>>> expecting \n for EOL, and \r will cause a syntax error, on all  
>>> platforms. I will probably fix this by simply replacing \r with  
>>> \n in PL/R functions. My only reservation is whether this might  
>>> cause issues for installations with multibyte characters. Does  
>>> anyone know if it is possible for multibyte characters to include  
>>> a byte = 13 (\r), i.e. is the simple replacement of \r safe in  
>>> all locales?
>>>
>>> Thanks,
>>>
>>> Joe
>>>
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at postgis.refractions.net
>>> http://postgis.refractions.net/mailman/listinfo/postgis-users



