Special european characters fix

For articles specific to version 5.x
John Snow
Posts: 12
Joined: Tue Feb 23, 2010 8:01 pm

Special european characters fix

Post by John Snow »

Hi guys,

since folks around here helped me a lot with my own problems, I want to contribute. I hope my solution will help someone, so here it is:

Central European languages use a set of characters with diacritics, such as the caron (wedge) above s: "š", or "č", etc. Firefox shows these characters correctly in the address bar, but Internet Explorer (as always) does not. In IE, these characters are URL-escaped, so the address bar fills with percent-encoded byte sequences. There goes your SEO and all your effort.

Needless to say, I needed to fix this ugly issue. So I opened lib/general.php and added the following code after line 134:

Code:

/*	FIX
	Converts Central European characters in SEO URLs to plain ASCII, e.g.: š => s, č => c, etc. Uses a static substitution table.
*/	

static $tbl = array("\xc3\xa1"=>"a","\xc3\xa4"=>"a","\xc4\x8d"=>"c","\xc4\x8f"=>"d","\xc3\xa9"=>"e","\xc4\x9b"=>"e","\xc3\xad"=>"i","\xc4\xbe"=>"l","\xc4\xba"=>"l","\xc5\x88"=>"n","\xc3\xb3"=>"o","\xc3\xb6"=>"o","\xc5\x91"=>"o","\xc3\xb4"=>"o","\xc5\x99"=>"r","\xc5\x95"=>"r","\xc5\xa1"=>"s","\xc5\xa5"=>"t","\xc3\xba"=>"u","\xc5\xaf"=>"u","\xc3\xbc"=>"u","\xc5\xb1"=>"u","\xc3\xbd"=>"y","\xc5\xbe"=>"z","\xc3\x81"=>"A","\xc3\x84"=>"A","\xc4\x8c"=>"C","\xc4\x8e"=>"D","\xc3\x89"=>"E","\xc4\x9a"=>"E","\xc3\x8d"=>"I","\xc4\xbd"=>"L","\xc4\xb9"=>"L","\xc5\x87"=>"N","\xc3\x93"=>"O","\xc3\x96"=>"O","\xc5\x90"=>"O","\xc3\x94"=>"O","\xc5\x98"=>"R","\xc5\x94"=>"R","\xc5\xa0"=>"S","\xc5\xa4"=>"T","\xc3\x9a"=>"U","\xc5\xae"=>"U","\xc3\x9c"=>"U","\xc5\xb0"=>"U","\xc3\x9d"=>"Y","\xc5\xbd"=>"Z");
$val = strtr($val, $tbl);
$val = strtolower($val);
/*
	END OF FIX
*/
After a quick CTRL-F5 in my browser, everything worked and the SEO addresses came out clean. There is one possible drawback, however:
The function MakeURLNormal($val) appears to do the same conversion in the reverse direction. I did not alter it, because I did not notice any error or problem that would require it. If someone thinks MakeURLNormal needs to be adjusted as well, feel free to discuss or post your working code here.
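
For anyone who wants to try the substitution outside the cart first, here is a minimal standalone sketch. The function name seo_transliterate() is made up for illustration and is not part of lib/general.php, and the table is shortened to six entries. It also shows why the reverse direction is tricky: "š" and "s" end up as the same output, so the original spelling cannot be recovered from the URL alone.

```php
<?php
// Hypothetical standalone helper; seo_transliterate() is an illustrative name,
// not a function from the cart's code.
function seo_transliterate(string $val): string
{
    // Shortened table for illustration; the full table from the fix works the same way.
    static $tbl = array(
        "\xc5\xa1" => "s", // š
        "\xc4\x8d" => "c", // č
        "\xc5\xbe" => "z", // ž
        "\xc5\xa0" => "S", // Š
        "\xc4\x8c" => "C", // Č
        "\xc5\xbd" => "Z", // Ž
    );
    // strtr() with an array replaces every key with its value in a single pass.
    $val = strtr($val, $tbl);
    // Lowercase the result, as the original fix does.
    return strtolower($val);
}

echo seo_transliterate("\xc5\xa0koda \xc4\x8caj"), "\n"; // prints "skoda caj"

// The mapping is lossy: "š" and "s" produce the same output.
var_dump(seo_transliterate("\xc5\xa1") === seo_transliterate("s")); // bool(true)
```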

Spread the word and help others fight such issues. Long live community help ;)
If you are interested in my posts and want a reply, please DO USE private messages. I do not narcissistically check my posts.
Martin
Site Admin
Posts: 1854
Joined: Wed Jun 17, 2009 6:30 pm
Location: South Yorkshire UK

Re: Special european characters fix

Post by Martin »

Always good to see a little pay-it-forward so thanks for that...

I haven't checked it over but it sounds like it could be useful to those with foreign products or customer bases... Cheers.. :)
CharlieFoxtrot
Confirmed
Posts: 413
Joined: Sun Aug 09, 2009 1:23 pm

Re: Special european characters fix

Post by CharlieFoxtrot »

Wow! Nice work!
ISC 4.0.7

"... and let's be honest that whole "by design" thing is getting old too."
John Snow
Posts: 12
Joined: Tue Feb 23, 2010 8:01 pm

Re: Special european characters fix

Post by John Snow »

Glad to be of help, guys ;)
meules
Confirmed
Posts: 95
Joined: Wed Jun 17, 2009 8:56 pm
Location: NL

Re: Special european characters fix

Post by meules »

Exactly what I need... thx ;)
ISC v6
Tony Barnes
Posts: 744
Joined: Thu Jun 18, 2009 8:59 am

Re: Special european characters fix

Post by Tony Barnes »

Can this be adapted to get rid of other non-letters such as hyphens and apostrophes as well? Instead of getting all that %252D crap..???
Martin
Site Admin
Posts: 1854
Joined: Wed Jun 17, 2009 6:30 pm
Location: South Yorkshire UK

Re: Special european characters fix

Post by Martin »

Tony Barnes wrote: Can this be adapted to get rid of other non-letters such as hyphens and apostrophes as well? Instead of getting all that %252D crap..???
If you want to get rid of "that crap" you need to come up with a different character substitution for dashes, etc... If you don't you end up with broken URLs because the SEO code uses dashes as a sub for spaces...

Personally I'd probably go with underscores to sub for spaces, then leave dashes alone but I have absolutely no idea what that does for SEO...
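
A rough sketch of that underscore idea, purely as an illustration. The helper names url_encode_name() and url_decode_name() are hypothetical and are not the cart's actual SEO functions. Spaces become underscores, literal dashes stay as they are, and the mapping can be reversed:

```php
<?php
// Hypothetical helpers; these names are illustrative, not the cart's own functions.
function url_encode_name(string $name): string
{
    // Spaces become underscores; literal dashes are left untouched.
    return str_replace(' ', '_', $name);
}

function url_decode_name(string $slug): string
{
    // Reverse mapping: underscores back to spaces.
    return str_replace('_', ' ', $slug);
}

$name = 'Great product - with free product';
$slug = url_encode_name($name);
echo $slug, "\n"; // prints "Great_product_-_with_free_product"
var_dump(url_decode_name($slug) === $name); // bool(true)
```

The round trip only holds as long as product names contain no literal underscores; a name that already has one would come back with a space in its place.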
Tony Barnes
Posts: 744
Joined: Thu Jun 18, 2009 8:59 am

Re: Special european characters fix

Post by Tony Barnes »

I'd predominantly just want to get rid of hyphens, as they are the big headache for us - e.g.
Great product - with free product
becomes -
if it simply ignored them, or even replaced them with a hyphen (!!), then it would be:
which makes a lot more sense!!
Martin
Site Admin
Posts: 1854
Joined: Wed Jun 17, 2009 6:30 pm
Location: South Yorkshire UK

Re: Special european characters fix

Post by Martin »

Tony Barnes wrote: I'd predominantly just want to get rid of hyphens, as they are the big headache for us - e.g.
Great product - with free product
becomes -
if it simply ignored them, or even replaced them with a hyphen (!!), then it would be:
which makes a lot more sense!!
True... might be possible but the problem is how would the system know you weren't listing:
  • Great product - with free product
  • Great product with free product
  • Great product --with free product
  • Great product-- with free product

Thinking about it logically, what the code should have is an SEO-translated field in the products table, so that the SEO URL isn't translated on the fly but stored with the product record and used directly. That would allow direct translation and the sort of format you're after.

BUT, adding that in now would require a lot more sanity checking to ensure you didn't create duplicates...
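
To make the duplicate problem concrete, here is a minimal sketch under stated assumptions. make_slug() and unique_slug() are hypothetical names, and the $taken array stands in for a lookup against a stored slug column (no such column exists in the stock products table as far as this thread says). All four product names from the list above collapse to the same base slug, so a stored slug needs a suffix or some other tie-breaker:

```php
<?php
// Hypothetical sketch: make_slug() and unique_slug() are illustrative names,
// and $taken stands in for a lookup against a stored slug column.
function make_slug(string $name): string
{
    $slug = strtolower($name);
    // Collapse every run of non-alphanumeric characters into a single hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    return trim($slug, '-');
}

function unique_slug(string $name, array &$taken): string
{
    $base = make_slug($name);
    $slug = $base;
    // Append -2, -3, ... until the slug is not already in use.
    for ($n = 2; isset($taken[$slug]); $n++) {
        $slug = $base . '-' . $n;
    }
    $taken[$slug] = true;
    return $slug;
}

$taken = array();
$names = array(
    'Great product - with free product',
    'Great product with free product',
    'Great product --with free product',
    'Great product-- with free product',
);
foreach ($names as $name) {
    echo unique_slug($name, $taken), "\n";
}
// prints:
// great-product-with-free-product
// great-product-with-free-product-2
// great-product-with-free-product-3
// great-product-with-free-product-4
```

Because the slug is stored rather than recomputed, the cart would look a product up by the stored value and never needs to reverse the translation.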
Tony Barnes
Posts: 744
Joined: Thu Jun 18, 2009 8:59 am

Re: Special european characters fix

Post by Tony Barnes »

Hmmm, I guess - but it's still a PITA!

As a hack there aren't going to be many products that are "that" similar, so would be an ok workaround I reckon - or are you saying that it would need to be unique to work "backwards" (i.e. cart interpreting an entered URL)?