Multilingual Web Pages - Symbols & Non-English Character Encodings/References, Unicod


William Davis
07-08-06, 11:18 PM
Our site sells Cuban memorabilia. When we first started, we decided on an English Web site rather than a Spanish one because, in our opinion at the time, most queries were performed in English and search engines were more efficient in that language.

As time progressed, our site grew with more products, making it impossible to describe many items without using Spanish (book titles, authors' names, etc.). So, two years later, we implemented a Spanish version of our Web site that contained the same product line. It became very difficult to maintain two sites, and sales from the English version were significantly higher, so we did away with the Spanish version. Since then we have entered all of our product items in English, with the exception of those items that require a Spanish description.

Because some of our items require a Spanish description, some words require non-English characters. For example, my last name is really spelled Chávez, with an accent mark above the letter a; without it, it is pronounced totally differently. Here is another one: año versus ano. The word with the tilde above the letter n means year; without it, it means an--s (pardon me, I was just trying to make a point about how important this is).

Questions:

1. Which character encoding should we use on multilingual pages? My understanding is that we should either use UTF-8 (the method preferred by W3.org) or use a regional encoding (iso-8859-1, windows-1252) for the main language (English) plus character references for the non-English characters that fall outside that encoding. It is our understanding that this can be implemented either by adding the line “AddType text/html;charset=code .html” (where “code” is the character encoding to be declared) to our .htaccess file on Apache servers, or by adding different code to the global header. Either way, we welcome sample code suggestions in addition to methodology.
2. Which kind of character reference should we use on multilingual pages (hexadecimal, decimal, or named entity)? Note that we use StoreMan, so we should use whichever form is most widely supported by all of them: StoreMan, Miva, browsers, search engines, etc.
3. Does it make any difference which HTML editor we use?
4. Does it make any difference how we save our files?
5. Does it make any difference which FTP program we use to upload our files?
6. Would it be a good idea to have two descriptions on each product page, one in each language, for SEO purposes (first paragraph in English and the second in Spanish)?
7. Finally, what DOCTYPE should we use? I just noticed we are not using one. This is very confusing.
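
To make the encoding and character-reference questions concrete, here is a minimal Python sketch. It is not specific to StoreMan or Miva, and the function names are made up for illustration. It shows the bytes that the same character becomes under UTF-8, ISO-8859-1, and windows-1252, and what its decimal, hexadecimal, and named references look like.

```python
from html.entities import codepoint2name


def encoded_bytes(ch):
    """Return the bytes for one character under three common encodings."""
    return {name: ch.encode(name) for name in ("utf-8", "iso-8859-1", "windows-1252")}


def char_refs(ch):
    """Return the decimal, hexadecimal, and (if one exists) named reference."""
    cp = ord(ch)
    name = codepoint2name.get(cp)
    return {
        "decimal": f"&#{cp};",
        "hex": f"&#x{cp:X};",
        "named": f"&{name};" if name else None,
    }


# The two characters from the examples above: a with acute accent, n with tilde.
for ch in "áñ":
    print(f"U+{ord(ch):04X}", encoded_bytes(ch), char_refs(ch))
```

Both characters are a single byte in ISO-8859-1 and windows-1252 but two bytes in UTF-8, which is why the declared encoding has to match the way the file was actually saved. The three reference forms are plain ASCII, so they read the same under any of these encodings.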

mvmarkus
07-10-06, 12:57 AM
Because some of our items require a Spanish description, some words require non-English characters. For example, my last name is really spelled Chávez, with an accent mark above the letter a; without it, it is pronounced totally differently. Here is another one: año versus ano. The word with the tilde above the letter n means year; without it, it means an--s (pardon me, I was just trying to make a point about how important this is).


Bruce will probably hate me for this, but I am afraid that the issues you have may be caused by Access (Excel does that, too), and not by Apache/Empresa or Merchant. The regular accented characters do not need Unicode, and most language sets/browsers, as far as I remember, do support the standard accents.

To make life easier, did you try converting them into the encoded equivalents, like &uuml; instead of ü? This should work everywhere, AFAIK.
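
As a rough illustration of that kind of conversion (a sketch in Python, not something Merchant or Access does for you; the helper name is hypothetical), the standard library can replace every non-ASCII character with a numeric character reference and turn references back into characters:

```python
import html


def to_numeric_refs(text):
    """Replace each non-ASCII character with a decimal reference such as &#225;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


escaped = to_numeric_refs("Chávez, año")
print(escaped)  # Ch&#225;vez, a&#241;o

# html.unescape reverses both numeric and named references.
print(html.unescape(escaped) == "Chávez, año")
```

The escaped text is pure ASCII, so it should survive a database export or an upload regardless of which encoding the page declares.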

Markus

William Davis
07-10-06, 01:30 AM
Actually, the method we used back then was Alt codes. For example, á (lowercase a with an accent mark) was Alt-160.
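
For what it is worth, the 160 in that Alt code is the character's position in the old DOS code page (assumed here to be code page 437), not in windows-1252 or ISO-8859-1, where á is 225 and 160 is a non-breaking space. A small Python check, as an illustration only:

```python
# Code 160 in DOS code page 437, the table assumed here for Alt codes
# typed without a leading zero.
dos_char = bytes([160]).decode("cp437")

# The same character's byte value in windows-1252.
ansi_byte = dos_char.encode("cp1252")[0]

# Byte 160 read as ISO-8859-1 is a non-breaking space, not an accented letter.
latin1_char = bytes([160]).decode("iso-8859-1")

print(f"U+{ord(dos_char):04X}", ansi_byte, f"U+{ord(latin1_char):04X}")
# U+00E1 225 U+00A0
```

So the number typed on the keypad is not necessarily the number that ends up in the file or in a numeric character reference; for á the reference is &#225;, not &#160;.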

We also need to declare a document type. Do you know whether HTML Transitional or XHTML Transitional is best on Miva pages?

mvmarkus
07-10-06, 01:53 AM
Actually, the method we used back then was Alt codes. For example, á (lowercase a with an accent mark) was Alt-160.

We also need to declare a document type. Do you know whether HTML Transitional or XHTML Transitional is best on Miva pages?

That certainly depends on the code of your pages. Most of the time I use XHTML Transitional, or even "Strict", and even when it wasn't really strict, the web police haven't fined me yet.

Markus

(P.S: Do you have a name?)

Bruce - PhosphorMedia
07-10-06, 05:47 AM
nah...we hate Access just as much as any sane person...however, you have to run with what's available<G>.

William Davis
07-10-06, 08:35 PM
My name? Does it not appear at the bottom of my posts? Jerry Chavez with CubaCollectibles.com

Bruce, I do not understand what you meant, but I am glad you joined this thread. What do you think would be the best route to take in our particular case (refer to my initial post)?

1. DOCTYPE: HTML or XHTML, and Transitional or Strict?
2. Encoding (which one) and character reference (hex, etc.) that will work with StoreMan Pro.