Poster: billmoyer Date: Mar 22, 2006 1:39am
Forum: netlabels Subject: Re: Reviews ? [odd character being displayed]

Hello, LAJ!

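My name is Bill Moyer, and I am an engineer at The Archive. Those odd characters are Latin-1 encodings of "smart quotes" (strictly speaking Windows-1252, Microsoft's extension of Latin-1, which is usually just labeled Latin-1), as described here: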

The baseline computer character set (ASCII) only has three kinds of quotation-like symbols:
single-quote ['] (which doubles as an apostrophe),
double-quote ["],
backtick [`].

Over the years, the baseline set has been expanded in different (incompatible) ways by different standards organizations, so that symbols like "smart quotes" can be represented. For (lame) reasons I won't get into right now, The Archive decided to standardize its website code on the UTF8 character set, while most browsers and word processors generate web documents based on the Latin-1 character set.

What this means is that when a user views a review, our servers tell their browser "Expect UTF8 character encodings in this document", and when that document contains Latin-1 encoded characters, the browser doesn't know what to do, and does some weird implementation-dependent thing. For instance, my browser here at home shows me a slanted-A character followed by a dotted outline of a box. One of my browsers at work shows me two small boxes with numbers in them.
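To make the mismatch concrete, here is a small Python sketch. It is illustrative only: the name `decode_as_utf8` is made up for this example, and Python's `cp1252` codec stands in for the Latin-1 style encoding that word processors emit. A real browser may render the bad byte differently (boxes, accented letters), as described above.

```python
# A right single "smart quote" (U+2019), the usual curly apostrophe.
smart_apostrophe = "\u2019"

# The same character as bytes in two different encodings.
legacy_bytes = smart_apostrophe.encode("cp1252")  # one byte: 0x92
utf8_bytes = smart_apostrophe.encode("utf-8")     # three bytes: 0xE2 0x80 0x99


def decode_as_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for anything invalid.

    This mimics a reader that was told "expect UTF8" and has to cope
    with bytes that are not valid UTF-8.
    """
    return raw.decode("utf-8", errors="replace")


# The legacy byte is not valid UTF-8 on its own, so it is replaced.
print(repr(decode_as_utf8(legacy_bytes)))
# The UTF-8 bytes decode back to the original curly apostrophe.
print(repr(decode_as_utf8(utf8_bytes)))
```

The single byte 0x92 is never a valid start of a UTF-8 sequence, which is why the reader has nothing sensible to display for it.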

I am making an effort to get our website code to do the right thing, but in the meantime we only have workarounds.

I have software which I can run on our servers which sweeps through users' reviews and item descriptions, finds all Latin-1 encoded characters, and converts them to UTF8 encoded characters. That's one workaround.
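For the curious, a conversion of this kind can be sketched in a few lines of Python. This is not the software mentioned above, just a minimal illustration under stated assumptions: the function name `to_utf8` is hypothetical, each review is treated as a single byte string, and anything that is not already valid UTF-8 is assumed to be Windows-1252 (falling back to plain Latin-1 for the handful of bytes Windows-1252 leaves undefined).

```python
def to_utf8(raw: bytes) -> bytes:
    """Return raw re-encoded as UTF-8 (hypothetical helper, a sketch only).

    Text that is already valid UTF-8 is returned unchanged, so running
    the function twice on the same text does no harm.
    """
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8 (plain ASCII included)
    except UnicodeDecodeError:
        pass

    try:
        # Assumption: legacy text is Windows-1252, which has smart quotes.
        text = raw.decode("cp1252")
    except UnicodeDecodeError:
        # A few byte values are undefined in cp1252; Latin-1 maps every byte.
        text = raw.decode("latin-1")
    return text.encode("utf-8")


legacy_review = b"It\x92s a \x93great\x94 album"
print(to_utf8(legacy_review).decode("utf-8"))
```

One limitation of this sketch: a text that mixes valid UTF-8 sequences with stray legacy bytes is converted as a whole from Windows-1252, which would garble the parts that were already UTF-8. Handling that case needs a character-by-character pass.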

You might be able to work around this in your browser. Some browsers have a configuration setting for generating UTF8 characters. This will not make that review look any different, but if you edited the review and replaced the funny characters with UTF8 quotation marks / apostrophes then all would be right with the world (until someone else posted a review using Latin-1).

The simple solution is to just use ["'`], but that's "so 1980's" to most people, and some browsers will automatically detect paired "'s and "helpfully" convert them to Smart Quotes for you.
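If you would rather go the plain-quote route for text you already have, the substitution is easy to automate. A minimal Python sketch follows; the names `SMART_TO_ASCII` and `plain_quotes` are made up for illustration, and only the four common curly quote characters are handled.

```python
# Map the four common "smart quote" characters to their plain equivalents.
SMART_TO_ASCII = str.maketrans({
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote / apostrophe
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
})


def plain_quotes(text: str) -> str:
    """Replace curly quotes in text with plain ['] and ["]."""
    return text.translate(SMART_TO_ASCII)


print(plain_quotes("\u201cIt\u2019s fine,\u201d she said."))
```

The text must already be decoded correctly for this to work; it does nothing for bytes that were read with the wrong encoding.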

Let me know if you want me to run my de-latinizer software on your reviews (I will be running it on all of The Archive's reviews in a week or two, I hope). Otherwise I leave it up to you. Eventually I hope we will have a permanent solution in place, but I do not know how long that might take -- the engineers responsible for the website are in a different department from mine.

Sorry for the inconvenience,
-- Bill

Poster: LAJ Date: Mar 22, 2006 5:29am
Forum: netlabels Subject: Re: Reviews ? [odd character being displayed]

Thank you very much - I appreciate your help.