Japanese char set (OpenInsight)
At 17 OCT 2001 09:01:28PM Jim Vaughan wrote:
Does the new 32-bit stuff have any support for the Japanese character set?
If so would it support this character set in the menus, forms and data?
At 18 OCT 2001 07:28AM Mike Ruane wrote:
Jim-
We're looking into it, as well as Chinese. One of the problems is that we don't speak Chinese or Japanese, and we expect trouble installing those versions of Windows.
Mike
At 18 OCT 2001 01:32PM j Vaughan wrote:
What kind of time frame are we looking at?
At 19 OCT 2001 10:47PM Jim Vaughan wrote:
I know it's hard to guess how long something like this might take, but … I need to know. We have a customer in Japan that would like to buy but needs the Japanese char set.
Give me a best case and a worst case. If you think it can be done, it will take from…. to….
Thanks.
At 22 OCT 2001 07:18AM Mike Ruane wrote:
Jim-
I have a new machine I can test it on, and someone who can help me get it installed. I should have some more details by next week.
Mike
At 22 OCT 2001 04:22PM j Vaughan wrote:
You guys are great.
I look forward to hearing how it goes.
At 29 OCT 2001 01:05PM Jim Vaughan wrote:
I just heard from my customer, they are meeting next week.
Would it be possible to know if this is going to be available by then?
At 29 OCT 2001 02:26PM Mike Ruane wrote:
Jim-
We're formatting the machine today.
Mike
At 29 OCT 2001 03:29PM Jim Vaughan wrote:
Great, keep me updated.
At 29 OCT 2001 04:25PM Steve Epstein wrote:
Dear Jim and Mike,
I have asked the same question.
I actually have a Japanese WIN2000 machine from our clients in Japan. Any testing I can do would be appreciated. I have the fonts, et al.
Steve
At 29 OCT 2001 05:24PM Mike Ruane wrote:
Guys-
Thanks-
First blush seems to be a no, as we need Unicode, which would destroy our data since we make heavy use of ASCII 251 to 255 as our system delimiters.
Mike
At 30 OCT 2001 10:27AM Jim Vaughan wrote:
So what does that mean? Do you have any other avenues to pursue?
At 30 OCT 2001 06:25PM Oystein Reigem wrote:
That must be the next big project. After the 32-bit version. To rid OI of those troublesome delimiters.
Just trying to make myself popular.
- Oystein -
At 04 NOV 2001 03:57PM j Vaughan wrote:
So this is no, for now? Or no forever?
If it's no for now, when in the future might it be available?
I just need to give my customer an answer, even if it's one they don't like.
At 05 NOV 2001 06:42AM Oystein Reigem wrote:
Mike,
It would be nice if Unicode could be implemented in OpenInsight and kill dead the international-characters-versus-delimiters problem. But there are many questions on the way. I assume you've looked at some of them already.
There are several different Unicode encoding formats. Some of them are fixed-length (e.g. UCS-2 with 2 bytes per character, or UTF-32 with 4), some variable-length (e.g. UTF-8, where a character takes from 1 to 4 bytes).
I believe there are two basic alternatives if one wants to implement a multi-byte character encoding system in a database system like OpenInsight, where special characters or byte values are used to delimit various units of data during storage and computing.
One is to use a fixed-length character encoding format and let the delimiters be multi-byte too. This means among other things that the file system must be rewritten to handle multi-byte characters instead of single-byte characters. I don't expect that can be done overnight.
The other is to handle multi-byte encoded text as much as possible like any other byte sequence, and keep the old single-byte delimiters. But then one must choose an encoding format that avoids collisions with the delimiters. E.g., with a 2-byte encoding format, neither of the 2 bytes may ever be in the range 250-255.
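To make the collision risk concrete, here is a minimal Python sketch. It assumes the delimiters occupy byte values 250-255 as described above; the helper name `delimiter_collisions` and the sample word are only illustrative and have nothing to do with OpenInsight itself. It encodes a common Japanese word in a 2-byte-per-character form (UTF-16) and lists the bytes that land in the delimiter range.

```python
# Byte values assumed to be reserved as system delimiters (250-255).
DELIMITER_BYTES = set(range(250, 256))

def delimiter_collisions(text, encoding):
    """Return the sorted byte values of the encoded text that fall in the delimiter range."""
    return sorted(set(text.encode(encoding)) & DELIMITER_BYTES)

# Katakana for "coffee": U+30B3 U+30FC U+30D2 U+30FC.
# The long-vowel mark U+30FC has 0xFC (252) as one of its two UTF-16 bytes.
sample = "\u30b3\u30fc\u30d2\u30fc"

print(delimiter_collisions(sample, "utf-16-le"))  # [252]
print(delimiter_collisions(sample, "utf-16-be"))  # [252]
```

So even an everyday word would inject a byte that looks like a delimiter if 2-byte text were stored as-is.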
But is the latter possible? Is there a Unicode encoding format (e.g. one that can be used for Japanese) where no byte is in the range 250-255? I believe not.
But there are formats where certain other byte values never occur. E.g., in the UTF-8 2-, 3- and 4-byte encodings every byte has its highest bit set to 1 (to distinguish them from the single-byte UTF-8 encoding, which is plain old 7-bit ASCII). So perhaps by using that old trick with the bi-directional CHARMAP it's possible after all? E.g., shunt 250-255 down by 128.
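One way to probe which byte values UTF-8 actually produces is simply to encode every character and look. The sketch below is an illustration only (the function name `utf8_byte_values` is made up), and it relies on Python's UTF-8 codec, which covers code points up to U+10FFFF. Older definitions of UTF-8 also allowed 5- and 6-byte sequences with lead bytes up to 253, but those only encode values beyond U+10FFFF.

```python
def utf8_byte_values():
    """Collect every byte value that appears when all Unicode scalar values are UTF-8 encoded."""
    used = set()
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate code points cannot be encoded on their own
        used.update(chr(cp).encode("utf-8"))
    return used

used = utf8_byte_values()
unused = sorted(set(range(256)) - used)

print(max(used))  # 244
print(unused)     # [192, 193, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255]
```

Under that assumption, byte values 245-255 never occur in UTF-8 text, so the 250-255 delimiters would not collide with it. Whether the rest of the system could cope with variable-length characters is a separate question.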
Next question is how comparisons and sorting can be done on multi-byte data.
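As a small experiment on the sorting question, the Python sketch below (sample strings chosen arbitrarily) compares plain byte-wise ordering of encoded text with ordering by Unicode code point. It says nothing about proper linguistic collation for Japanese, which needs more than either.

```python
# Sample strings: Kanji, ASCII, Katakana, accented Latin,
# a character beyond U+FFFF, and a fullwidth letter.
words = [
    "\u65e5\u672c",
    "abc",
    "\u30b3\u30fc\u30d2\u30fc",
    "\u00e9t\u00e9",
    "\U00020000",
    "\uff21",
]

by_code_point = sorted(words)  # Python compares str values code point by code point
by_utf8_bytes = sorted(words, key=lambda w: w.encode("utf-8"))
by_utf16_bytes = sorted(words, key=lambda w: w.encode("utf-16-be"))

print(by_utf8_bytes == by_code_point)   # True
print(by_utf16_bytes == by_code_point)  # False
```

A byte-wise comparison of UTF-8 data gives the same order as comparing code points, so existing byte-oriented sort routines would at least behave consistently. With UTF-16 that breaks down for characters beyond U+FFFF, which are stored as surrogate pairs.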
- Oystein -
PS. I don't know that much about Unicode.
But I have colleagues who know a bit more.
And there's the Unicode website.