Generation of XML files in UTF-8 format

Hi,

I'm trying to generate XML files in UTF-8 format using Progress, in order to display Hebrew characters.
The current code page is "ISO8859-1". The only code pages that seem to support Hebrew are "ISO8859-8" and "IBM862".
Is there any way to convert the code page from "ISO8859-1" to either "ISO8859-8" or "IBM862"? Or is a code page conversion from "ISO8859-1" to "UTF-8" sufficient?
Please advise.

Thanks.



it doesn't matter if you're working with xml dom, sax or regular file streaming statements (output to, put, etc.).

if you have a character in codepage utf8 that cannot be mapped to a character in the codepage you're using, it cannot be converted.

for example: if you have a utf-8 xml file with a text node that contains the letter alef, and the progress internal codepage is iso8859-1 (a latin codepage), that character cannot be converted because there is no alef in iso8859-1.
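
to see which codepages are actually in play, you can check the session attributes first. this is only a small sketch: SESSION:CPINTERNAL and SESSION:CPSTREAM report the values set by the -cpinternal and -cpstream startup parameters.

```abl
/* Show the code pages this session uses in memory and for stream I/O. */
MESSAGE "Internal code page (-cpinternal):" SESSION:CPINTERNAL SKIP
        "Stream code page (-cpstream):" SESSION:CPSTREAM
    VIEW-AS ALERT-BOX INFORMATION.
```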

if you're using xml dom you'll get chr(26) or chr(127) in these cases (depending on whether you're using chui or gui). chr(26) is an illegal xml 1.0 character and chr(127) is a discouraged one, and in both cases the original character is lost, which corrupts your document.

the bottom line is that you'll need to escape (without using xml entities) the characters that cannot be converted when reading the xml file and convert them back when saving.

we support exactly these cases in the slibooxml project and we have customers in europe, the states and even israel with just these cases.
http://www.oehive.org/project/libooxml

we also have extensive experience with the bi-directional algorithm (bidi), which might be relevant if you also have chui clients in your setup and need to support hebrew.

if you're in israel, you can call me locally on 054-2188086. alon


CODEPAGE-CONVERT

Hi Sharon

Here is an example of how to use CODEPAGE-CONVERT.
What we usually do is write the XML to a longchar, then convert the longchar to the new codepage, then write to file.

DEFINE VARIABLE cp850string AS CHARACTER NO-UNDO INITIAL "text with umlaut (ä)".
DEFINE VARIABLE charsetstring AS CHARACTER NO-UNDO.

/* Arguments: source string, target code page, source code page. */
charsetstring = CODEPAGE-CONVERT(cp850string, SESSION:CHARSET, "ibm850").

FOR EACH Item NO-LOCK:
    IF LOOKUP(charsetstring, Item.CatDescription) > 0 THEN
        DISPLAY Item.ItemName.
END.


CONVERT TARGET <codepage> Source session:charset

Hi Ryno,

Thanks for the reply. This works great when we use it with the DISPLAY statement. However, the value needs to be output to an XML file.

I am using the "OUTPUT TO CONVERT TARGET <codepage> Source session:charset" statement, with "UTF-8" as the code page. This is not working.

I tried using "ibm850" as the code page, but this doesn't seem to work either.

When the XML is opened in Internet Explorer, it throws an "Invalid character" error.

Any idea on how to do this?

Thanks.


If it's from a ProDataSet or temp-table, you can use the WRITE-XML method to write it to a LONGCHAR variable, then use COPY-LOB:

COPY-LOB FROM lcFileContent TO FILE cFileName CONVERT TARGET CODEPAGE "ISO8859-8":U NO-ERROR.
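
Put together, the two steps might look like the sketch below. It assumes an OpenEdge 10.1 or later release (WRITE-XML is not available in version 9), and the temp-table ttItem and the file name items.xml are made up for illustration. The encoding passed to WRITE-XML ends up in the XML declaration, so it should match the target code page used by COPY-LOB; here both are "UTF-8".

```abl
/* Hypothetical temp-table, used only to have something to serialize. */
DEFINE TEMP-TABLE ttItem NO-UNDO
    FIELD ItemName AS CHARACTER.

DEFINE VARIABLE lcFileContent AS LONGCHAR  NO-UNDO.
DEFINE VARIABLE cFileName     AS CHARACTER NO-UNDO INITIAL "items.xml".
DEFINE VARIABLE lOk           AS LOGICAL   NO-UNDO.

CREATE ttItem.
ASSIGN ttItem.ItemName = "example".

/* Step 1: serialize the temp-table into the LONGCHAR (formatted, UTF-8). */
lOk = TEMP-TABLE ttItem:WRITE-XML("LONGCHAR", lcFileContent, TRUE, "UTF-8").

/* Step 2: write the LONGCHAR to disk in the same code page. */
IF lOk THEN DO:
    COPY-LOB FROM lcFileContent TO FILE cFileName
        CONVERT TARGET CODEPAGE "UTF-8" NO-ERROR.
    IF ERROR-STATUS:ERROR THEN
        MESSAGE ERROR-STATUS:GET-MESSAGE(1) VIEW-AS ALERT-BOX ERROR.
END.
```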


I see you can also set the encoding on the WRITE-XML method, so you can convert straight to "ISO8859-8" :)
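
A minimal sketch of that variant, again assuming OpenEdge 10.1 or later; ttItem and items.xml are hypothetical names. The fourth argument is the encoding, shown here as "UTF-8", and "ISO8859-8" would go in the same place.

```abl
/* Hypothetical temp-table, used only to have something to serialize. */
DEFINE TEMP-TABLE ttItem NO-UNDO
    FIELD ItemName AS CHARACTER.

DEFINE VARIABLE cFileName AS CHARACTER NO-UNDO INITIAL "items.xml".
DEFINE VARIABLE lOk       AS LOGICAL   NO-UNDO.

CREATE ttItem.
ASSIGN ttItem.ItemName = "example".

/* Arguments: target type, file name, formatted output, encoding. */
lOk = TEMP-TABLE ttItem:WRITE-XML("FILE", cFileName, TRUE, "UTF-8").

IF NOT lOk THEN
    MESSAGE "WRITE-XML failed" VIEW-AS ALERT-BOX ERROR.
```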


Generation of XML files in UTF-8 format

Hi,
Thanks for the reply. We are using Progress version "9.1E", so we cannot use any of these methods. We just use a "PUT" statement and output the values into the file. Any idea on how to do it this way?
Thanks.
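
For reference, a PUT-based version might look like the sketch below. It is not a confirmed fix: it assumes the Item table from the earlier example and a made-up file name, and it escapes only "&" and "<". The conversion only helps for characters that exist in the source code page. ISO8859-1 has no Hebrew letters, so if the data is really stored as Hebrew bytes, the SOURCE code page may need to be "ISO8859-8" instead (an assumption about the data, not something confirmed in this thread).

```abl
DEFINE VARIABLE cFileName AS CHARACTER NO-UNDO INITIAL "items.xml". /* hypothetical name */
DEFINE VARIABLE cValue    AS CHARACTER NO-UNDO.

/* Convert everything written to this stream from the session code page to UTF-8. */
OUTPUT TO VALUE(cFileName) CONVERT TARGET "UTF-8" SOURCE SESSION:CHARSET.

/* The declared encoding must match the bytes actually written. */
PUT UNFORMATTED '<?xml version="1.0" encoding="UTF-8"?>' SKIP.
PUT UNFORMATTED "<items>" SKIP.

FOR EACH Item NO-LOCK:
    /* Escape the characters that would otherwise be read as markup. */
    cValue = REPLACE(REPLACE(Item.ItemName, "&", "&amp;"), "<", "&lt;").
    PUT UNFORMATTED "  <item>" cValue "</item>" SKIP.
END.

PUT UNFORMATTED "</items>" SKIP.
OUTPUT CLOSE.
```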


What is your e-mail?

It is possible to generate XML in Progress 9.1D.