Convert with pdm_uconv (umlaut) to UTF-8

1. Convert with pdm_uconv (umlaut) to UTF-8

boho
Posted Jun 17, 2016 07:22 AM

Hi all,
I have a problem converting an external file (from UNIX) to UTF-8 for deref and loading, because in Germany we have umlauts.
The environment: Win R12 R2, SQL, CA ServiceManager R14.1, all in German
I tried a lot of encoding types in pdm_uconv, but I get an error in pdm_load:
dbload 6704 FATAL tokfile.c 537 Invalid character encountered in input
or in pdm_uconv:
Conversion to Unicode from codepage failed at input byte position 280. Bytes: f6 Error: Illegal character found
My question is:
How can I determine the correct codepage?

For example, I call pdm_uconv like this: pdm_uconv -f ibm-850_P100-1995 -t UTF-8 -o __load_cnt_5.txt.test.txt __load_cnt_5.txt

Does anyone have a working example?

Thank you for your help
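
The byte reported in the pdm_uconv error (f6) is a useful clue when guessing the source codepage. The following is a minimal Perl sketch that uses the core Encode module instead of pdm_uconv and prints what this byte means in a few common codepages. The candidate list and the helper name char_for are assumptions for illustration, and the Encode names are not the same as the converter names listed by pdm_uconv -L.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

binmode(STDOUT, ':encoding(UTF-8)');

# Byte reported by pdm_uconv in the error message.
my $byte = "\xF6";

# Hypothetical short list of candidates (Perl Encode names); extend as needed.
my @candidates = ('cp437', 'cp850', 'iso-8859-1', 'iso-8859-15', 'cp1252');

# Return the character that the given raw byte maps to in the given encoding.
sub char_for {
    my ($encoding, $raw) = @_;
    return decode($encoding, $raw);
}

# Print one line per candidate: encoding, byte, code point, character.
for my $enc (@candidates) {
    my $char = char_for($enc, $byte);
    printf "%-12s 0x%02X -> U+%04X %s\n", $enc, ord($byte), ord($char), $char;
}
```

In ISO-8859-1 and Windows-1252 the byte f6 is 'ö', whereas in codepage 850 it is '÷' ('ö' is 0x94 there). So if the text at the reported byte position should contain an 'ö', the file is more likely ISO-8859-1 or Windows-1252 than ibm-850.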
2. Re: Convert with pdm_uconv (umlaut) to UTF-8
Best Answer

boho
Posted Jun 17, 2016 11:16 AM

Because I found no administrator who could give me information about the codepages, I wrote a small Perl script:

#!/usr/bin/perl
use strict;
use warnings;

my $ahd_bin   = "d:/CA/Servic~1/bin";
my $my_path   = "d:/CA/Servic~1/site/mods/TEST_Imp";
my $codepage  = "$my_path/test_codepage.txt";      # receives the output of pdm_uconv -L
# Test input: only 1 record in pdm_load format with last_name="ÄÖÜäöüß",
# so the umlauts are easy to look for in Notepad++
# (in test__error*.txt: last_name: [Ä
my $import    = "$my_path/test__load_cnt_5.txt";
my $convert   = "$my_path/test__convert";
my $error_dat = "$my_path/test__error";

# Write the codepage list of pdm_uconv into a file and read it back.
system("$ahd_bin\\pdm_uconv -L > $codepage");
open(my $codep, '<', $codepage) or die "File '$codepage' is not readable: $!";
while (my $rec = <$codep>) {
    chomp $rec;
    # Split the line on whitespace and try every token as a codepage name.
    foreach my $code_p (split ' ', $rec) {
        my $convert_n = "${convert}__$code_p.txt";
        my $error_msg = "${error_dat}__$code_p.txt";
        # Convert the test file from this codepage to UTF-8, then try to load it;
        # the stderr output of pdm_load goes to a separate file per codepage.
        system("$ahd_bin\\pdm_uconv -f $code_p -t UTF-8 -o $convert_n $import");
        system("$ahd_bin\\pdm_load -v -f $convert_n 2> $error_msg");
    }
}
close($codep);

Now you can use Notepad++ to look for the umlauts in the results, or for example check the 'test__error...' files for update:1
Not all of the codepages convert the file correctly for pdm_load.
Now my import scripts are running!
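
The list of codepages to try can be narrowed before pdm_uconv and pdm_load are run for each one. The following is a minimal Perl sketch, assuming the test record contains the known value ÄÖÜäöüß: it decodes the raw bytes with each candidate and keeps only the encodings that produce exactly that string. The function name matching_encodings, the candidate list and the in-memory sample bytes are assumptions for illustration. The names are Perl Encode names, so a hit still has to be looked up under its corresponding name in the pdm_uconv -L output.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Return the encodings under which the raw bytes decode to text that
# contains the expected string. Encodings that reject the bytes are skipped.
sub matching_encodings {
    my ($raw, $expected, @encodings) = @_;
    my @hits;
    for my $enc (@encodings) {
        my $copy = $raw;    # decode() may modify its input when a CHECK value is given
        my $text = eval { decode($enc, $copy, Encode::FB_CROAK()) };
        next unless defined $text;
        push @hits, $enc if index($text, $expected) >= 0;
    }
    return @hits;
}

# The known test value ÄÖÜäöüß as Unicode characters.
my $expected = "\x{C4}\x{D6}\x{DC}\x{E4}\x{F6}\x{FC}\x{DF}";

# Stand-in for the raw bytes of the test record; here the umlauts are
# stored as ISO-8859-1 bytes (an assumption for this demo).
my $raw = 'last_name="' . pack('C*', 0xC4, 0xD6, 0xDC, 0xE4, 0xF6, 0xFC, 0xDF) . '"';

# Hypothetical candidate list in Perl's naming.
my @candidates = qw(cp437 cp850 iso-8859-1 iso-8859-15 cp1252 UTF-8);

print "$_\n" for matching_encodings($raw, $expected, @candidates);
```

For a real file, read the content in binary mode (binmode) and pass it as the first argument. More than one encoding can match, because several codepages store the German umlauts at the same byte values. Any of the matching ones should convert these characters the same way.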
3. Re: Convert with pdm_uconv (umlaut) to UTF-8

Legacy User
Posted Jun 20, 2016 03:37 AM

Love it when one answers their own question
