Clarity Service Management


Convert with pdm_uconv (umlaut) to UTF-8

  • 1.  Convert with pdm_uconv (umlaut) to UTF-8

    Posted 06-17-2016 07:22 AM

    Hi all,

    I have a problem converting an external file (from UNIX) to UTF-8 for deref and loading, because in Germany we have umlauts.

    The environment: Win R12 R2, SQL, CA ServiceManager R14.1, all in German

    I tried many encoding types in pdm_uconv, but I got an error in pdm_load:

         dbload    6704 FATAL    tokfile.c    537  Invalid character encountered in input

    or in pdm_uconv:

         Conversion to Unicode from codepage failed at input byte position 280. Bytes: f6 Error: Illegal character found

    My question is:

    How can I determine the correct codepage?

     

    For example, I call pdm_uconv like this:

         pdm_uconv -f ibm-850_P100-1995 -t UTF-8 -o __load_cnt_5.txt.test.txt __load_cnt_5.txt

    Does anyone have a working example?

    Thank you for your help.


  • 2.  Re: Convert with pdm_uconv (umlaut) to UTF-8
    Best Answer

    Posted 06-17-2016 11:16 AM
    Because I could not find an administrator who could tell me about the codepages, I wrote a small Perl script:

     

     

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $ahd_bin   = "d:/CA/Servic~1/bin";
    my $my_path   = "d:/CA/Servic~1/site/mods/TEST_Imp";
    my $codepage  = "$my_path/test_codepage.txt";      # codepage names listed by pdm_uconv -L
    # Test input: only 1 record in pdm_load format with last_name="ÄÖÜäöüß".
    # The umlauts are easy to look for in Notepad++,
    # e.g. in test__error*.txt: last_name: [Ä
    my $import    = "$my_path/test__load_cnt_5.txt";
    my $convert   = "$my_path/test__convert";
    my $error_dat = "$my_path/test__error";

    # Write the list of codepages to a file.
    system("$ahd_bin\\pdm_uconv -L > $codepage");

    open(my $codep, "<", $codepage) or die "File '$codepage' is not readable: $!";
    while (my $rec = <$codep>) {
        chomp($rec);
        # A line may hold several names; split(' ') also skips empty fields.
        foreach my $code_p (split(' ', $rec)) {
            my $convert_n = "${convert}__$code_p.txt";
            my $error_msg = "${error_dat}__$code_p.txt";
            # Convert the test file from this codepage to UTF-8, then try to load it.
            system("$ahd_bin\\pdm_uconv -f $code_p -t UTF-8 -o $convert_n $import");
            system("$ahd_bin\\pdm_load -v -f $convert_n 2> $error_msg");
        }
    }
    close($codep);

     

    Now you can use Notepad++ to look for the umlauts in the results or, for example, in the 'test__error...' files for update:1.

    Not every codepage converts the file correctly for pdm_load.
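
    Checking every result by hand in Notepad++ gets tedious when there are many codepages. As a rough sketch (in Python, not part of the script above), the converted files could be filtered automatically: keep only those that are valid UTF-8 and contain the expected test string. The function name good_conversions is made up for this illustration, and the file name pattern test__convert__<codepage>.txt is assumed from the script above.

```python
from pathlib import Path

EXPECTED = "ÄÖÜäöüß"          # test value of last_name in the import file
PREFIX = "test__convert__"    # assumed file name prefix written by the script above


def good_conversions(directory, expected=EXPECTED):
    """Return the codepage names whose converted file is valid UTF-8
    and contains the expected umlaut string."""
    hits = []
    for path in sorted(Path(directory).glob(PREFIX + "*.txt")):
        try:
            text = path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            continue  # the file is not valid UTF-8 at all
        if expected in text:
            # Strip prefix and ".txt" to recover the codepage name.
            hits.append(path.name[len(PREFIX):-len(".txt")])
    return hits


if __name__ == "__main__":
    # The directory is the one used in the script above; adjust as needed.
    for name in good_conversions("d:/CA/Servic~1/site/mods/TEST_Imp"):
        print(name)
```

    This only narrows the list. Whether pdm_load accepts a file still has to be read from the matching 'test__error...' file.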

    Now my import scripts are running!!
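
    A possible shortcut for the original question, again only as a sketch: the pdm_uconv error reports the byte f6, which is 'ö' in ISO-8859-1 and Windows-1252 but '÷' in codepage 850. If you know which text a few raw bytes of the file should represent, you can decode them with some candidate encodings and see which ones give that text. The sample bytes and the candidate list below are assumptions for illustration, and the Python codec names are not the converter names that pdm_uconv expects, so the result only tells you which family to look for in the pdm_uconv -L output.

```python
# Hypothetical sample: the bytes that should read "ÄÖÜäöüß" in the source file.
SAMPLE = bytes.fromhex("c4d6dce4f6fcdf")
CANDIDATES = ["cp850", "cp437", "cp1252", "latin-1", "utf-8"]


def matching_encodings(raw, expected, candidates=CANDIDATES):
    """Return the candidate encodings under which `raw` decodes to `expected`."""
    hits = []
    for name in candidates:
        try:
            if raw.decode(name) == expected:
                hits.append(name)
        except UnicodeDecodeError:
            pass  # the bytes are not valid in this encoding
    return hits


if __name__ == "__main__":
    print(matching_encodings(SAMPLE, "ÄÖÜäöüß"))
```

    For this sample the call returns ['cp1252', 'latin-1'], so the Windows-1252 and ISO-8859-1 entries would be the first ones to try as the -f argument.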



  • 3.  Re: Convert with pdm_uconv (umlaut) to UTF-8

    Posted 06-20-2016 03:37 AM

    Love it when one answers their own question