Hi!
I upgraded my VDR from 1.5.1 to 1.5.9 yesterday. It works fine, but I have a question:
On some channels the EPG contains characters in ISO-8859-2. These characters are replaced with '?'. How can I set the EPG charset in VDR? An option in setup.conf would be useful, e.g. EPGCharset = ISO-8859-2
Boguslaw Juza
On 09/01/07 16:52, Boguslaw Juza wrote:
Hi!
I upgraded my VDR from 1.5.1 to 1.5.9 yesterday. It works fine, but I have a question:
On some channels the EPG contains characters in ISO-8859-2. These characters are replaced with '?'. How can I set the EPG charset in VDR? An option in setup.conf would be useful, e.g. EPGCharset = ISO-8859-2
The character set is defined in the first byte(s) of the data that is broadcast for each string. VDR uses that information to convert that string to the character set used on your system.
So the only thing you can change is the character set of your system (or at least of the shell that runs VDR). If you receive channels that use different character sets, you may want to use UTF-8. In that case you should see all characters correctly, no matter which character set the broadcaster uses.
Unless, of course, the broadcaster doesn't set the character set encoding correctly according to the DVB standard. The German pay-TV provider "Premiere" is notorious for this...
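The first-byte rule described above can be sketched roughly as follows. This is not VDR's actual code: dvb_charset() is a hypothetical helper, and the tag values are taken from my reading of ETSI EN 300 468 Annex A, so treat them as assumptions to check against the standard. Tags that the sketch does not handle return NULL.

```c
#include <assert.h>
#include <stddef.h>

/* ISO-8859-N names indexed by N; N = 0 and N = 12 are not assigned. */
static const char *const iso8859[16] = {
    NULL,          "ISO-8859-1",  "ISO-8859-2",  "ISO-8859-3",
    "ISO-8859-4",  "ISO-8859-5",  "ISO-8859-6",  "ISO-8859-7",
    "ISO-8859-8",  "ISO-8859-9",  "ISO-8859-10", "ISO-8859-11",
    NULL,          "ISO-8859-13", "ISO-8859-14", "ISO-8859-15"
};

/*
 * Hypothetical helper: returns the character set announced by the leading
 * byte(s) of a DVB text string and stores in *skip how many bytes of the
 * string are tag bytes. Returns NULL for tags this sketch does not handle.
 */
const char *dvb_charset(const unsigned char *s, size_t len, size_t *skip)
{
    assert(s != NULL && skip != NULL);
    *skip = 0;

    /* No tag (first byte >= 0x20): the default table applies. */
    if (len == 0 || s[0] >= 0x20)
        return "ISO6937";

    unsigned char tag = s[0];

    /* Single-byte tags 0x01..0x0B select ISO-8859-5 .. ISO-8859-15. */
    if (tag >= 0x01 && tag <= 0x0B) {
        *skip = 1;
        return iso8859[tag + 4];
    }

    /* 0x10 0x00 N selects ISO-8859-N. */
    if (tag == 0x10 && len >= 3 && s[1] == 0x00 && s[2] >= 1 && s[2] <= 15) {
        *skip = 3;
        return iso8859[s[2]];
    }

    if (tag == 0x11) {          /* ISO/IEC 10646 Basic Multilingual Plane */
        *skip = 1;
        return "ISO-10646-UCS-2";
    }
    if (tag == 0x15) {          /* UTF-8 */
        *skip = 1;
        return "UTF-8";
    }
    return NULL;                /* reserved or not covered here */
}
```

With this sketch, a string that starts with an ordinary letter is reported as ISO6937, while one that starts with the bytes 0x10 0x00 0x02 is reported as ISO-8859-2 with three tag bytes to skip.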
Klaus
On Sat, 1 Sep 2007, Klaus Schmidinger wrote:
The character set is defined in the first byte(s) of the data that is broadcast for each string. VDR uses that information to convert that string to the character set used on your system.
Which function does this conversion, and where is it?
Boguslaw Juza
On 9/1/07, Boguslaw Juza bogdan@uci.agh.edu.pl wrote:
Which function does this conversion, and where is it?
Why not just "export LANG=" at the beginning of your startup script to set the locale?
BR.
On Sat, 1 Sep 2007, Stone wrote:
Why not just "export LANG=" at the beginning of your startup script to set the locale?
I have set it to pl_PL.ISO-8859-2. If I set it to pl_PL.UTF-8, the characters are not displayed correctly, but they are not converted to '?' :).
Boguslaw Juza
On 9/1/07, Boguslaw Juza bogdan@uci.agh.edu.pl wrote:
I have set it to pl_PL.ISO-8859-2. If I set it to pl_PL.UTF-8, the characters are not displayed correctly, but they are not converted to '?' :).
When you export LANG=pl_PL.ISO-8859-2, does VDR report on startup that the locale is recognized?
Regards.
On Sat, 1 Sep 2007, Stone wrote:
When you export LANG=pl_PL.ISO-8859-2, does VDR report on startup that the locale is recognized?
I do:
export LC_CTYPE=pl_PL
That is enough. In vdr.c:
... if (setlocale(LC_CTYPE, "")) CodeSet = nl_langinfo(CODESET); ...
This sets CodeSet to "ISO-8859-2", and it is recognized and works :)
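For anyone who wants to check what their environment yields, the same two calls can be tried outside VDR. This is only a minimal sketch: codeset_for_locale() is a hypothetical helper, not something from vdr.c, and the result depends on which locales are installed on the system.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper: switches LC_CTYPE to the given locale and returns
 * the codeset name reported for it, or NULL if the locale is not accepted.
 * Passing "" means "take the locale from the environment", which is what
 * the vdr.c snippet above does.
 */
const char *codeset_for_locale(const char *locale)
{
    assert(locale != NULL);
    if (setlocale(LC_CTYPE, locale))
        return nl_langinfo(CODESET);
    return NULL;
}
```

Called as codeset_for_locale("") from a shell where LC_CTYPE=pl_PL has been exported, this should give the same "ISO-8859-2" as reported above, provided the pl_PL locale is installed.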
Boguslaw Juza
On Sat, 1 Sep 2007, Klaus Schmidinger wrote:
The character set is defined in the first byte(s) of the data that is broadcast for each string. VDR uses that information to convert that string to the character set used on your system.
Well, I found this piece of code and checked the returned charset tags. For this channel they were lots of different numbers, e.g. 109, 90, 54, 82, 52, 74, 49, 83, 80, 86, 49, 65, 85, 83...
So I set:
const char *cs = "ISO-8859-2";
in getCharacterTable() in si.c and it's much better :).
Boguslaw Juza
On 09/01/07 21:26, Boguslaw Juza wrote:
On Sat, 1 Sep 2007, Klaus Schmidinger wrote:
The character set is defined in the first byte(s) of the data that is broadcast for each string. VDR uses that information to convert that string to the character set used on your system.
Well, I found this piece of code and checked the returned charset tags. For this channel they were lots of different numbers, e.g. 109, 90, 54, 82, 52, 74, 49, 83, 80, 86, 49, 65, 85, 83...
Those are all normal ASCII characters, which means that the channel does not provide any character set information. This, in turn, means that the strings should be encoded in the default character set, which, according to the DVB standard, is ISO6937.
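A quick way to confirm this for the values reported above is to check that none of them falls below 0x20, which is where the character set tags live. The helper name all_untagged() is made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The first bytes reported for the channel in question. */
static const unsigned char reported[] = {
    109, 90, 54, 82, 52, 74, 49, 83, 80, 86, 49, 65, 85, 83
};

/*
 * Hypothetical helper: returns true if none of the given first bytes is a
 * character set tag, i.e. all of them are ordinary text bytes (>= 0x20).
 */
bool all_untagged(const unsigned char *first_bytes, size_t n)
{
    assert(first_bytes != NULL);
    for (size_t i = 0; i < n; i++)
        if (first_bytes[i] < 0x20)
            return false;
    return true;
}
```

all_untagged(reported, sizeof reported) is true for this list, since the smallest value in it is 49.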
So I set:
const char *cs = "ISO-8859-2";
in getCharacterTable() in si.c and it's much better :).
If the strings are displayed correctly when you set the default to ISO-8859-2, then the broadcaster is not encoding the data correctly.
This may be a viable workaround for you, but you may run into trouble with other channels that encode their data correctly.
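To illustrate what treating the untagged bytes as ISO-8859-2 amounts to, here is a rough sketch that uses POSIX iconv directly. VDR's own conversion routine is not shown in this thread, so convert_text() is a hypothetical stand-in and the use of iconv here is an assumption made for illustration.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper: converts in[0..in_len) from charset `from` to
 * charset `to`, writing a NUL-terminated result into out.
 * Returns the number of bytes written (without the NUL), or (size_t)-1 if
 * the conversion is unavailable, the input is invalid, or out is too small.
 */
size_t convert_text(const char *from, const char *to,
                    const char *in, size_t in_len,
                    char *out, size_t out_size)
{
    assert(from && to && in && out && out_size > 0);

    iconv_t cd = iconv_open(to, from);   /* note the order: to, then from */
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *src = (char *)in;              /* iconv does not modify the input */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = out_size - 1;      /* keep room for the NUL */

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;

    *dst = '\0';
    return (size_t)(dst - out);
}
```

For example, the single byte 0xB3 is the Polish letter "ł" in ISO-8859-2, and converting it to UTF-8 this way gives the two bytes 0xC5 0x82. The same byte read through a different source character set would come out as something else, which is why a hard-coded default can break channels that encode their data correctly.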
I suggest contacting the provider/broadcaster and complaining about their failure to adhere to the DVB standard.
Well, I've contacted Premiere twice about their wrong encoding, but apparently they don't give a sh*t... Maybe we should start a coordinated effort to pester them long enough until they give up their ignorance ;-)
Klaus Schmidinger