CHARACTER SETTINGS QUICK HOWTO

Contributed by Ivan Kulemzin <ivk@kristal.ru>, 8 Oct 2002
Edited by Peter Samuelson, 6 Dec 2002

How to set up Samba-TNG to work correctly with internationalized
characters:

There are four parameters in smb.conf for this.

  client code page:
    This parameter specifies the DOS code page that the clients
    accessing Samba are using.  It is a 3- or 4-digit number.  For the
    list of valid code pages, see your code page directory
    (site-dependent, default is /usr/local/samba/lib/codepages) for
    files named "codepage.NNN".  Every NNN is a valid code page.

  character set:
    This allows smbd to map incoming filenames from a DOS code page
    to one of several built-in UNIX character sets.  Valid values are
    "iso8859-1", "iso8859-2", "iso8859-5", "iso8859-7", "iso8859-8",
    "iso8859-9", "iso8859-13", "iso8859-15", "koi8-r", "koi8-u",
    "roman8", "1251", and "1251u".

The two parameters above ("client code page" and "character set") are
used only for filenames.  They will eventually be phased out in favor
of the following:

  dos charset:
    DOS SMB clients assume the server has the same charset as they do.
    This option specifies which charset Samba should use when talking
    to DOS clients.

  unix charset:
    Specifies the charset used by the unix machine Samba runs on.
    Samba needs to know this in order to be able to convert text to
    the characters other SMB clients use.

These are used for non-filename strings, such as the "Full Name" field
in User Manager for Domains.  Valid values are any character set
recognised by `iconv'; to get a nice long list, type `iconv -l'.
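
Because the output format of `iconv -l' differs between iconv
implementations, it can be easier to test a name by attempting a
conversion.  The following is a sketch under that assumption; the
function name charset_known is hypothetical, and the names KOI8-R and
CP866 are only examples that your iconv may spell differently.

```shell
# Succeeds if the local iconv accepts the given charset name.
# charset_known is a hypothetical helper, not part of Samba or iconv.
charset_known() {
    # Converting empty input still fails for an unknown charset name.
    printf '' | iconv -f "$1" -t "$1" >/dev/null 2>&1
}

for cs in KOI8-R CP866; do
    if charset_known "$cs"; then
        echo "$cs: recognised by iconv"
    else
        echo "$cs: not recognised by iconv"
    fi
done
```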


Example:
  We have FreeBSD 4.6 with Samba-TNG installed on it, and some
  Russian Windows 9x/NT/2000 clients.  We need correct Russian
  filenames both on FreeBSD and on the clients.  On FreeBSD we use
  the KOI8-R character set for Russian filenames, so we set
    character set = KOI8-R
  in our smb.conf file.  Windows uses DOS code page 866 for Russian
  filenames:
    client code page = 866

  We also want to see Russian comments for our users and groups, so
  we add:

    dos charset = 866
    unix charset = KOI8-R
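
Independently of Samba, iconv can be used to check that a Russian
string survives conversion between the two charsets in this example.
This is only a sketch: it assumes the shell and this file are in
UTF-8, and that your iconv knows the names KOI8-R and CP866 (some
implementations also accept plain 866).

```shell
# Round-trip a Russian word: UTF-8 -> KOI8-R -> CP866 -> UTF-8.
# The charset names are assumptions about the local iconv.
sample='привет'

roundtrip=$(printf '%s' "$sample" \
    | iconv -f UTF-8 -t KOI8-R \
    | iconv -f KOI8-R -t CP866 \
    | iconv -f CP866 -t UTF-8)

if [ "$roundtrip" = "$sample" ]; then
    echo "round trip ok"
else
    echo "round trip failed or lost characters"
fi
```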

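Unless you know what you're doing, the two sets of parameters should be
set to the same values: "dos charset" should match "client code page",
and "unix charset" should match "character set", as in the example
above.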

At the time of writing, some of the conversion code that uses "dos
charset" and "unix charset" is unfinished.
