Locales + Encodings: The Final Chapter

Hopefully this has been the last time I had such trouble with encodings. I think I finally managed to understand what caused all the trouble and how to avoid it in the future.

  1. dpkg-reconfigure locales (mine are 75-78, notably de_DE.utf-8)
  2. export LANG=de_DE.utf-8
  3. The “file” command actually detects iso/utf-8 files correctly (if in doubt, fall back on hexdumps and the encoding manpages!)
  4. In Java, read files with the encoding set explicitly! (use iconv to convert strange input files to something Java understands)
  5. In Java, write files with the encoding set explicitly; yes, force it! Between 4 and 5 you have full control over the conversion of encodings
  6. Yes, there are Java IO classes that let you set the encoding (InputStreamReader and OutputStreamWriter both accept one)
  7. Configure the database driver correctly: jdbc.encoding=utf-8 (it does not matter if the database itself doesn’t know utf-8)
  8. Use the same encoding wherever you can!!!
  9. Configure your shell and terminal program to display the encoding correctly (see 1)
  10. Remember: utf-8 uses multibyte characters; that’s what the strange button in your terminal program is about
  11. Don’t forget to configure log4j to write utf-8 encoded files (really helps with debugging)
  12. Set Eclipse (or NetBeans, or whatever you use) to utf-8 encoding
  13. Configure your ant file so that your source code is read as utf-8 in case you have utf-8 Strings in your classes …
  14. Don’t forget to declare the encoding in the xml header of your xml files.
  15. If you don’t like utf-8, substitute your favorite encoding.
  16. MOST IMPORTANT: use the same encoding everywhere for your stuff. If something fails later, you know it’s not your stuff.
  17. Now and then, check your stuff on a different machine without any locales set, just so you know your Java code works independently of your local settings!
  18. Ahem: without thorough tests you are doomed! And listen when your tests scream: ERROR!
  19. Believe me when I say: ISO is not enough!
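
Points 4 to 6 could look roughly like the sketch below. It is not the code from the original project: the class name EncodingDemo and the convert helper are made up for illustration, and it assumes Java 7 or later for java.nio.file. The point is that both the reader and the writer get an explicit charset, so the platform default never enters the picture.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, for illustration only.
public class EncodingDemo {

    // Copies a text file, decoding the input with "from" and encoding
    // the output with "to". Neither side uses the platform default.
    static void convert(Path in, Charset from, Path out, Charset to) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, from);
             BufferedWriter writer = Files.newBufferedWriter(out, to)) {
            char[] buf = new char[8192];
            int n;
            while ((n = reader.read(buf)) != -1) {
                writer.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("latin1-", ".txt");
        Path out = Files.createTempFile("utf8-", ".txt");

        // Unicode escapes (u-umlaut, sharp s) keep this source file pure ASCII,
        // so the example does not depend on the compiler's source encoding.
        Files.write(in, "Gr\u00fc\u00dfe".getBytes(StandardCharsets.ISO_8859_1));

        convert(in, StandardCharsets.ISO_8859_1, out, StandardCharsets.UTF_8);

        // The two non-ASCII characters take one byte each in ISO-8859-1
        // and two bytes each in UTF-8.
        System.out.println(Files.size(in) + " bytes in, " + Files.size(out) + " bytes out");
        // prints "5 bytes in, 7 bytes out"
    }
}
```

On older JDKs the same idea should work by wrapping a FileInputStream in an InputStreamReader and a FileOutputStream in an OutputStreamWriter, passing the charset name to each constructor.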

That’s the stuff I learned from all my troubles with encodings. Any questions? I think I may be able to answer a few at least where it concerns java and shell configuration 😉