Wednesday, July 29, 2015

GNU nano on Windows with UTF-8 and color, howto

Ever since the day I first installed the Debian GNU/Linux distribution, I've been using its default text editor, GNU nano.

I quite like GNU nano. It can edit multiple files at once (combined with grep, find, or a shell glob); it has syntax highlighting; it can undo (as well as redo); etc. Plus, it's extremely simple to use!

For general use on Windows, IMO the best text editor is Notepad2. Yet when I'm developing software, GNU nano is my editor of choice—even when I'm (forced to be) on Windows—for instance, in Git Bash (Git for Windows).

More broadly, if GNU nano is installed on Windows, it might form a good "bridge" to help Windows users adapt, at least partially and in advance, to the UNIX command line (if they ever need it). Then at least the editor will be familiar, and that IMO is the first hurdle.

But first, a little history. Elm (etymology: ELectronic Mail) is a text-based email client. Pine (etym.: the word itself, or Pine Is Nearly Elm) is a text-based email client developed at the University of Washington, where evergreen trees abound. Pico (etym.: PIne COmposer) is a text editor. GNU nano (etym.: a word similar to pico in meaning, or Nano’s ANOther editor) is a free software rewrite of pico which includes many improvements. All of this software is for Unix.

Currently, GNU nano's website says its latest Windows build is version 2.5.3 (of GNU nano). As the download directory informs us, this was last modified on 2016-02-25. Its README.TXT says 'this version of nano for Win32 systems was compiled using...cygwin and PDCurses 2.4.'

However, this GNU nano build was compiled with the option '--disable-utf8' (as the command 'nano --version' reveals).

At present, Windows (even as far back as XP SP3) is pretty thoroughgoing in its support for Unicode, including UTF-8 text. So, because some of my projects need UTF-8, I decided to look into how I could work around this problem.

Ultimately, I extracted nano.exe from a recent version (at that time, 2.1.0-1) of Cygwin. Copying it, along with just a few other files from Cygwin, created a standalone GNU nano which works fine (without the bulk of Cygwin) on my Windows box, and it displays UTF-8. (BTW, I use several Windows computers; I haven't installed Cygwin on all of them.)

Note: you must set the following Windows environment variable (but you can change the 'en_US' part if you like):
  • LANG   :   en_US.UTF-8
You must unset the following Windows environment variable (unless it happens already to have the value 'cygwin'):
  • TERM
If you like, you can do this in a Windows 'batch' file, which then starts GNU nano.
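
If you'd rather not write a batch file, the same idea can be sketched in Python. This is only a minimal sketch under assumptions: the function names and the example path to nano.exe are hypothetical, and it merely builds the environment described above (LANG set, TERM removed unless it is 'cygwin') before starting the editor.

```python
import os
import subprocess


def nano_environment(base_env, lang="en_US.UTF-8"):
    """Return a copy of base_env adjusted as described above."""
    env = dict(base_env)
    env["LANG"] = lang  # the 'en_US' part may be changed
    if env.get("TERM") != "cygwin":
        env.pop("TERM", None)  # unset TERM unless it is already 'cygwin'
    return env


def run_nano(nano_exe, *files):
    """Start nano with the adjusted environment; returns its exit code.

    nano_exe is a hypothetical location, e.g. r"C:\\nano\\bin\\nano.exe".
    """
    return subprocess.call([nano_exe, *files], env=nano_environment(os.environ))
```

Because only the child process's environment is changed, other software (such as Ruby) keeps seeing whatever HOME and TERM you normally have.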

EDIT: Previously, I had suggested setting Windows environment variables HOME and TERM (partly in line with the GNU nano editor's recommendations) but when unconventionally set, these can conflict with other software, notably Ruby.

Typically, when adapted to Windows, most software will calculate the correct 'home' location of user data, but will defer to our choice if we ourselves set the HOME variable. Sometimes that creates problems among the different console-like programs (such as cmd.exe, ConEmu, Git for Windows, Cygwin, etc., especially with their different pathname string formats for the root of a disk drive). One seemingly working value for the HOME variable is %USERPROFILE%.

Also, some software may require the Windows environment variable TERM to be set to some particular value other than 'cygwin'.

The necessary files copied from Cygwin are just:
  • bin\cyggcc_s-1.dll
  • bin\cygiconv-2.dll
  • bin\cygintl-8.dll
  • bin\cygmagic-1.dll
  • bin\cygncursesw-10.dll
  • bin\cygwin1.dll
  • bin\cygz.dll
  • bin\nano.exe
  • etc\nanorc
  • lib\terminfo
  • usr\share\misc\magic
  • usr\share\misc\magic.mgc
  • usr\share\terminfo\63\cygwin
Place these in the same-named folders under your 'nano' program folder. Create an additional folder (in the same place):
  • home\%USERNAME%\
and enjoy.
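
The copying can also be scripted. The following is a rough sketch, not a tested installer: the function name and its arguments are hypothetical, it assumes Python 3.8 or later, and it simply copies whichever of the items listed above it finds, reporting the rest as missing.

```python
import shutil
from pathlib import Path

# The items listed above, relative to the Cygwin root folder.
NANO_ITEMS = [
    "bin/cyggcc_s-1.dll",
    "bin/cygiconv-2.dll",
    "bin/cygintl-8.dll",
    "bin/cygmagic-1.dll",
    "bin/cygncursesw-10.dll",
    "bin/cygwin1.dll",
    "bin/cygz.dll",
    "bin/nano.exe",
    "etc/nanorc",
    "lib/terminfo",
    "usr/share/misc/magic",
    "usr/share/misc/magic.mgc",
    "usr/share/terminfo/63/cygwin",
]


def copy_nano_files(cygwin_root, nano_root, username):
    """Copy the listed items into same-named folders under nano_root.

    Returns the list of relative paths that were not found.
    """
    cygwin_root, nano_root = Path(cygwin_root), Path(nano_root)
    missing = []
    for rel in NANO_ITEMS:
        src, dst = cygwin_root / rel, nano_root / rel
        if src.is_dir():
            shutil.copytree(src, dst, dirs_exist_ok=True)
        elif src.is_file():
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
        else:
            missing.append(rel)
    # The additional folder: home\%USERNAME%\
    (nano_root / "home" / username).mkdir(parents=True, exist_ok=True)
    return missing
```

A call might look like copy_nano_files(r"C:\cygwin", r"C:\nano", os.environ["USERNAME"]) (after importing os), where both folder paths are only examples.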

If you want syntax highlighting, pick up all the syntax highlighting files as well:
  • usr\share\nano\*.nanorc
GNU nano can color-highlight syntax even when you run it from cmd.exe. To enable this, edit etc\nanorc in the following way, but keep its line endings as LF only.

Uncomment any (or all) of that configuration file's 'include ...' lines to pull in whichever of the syntax files above you want. Each one defines a particular syntax which GNU nano will highlight.
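
If your Windows editor insists on CRLF line endings, one way around it is to do the uncommenting with a small script. This is a sketch only: it assumes the commented lines look like '# include ...' (a '#', optional blanks, then the word 'include'), and the function names are hypothetical. It reads and writes the file in binary mode so the existing line endings are left untouched.

```python
import re


def uncomment_includes(text):
    """Strip the leading '#' from lines of the form '# include ...'."""
    return re.sub(r"^#[ \t]*(include[ \t].*)$", r"\1", text, flags=re.MULTILINE)


def uncomment_includes_in_file(path):
    """Rewrite the file in place without any newline translation."""
    with open(path, "rb") as f:
        text = f.read().decode("utf-8")
    with open(path, "wb") as f:
        f.write(uncomment_includes(text).encode("utf-8"))
```

This enables every include line at once; to enable only some, edit the remaining ones back by hand.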

The extension of the file you're editing usually determines which '*.nanorc' syntax file GNU nano will apply. (A regular expression atop each syntax file determines this.)
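
To see roughly how that matching works, here is a small sketch. It assumes the header line has the form: syntax "name" "regex" ... (check your own '*.nanorc' files, since the exact format is an assumption here), and it uses Python's regular expressions, which only approximate the ones GNU nano uses. The function names are hypothetical.

```python
import re


def parse_syntax_header(line):
    """Return (name, [regex, ...]) for a syntax header line, else None."""
    # Example of the assumed form:  syntax "python" "\.py$"
    quoted = re.findall(r'"([^"]*)"', line)
    if not line.lstrip().startswith("syntax") or not quoted:
        return None
    return quoted[0], quoted[1:]


def syntax_applies(header_line, filename):
    """True if any regex in the header matches the file name."""
    parsed = parse_syntax_header(header_line)
    if parsed is None:
        return False
    _name, patterns = parsed
    return any(re.search(p, filename) for p in patterns)
```

For instance, with the header shown in the comment, syntax_applies would accept 'script.py' but not 'script.pyc', because the regex is anchored at the end of the name.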

To highlight trailing spaces (often disliked by Git, the version control software), you might find it useful to append the following to usr\share\nano\default.nanorc (and to any other syntax-highlighting file you choose):

      # Trailing whitespace
      color ,blue "[[:space:]]+$"

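To highlight spaces in front of tabs as well, add the following (the character between the two '+' signs must be a literal tab, not a space):

      # Spaces in front of tabs
      color ,red " +	+"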

If you want to make all tabs and spaces visible whenever you type Alt-P, then uncomment and change the 'set whitespace' line in etc\nanorc to:

      set whitespace "»·"

Also, Lubomir I. Ivanov reports he has made a build for Windows of GNU nano 2.2.6. In addition to '--enable-utf8', he compiled with '--enable-extra', obtaining GNU nano's experimental features such as undo. He added a customized Windows command console as well, so his build may be even better.

Copyright (c) 2015 Mark D. Blackwell.