Character maps – Crunch CRiSP File Editor 6 User Manual

An undo can itself be undone. This is termed redo.

Character Maps

A character map is a way of viewing data through a window. Character maps are used to implement the
'view' and 'literal' macros.

A character map is simply an array of 256 strings, one for each possible byte value. When a particular byte
is to be displayed in a window, the corresponding string is displayed. For example, with the default map the
byte 0x41 is displayed as the letter 'A'. This is configurable: a user who works with a non-ASCII character
set such as EBCDIC can modify the character map so that the EBCDIC code for 'A' is the entry displayed
as 'A' instead.
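
As an illustration only, the idea of a 256-entry array of display strings can be sketched in Python. This is not CRiSP macro code; the names build_ascii_map and render are hypothetical, and the "." placeholder for non-printable bytes is an assumption made for the sketch.

```python
def build_ascii_map():
    """Return a list of 256 display strings, indexed by byte value."""
    table = []
    for value in range(256):
        if 0x20 <= value <= 0x7E:
            table.append(chr(value))   # printable ASCII shows as itself
        else:
            table.append(".")          # assumed placeholder for everything else
    return table


def render(data, char_map):
    """Translate each byte of data through char_map and join the results."""
    return "".join(char_map[value] for value in data)


ascii_map = build_ascii_map()

# A partial EBCDIC-style variant: in EBCDIC the letter 'A' is byte 0xC1.
# Only this one entry is changed here; a full EBCDIC map would redefine
# every position.
ebcdic_map = list(ascii_map)
ebcdic_map[0xC1] = "A"

ascii_view = render(b"\xc1", ascii_map)
ebcdic_view = render(b"\xc1", ebcdic_map)
```

The same byte is shown differently depending on which map the data is viewed through; the data itself is never altered.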

The 8-bit character set has four important regions:

Characters 0x00..0x1f (the control characters)

Characters 0x20..0x7e (the printable ASCII character set)

Character 0x7f (ASCII DEL)

Characters 0x80..0xff

All terminals can display the printable ASCII characters, but few provide facilities to display the others.
In vi(1), for instance, a control character is displayed as a caret followed by the corresponding printable
character, so it takes up two character positions on the screen.
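
The four regions and the caret notation can be sketched as follows. This is a Python illustration rather than CRiSP code, and the function names region and caret are hypothetical. The caret form is obtained by flipping bit 0x40 of the byte value, which turns 0x01 into 'A' and 0x7f into '?'.

```python
def region(value):
    """Classify a byte value into one of the four regions of the 8-bit set."""
    if value <= 0x1F:
        return "control"
    if value <= 0x7E:
        return "printable"
    if value == 0x7F:
        return "del"
    return "high"


def caret(value):
    """Caret notation for a control character or DEL, e.g. 0x01 gives '^A'."""
    if region(value) not in ("control", "del"):
        raise ValueError("caret notation applies only to control characters and DEL")
    return "^" + chr(value ^ 0x40)


control_a = caret(0x01)
tab = caret(0x09)
delete = caret(0x7F)
```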

CRiSP supports terminals which can print all 8-bit characters, e.g. PC terminals amongst others. Each user
may have a preference for how these non-printing characters are to be printed, e.g. whether in C-style octal
notation, hexadecimal notation, or maybe in the DEC editor EDT style (using keywords inside angle
brackets, e.g. <TAB>).
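
The three notations mentioned above might be generated as in the following Python sketch. The function names are hypothetical, and the keyword table contains only the <TAB> entry given in the text; any other keywords would need to be supplied by the user, so the sketch falls back to hexadecimal for bytes without one.

```python
def octal_style(value):
    """C-style octal escape, e.g. 0x09 gives '\\011'."""
    return "\\%03o" % value


def hex_style(value):
    """Hexadecimal escape, e.g. 0x09 gives '\\x09'."""
    return "\\x%02x" % value


# Keyword names inside angle brackets; only <TAB> comes from the manual text.
KEYWORDS = {0x09: "<TAB>"}


def keyword_style(value):
    """Keyword notation where a keyword is known, hexadecimal otherwise."""
    return KEYWORDS.get(value, hex_style(value))


tab_octal = octal_style(0x09)
tab_hex = hex_style(0x09)
tab_keyword = keyword_style(0x09)
```

Building a complete map in any of these styles is then a matter of applying the chosen function to each non-printing byte value.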

Even if a user is happy with one means of displaying characters, they may sometimes find it inflexible.
For example, if the terminal does not support a full 8-bit character set, then control-A appears as
'^A'. When viewing a binary file, it may be better to have each character occupying the same width on the
screen so that it is easier to look at character offsets.
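
The benefit of a fixed-width map can be seen in a small Python sketch (hypothetical names, not CRiSP code). With a variable-width map the screen column of a byte depends on every byte before it; with a map in which every entry is two characters wide, the column is simply twice the offset.

```python
def display_string(value):
    """Variable-width style: printable as-is, caret form for controls, hex otherwise."""
    if 0x20 <= value <= 0x7E:
        return chr(value)
    if value <= 0x1F or value == 0x7F:
        return "^" + chr(value ^ 0x40)
    return "\\x%02x" % value


variable_map = [display_string(value) for value in range(256)]
fixed_map = ["%02x" % value for value in range(256)]   # every entry 2 wide


def column_of(data, offset, char_map):
    """Screen column at which the byte at the given offset starts."""
    return sum(len(char_map[value]) for value in data[:offset])


data = b"\x01A\x02B"
variable_column = column_of(data, 3, variable_map)   # depends on earlier bytes
fixed_column = column_of(data, 3, fixed_map)         # always offset * 2
```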

All this functionality is supported by the character maps. (For real examples of using the character maps,
refer to the view.cr macro.)

CRiSP supports an arbitrary number of character maps. Initially CRiSP starts off with a default character
map, in which the display of the non-printing characters depends on whether the terminal supports 8-bit
characters. If it does not, control characters print in up-arrow (caret) format, and the characters with the
top bit set print in the "\x" style notation.

Character maps are created via the create_char_map() primitive. This primitive is supplied with a list of
strings, normally corresponding to each ASCII character position, i.e. a list of 256 strings should be specified
to remap the entire display character set. Any characters not defined are inherited from the base character
map. Character maps are given identifiers which are used with the other character map primitives.
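
The inheritance behaviour described above can be modelled in Python as below. This is a conceptual sketch only: make_char_map and its start parameter are hypothetical and do not represent the argument list of the create_char_map() primitive, and the control-character names are invented for the example.

```python
def make_char_map(base_map, strings, start=0):
    """Return a new 256-entry map.

    Entries start, start+1, ... are taken from strings; every other
    entry is inherited unchanged from base_map.
    """
    if start < 0 or start + len(strings) > 256:
        raise ValueError("strings do not fit in a 256-entry map")
    new_map = list(base_map)                          # inherit everything first
    new_map[start:start + len(strings)] = strings     # then override a run
    return new_map


# A stand-in base map: printable ASCII as itself, '?' for the rest.
base_map = [chr(v) if 0x20 <= v <= 0x7E else "?" for v in range(256)]

# Redefine only the first three positions; the other 253 are inherited.
custom_map = make_char_map(base_map, ["<NUL>", "<SOH>", "<STX>"])
```

The base map is left untouched, so several maps can be derived from it independently.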

Each character may have a flag associated with it. These flags are used to handle the TAB and
BACKSPACE characters. It is not sufficient to map TABs to some fixed output string -- the tab character
may need to be translated to a variable number of spaces. Likewise, in ANSI mode, the BACKSPACE
character may need interpreting to implement bolding and underlining, e.g. when viewing the output of the
'man' command. The character map system is generic enough that any character can be defined to be a
TAB on display (even more than one at the same time).
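
Why a flag is needed rather than a fixed string can be shown with a Python sketch. The flag value TAB_FLAG, the function name and the tab width of 8 are assumptions made for the illustration; they are not CRiSP's actual flag names or defaults.

```python
TAB_FLAG = 1   # hypothetical flag bit: "expand this byte as a tab"


def render_with_flags(data, char_map, flags, tab_width=8):
    """Render data, expanding any byte flagged as a tab to the next tab stop."""
    pieces = []
    column = 0
    for value in data:
        if flags[value] & TAB_FLAG:
            # The number of spaces depends on the current column, so no
            # single fixed string in the map could represent it.
            text = " " * (tab_width - column % tab_width)
        else:
            text = char_map[value]
        pieces.append(text)
        column += len(text)
    return "".join(pieces)


char_map = [chr(v) if 0x20 <= v <= 0x7E else "." for v in range(256)]

flags = [0] * 256
flags[0x09] = TAB_FLAG            # the usual TAB character

both_flags = list(flags)
both_flags[ord("|")] = TAB_FLAG   # a second character also acting as a tab

line = render_with_flags(b"ab\tc", char_map, flags)
piped = render_with_flags(b"a|b", char_map, both_flags)
```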

Character maps provide a view onto a buffer, via a window. It is important to understand what character
maps are doing because you can get into strange corners if they are not treated with respect.

Consider the case where you want to look at a binary file, e.g. a Unix directory. Unix directories are like
normal files but include text (filenames) and binary data (inode numbers). Looking at a directory in pure text
mode is untidy because of the variable width nature of the binary character set. Looking at a directory in
pure binary mode makes viewing the filenames painful. What you can do is create two windows onto the
same buffer -- one displaying in binary (type 'view hex'), and another viewing the file in ASCII (normal
mode).
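
The two-window arrangement amounts to rendering one buffer through two different maps. In the Python sketch below, the four bytes are made-up sample data standing in for part of a directory entry, and the map and variable names are hypothetical.

```python
# One buffer, shared by both views (sample bytes, not a real directory).
buffer = b"\x00\x02ls"

hex_map = ["%02x" % v for v in range(256)]
text_map = [chr(v) if 0x20 <= v <= 0x7E else "." for v in range(256)]

# The "binary" window: every byte as two hex digits, separated by spaces.
hex_view = " ".join(hex_map[v] for v in buffer)

# The "normal" window: the same bytes shown as text.
text_view = "".join(text_map[v] for v in buffer)
```

The hex view makes the binary fields readable, while the text view keeps the filename legible; neither view changes the buffer.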

Each window and buffer has a character map associated with it. By default buffers have no character map
