Terminal Programming with Python series 2: Terminal Output Sequences
Introduction
A student new to systems programming will eventually stumble upon a strange method for printing color in a terminal. They might see something like the Arch Linux Color Bash Prompt guide, which introduces a table of shell variables that look something like a markup language for color:
magenta = '\x1b[0;35m'
print(magenta + 'magenta is an original primary CGA color.')
In this article we will examine how the magic string \x1b[0;35m may be constructed with some determinism. Then we will read the source code of xterm(1) to learn how such sequences are interpreted. Finally, we will look at a class of sequences parsed by xterm(1) that elicit a response from your terminal emulator.
Capabilities
If we review the manual for terminfo(5):
$ man 5 terminfo
We discover a database of terminal capabilities that allows us to construct these special sequence strings. If we dig deeper, we will find a parameterized language for describing terminal capabilities.
The termcap.src file is authored in a special language that associates each terminal, identified by its TERM environment value, with its terminal capability strings.
With the curses module of Python, we can access the C library routines for the capabilities database defined by terminfo(5):
import curses
curses.setupterm()
cyan = curses.tparm(curses.tigetstr('setaf'), curses.COLOR_CYAN).decode()
print(cyan + 'cyan is a primary CGA color.')
The blessed library provides a much simpler interface:
import blessed
term = blessed.Terminal()
print(term.red + 'red was not introduced until EGA.')
Although you are welcome to print raw strings directly to the user's terminal, as often recommended by introductory guides, using the terminfo(5) database ensures the correct sequences for the given user's TERM environment value are used. It also allows the Operating System to maintain terminal support independently of your software.
Rendering
The ASCII control character ESCAPE (\x1b) begins a detour to a special processing routine when received by a terminal emulator. The string phrase '\x1b[1;35m' is meaningful as a kind of markup language.
We can read this like a reverse Polish notation parser: the arguments are placed onto the stack first, then the function that consumes them is called:
- \x1b[ is the Control Sequence Introducer (CSI),
- followed by parameters 1;35 (Bold, Magenta),
- with final function m for Select Graphic Rendition (SGR).
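To make this reading concrete, here is a minimal Python sketch that splits a sequence into its parameters and final function. The names CSI_PATTERN, SGR_NAMES and parse_csi are our own for illustration, and the pattern only covers the simple numeric form shown here, not everything a real terminal accepts:

```python
import re

# ESC [ , then parameters made of digits and ';', then one final byte.
CSI_PATTERN = re.compile(r'\x1b\[([0-9;]*)([@-~])')

# A tiny, illustrative subset of SGR parameter meanings.
SGR_NAMES = {0: 'normal', 1: 'bold', 31: 'red', 35: 'magenta'}


def parse_csi(sequence):
    """Split a simple CSI sequence into (parameters, final_byte)."""
    match = CSI_PATTERN.fullmatch(sequence)
    if match is None:
        raise ValueError('not a simple CSI sequence: %r' % (sequence,))
    # An omitted parameter is treated as 0 in this sketch.
    params = [int(p) if p else 0 for p in match.group(1).split(';')]
    return params, match.group(2)


params, final = parse_csi('\x1b[1;35m')
print(params, final)  # [1, 35] m
print([SGR_NAMES.get(p, '?') for p in params])  # ['bold', 'magenta']
```

The parameters arrive first and the final byte m decides what is done with them, which is why the sequence reads like a postfix expression.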
xterm
Most modern terminal emulators export the environment value TERM=xterm, even though their parsers are not fully compatible. This makes the behavior and code of xterm(1) the principal reference for what is correct.
Within a 2,740-line function, doparsing(), we find the application of the color red:
case 31:
(...)
case 37:
    if_OPT_ISO_COLORS(screen, {
        xw->sgr_foreground = (op - 30);
        xw->sgr_extended = False;
        setExtendedFG(xw);
    });
    break;
We can see a fall-through switch statement that handles the numeric parameters 31 through 37 in one place, setting the foreground color to the parameter minus 30. Similar code can be found in Microsoft's upcoming win32 OpenSSH client.
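The same dispatch can be sketched in Python. This is not a translation of xterm's code, only an illustration of the idea; apply_sgr, COLOR_NAMES and the state dictionary are hypothetical names, and only a handful of parameters are handled:

```python
# The eight base colors, in the order of SGR parameters 30 through 37.
COLOR_NAMES = ['black', 'red', 'green', 'yellow',
               'blue', 'magenta', 'cyan', 'white']


def apply_sgr(state, op):
    """Return a new rendition state after applying one SGR parameter."""
    if op == 0:
        # Parameter 0 resets all attributes.
        return {'bold': False, 'foreground': None}
    if op == 1:
        return dict(state, bold=True)
    if 30 <= op <= 37:
        # As in the C excerpt: the color index is the parameter minus 30.
        return dict(state, foreground=op - 30)
    # Anything else is ignored in this sketch.
    return state


state = {'bold': False, 'foreground': None}
for op in (1, 35):
    state = apply_sgr(state, op)
print(state['bold'], COLOR_NAMES[state['foreground']])  # True magenta
```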
Interesting and Strange
Now that we have clearly defined the markup language and its acting parser, we have time to discover some interesting sequences we may not have seen before. Some strings, such as the DEC tube alignment test, have no capability name in the terminfo(5) database. In such cases, it is necessary to print these sequences directly.
The DEC tube alignment test sequence fills the entire screen with the letter E, a sort of inverse clear screen:
print('\x1b#8')
We also find ways to manipulate our character set, making our output text incomprehensible -- put this in your co-worker's .profile for a holiday laugh:
printf "\x1b(0\x1b)B"
Which reads,
- Designate G0 Character Set as DEC Special Character and Line Drawing,
- Designate G1 Character Set as US-ASCII.
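The line drawing character set also has a friendlier use. As a small sketch, the following draws a box by switching G0 to DEC Special Graphics, printing ordinary letters that the terminal renders as line segments, and then switching G0 back to US-ASCII with ESC ( B. The box function and the constant names are our own, and the result depends on your emulator supporting this character set:

```python
SELECT_LINE_DRAWING = '\x1b(0'  # designate G0 as DEC Special Graphics
SELECT_ASCII = '\x1b(B'         # designate G0 as US-ASCII again


def box(width, height):
    """Return a string that draws a width x height box with line drawing."""
    # In DEC Special Graphics, l k m j are the four corners,
    # q is a horizontal line and x is a vertical line.
    top = 'l' + 'q' * (width - 2) + 'k'
    middle = 'x' + ' ' * (width - 2) + 'x'
    bottom = 'm' + 'q' * (width - 2) + 'j'
    rows = [top] + [middle] * (height - 2) + [bottom]
    return SELECT_LINE_DRAWING + '\n'.join(rows) + SELECT_ASCII


print(box(10, 4))
```

Restoring US-ASCII at the end of the string matters: without it, everything printed afterwards would be drawn from the line drawing set as well.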
You may have noticed that a similar problem occurs as a byproduct of accidentally outputting a binary file directly to the terminal. The reset(1) command may be executed to reset your terminal. Or you may simply emit the sequence ESC c to correct your terminal:
printf "\x1bc"
There are several more interesting sequences. The blessed library provides access to many of the common state-changing sequences as context managers:
- hidden_cursor: hides cursor, restoring visibility on exit.
- location: Temporarily move the cursor, restoring original position on exit.
- fullscreen: Switch to secondary screen, restoring primary screen on exit.
- keypad: Enable directional keypad input.
The reader is encouraged to investigate the source code of their preferred terminal emulator and try some of the more interesting capabilities found there.
Reactor
Applications may write hidden messages that change the state of your terminal, but they may also request your terminal emulator to write hidden messages in return!
Let's try one, Report Cursor Position:
$ printf "\x1b[6n"; read input
$ set | grep ^input
input=$'\E[38;1R'
This is a feature of the blessed library:
import blessed
term = blessed.Terminal()
print(term.get_location())
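If you would rather see what such a response contains, here is a minimal sketch that parses the raw reply by hand. The report has the form ESC [ row ; column R, with both values counted from 1. The names CPR_PATTERN and parse_cursor_report are our own, and this is not how blessed implements it:

```python
import re

# Cursor Position Report: ESC [ <row> ; <column> R
CPR_PATTERN = re.compile(r'\x1b\[(\d+);(\d+)R')


def parse_cursor_report(response):
    """Return (row, column) from a cursor position report, or None."""
    match = CPR_PATTERN.search(response)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_cursor_report('\x1b[38;1R'))  # (38, 1)
```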
There are other sequences that cause a terminal emulator to write a response. Some terminals, such as PuTTY, respond to the raw control character ^E (\x05) with a terminal identifier.
Espionage
Through this in-band control channel we can elicit a variety of details about the client, and we can temporarily disable echo to ensure the exchange is hidden and collected without the user's knowledge.
Combined with a protocol such as ssh or telnet, we can produce a fingerprint and guess the client's operating system with very high confidence.
Furthermore, we can deduce the round trip time to the distant end's emulator, allowing us to estimate actual time of transmission and receipt of I/O, an important factor in providing responsive interfaces.
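As a rough sketch of the timing idea, the function below sends a query and measures how long the complete reply takes to come back. The send and receive callables stand in for whatever transport you have, such as a socket or a pty; their names, and measure_round_trip itself, are assumptions made for illustration. The demonstration uses a canned reply instead of a real terminal:

```python
import time


def measure_round_trip(send, receive, query=b'\x1b[6n', final=b'R'):
    """Send a query and time how long the complete reply takes to arrive."""
    started = time.monotonic()
    send(query)
    reply = b''
    while not reply.endswith(final):
        chunk = receive()
        if not chunk:
            break  # the connection closed before the reply completed
        reply += chunk
    return time.monotonic() - started, reply


# Stand-in for a real connection: a canned reply delivered in two chunks.
chunks = [b'\x1b[38', b';1R']
sent = []
elapsed, reply = measure_round_trip(sent.append, lambda: chunks.pop(0))
print(reply, elapsed >= 0)
```

A real server would also need a timeout, since a client that never answers would otherwise block the loop forever.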
whatis.telnet.org
An upcoming project will be an interactive, fingerprinting telnet server. It will produce a private report of all of the details it was able to retrieve, and it will be hosted at the telnet address whatis.telnet.org.