Terminal Programming with Python series 1: Automation and pty(4)
Introduction
Any command-line UNIX interface may be automated.
This article will demonstrate the use of pseudo-terminals, which cause programs to believe they are attached to a terminal, even when they are not!
At first, fooling programs into believing they are attached to a terminal may not seem useful, but the technique is used in a wide variety of software. It is indispensable in the automation and testing fields.
The case of color ls(1)
The command ls -G displays files in color on OS X and FreeBSD, but only when standard output is attached to a terminal. When we run it through the subprocess module, we see none of these colors:
    import subprocess
    print(subprocess.check_output(['ls', '-G', '/dev']))
Even with an explicit -G parameter, the output of this program is colorless. This quick example shows that some programs behave differently when attached to a terminal.
Interactive
Furthermore, some programs are only interactive when attached to a terminal. The python executable is an example of this. When we run python directly from a terminal, we receive an interactive REPL:
    $ python
    Python 3.5.0 (default, Oct 28 2015, 21:00:27)
    [GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.1.76)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> print(4+4)
    8
    >>> exit()
If we instead pipe these commands to standard input, Python displays neither the banner nor the prompts, as demonstrated here using the standard shell:
    $ printf 'print(2+2)\nexit()' | python
    4
And, strangely enough, executing Python from Python using the subprocess module produces the same output:
    import subprocess, sys
    python = subprocess.Popen(
        sys.executable,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE)
    print(python.communicate(input=b"print(2+2)\nexit()"))

    (b'4\n', b'')
With a keyboard attached, a terminal may be expected to provide input at any indeterminate future time. Programs such as python test whether any of the standard file descriptors (stdin, stdout, stderr) are attached to a terminal, and offer interactive behaviour only when they are.
We can easily reproduce this conditional check of isatty(3) from the shell:
    $ python -c 'import sys,os;print(os.isatty(sys.stdin.fileno()))'
    True
    $ echo | python -c 'import sys,os;print(os.isatty(sys.stdin.fileno()))'
    False
In the second command stdin is a pipe, so the isatty(3) test fails.
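The same comparison can be made from Python. The following is a minimal sketch, assuming a POSIX system: it runs the one-line check in a child process twice, first with a pipe as stdin and then with the slave side of a pseudo-terminal from the standard pty module, which the next section introduces. The names CHECK, child_sees_tty and compare are ours, chosen for illustration only.

```python
import os
import pty
import subprocess
import sys

# One-liner run in the child: report whether its stdin is a terminal.
CHECK = "import os, sys; print(os.isatty(sys.stdin.fileno()))"


def child_sees_tty(stdin):
    """Run CHECK in a child Python with the given stdin; return what it prints."""
    result = subprocess.run(
        [sys.executable, "-c", CHECK],
        stdin=stdin,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode().strip()


def compare():
    """Return the child's answer for a piped stdin and for a pty stdin."""
    # Case 1: stdin is a pipe, as with `echo | python ...` in the shell.
    piped = child_sees_tty(subprocess.PIPE)

    # Case 2: stdin is the slave side of a freshly allocated pseudo-terminal.
    master_fd, slave_fd = pty.openpty()
    try:
        on_pty = child_sees_tty(slave_fd)
    finally:
        os.close(slave_fd)
        os.close(master_fd)
    return piped, on_pty


if __name__ == "__main__":
    print(compare())
```

The child's stdout is a pipe in both cases, so only the nature of stdin changes between the two runs.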
Cheating isatty(3)
The remainder of this article will focus on tricking isatty(3) into returning True even when the standard descriptors are not actually attached to a real terminal. The trick begins with a call to pty.fork, a function in the Python standard library. It behaves exactly as os.fork, except that a pseudo-terminal (pty(4)) is wedged between the child and parent process.
Why is this useful? A few programs that make use of pty(4) and fork(2) illustrate the answer:
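As a minimal sketch of the idea, assuming a POSIX system, the following runs a command in a child created by pty.fork and collects whatever the child writes to its terminal. The helper name run_on_pty and the CHECK one-liner are hypothetical, introduced here only for illustration.

```python
import os
import pty
import sys

# One-liner run in the child: report whether its stdin is a terminal.
CHECK = "import os, sys; print(os.isatty(sys.stdin.fileno()))"


def run_on_pty(argv):
    """Run argv as a child attached to a new pty; return all bytes it wrote."""
    pid, master_fd = pty.fork()
    if pid == 0:
        # Child: stdin, stdout and stderr are now the slave side of the pty.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed

    # Parent: everything the child writes can be read from master_fd.
    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            break  # Linux raises EIO once the child side is closed
        if not data:
            break  # other systems report the same condition as end-of-file
        chunks.append(data)
    os.close(master_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    print(run_on_pty([sys.executable, "-c", CHECK]))
```

The child reports True here, even though the parent is reading its output programmatically. Note that the terminal normally translates each newline written by the child into a carriage return and newline pair, so the bytes read by the parent end in \r\n.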
- tmux(1) and screen(1) make use of pty(4) to perform their magic: the real terminal may leave (detach), while the child continues to believe it is connected with a terminal.
- script(1) records interactive sessions, ensuring all terminal sequences are written to file typescript for analysis.
- ttyrec(1) records sessions like script(1), but with timing information. This is the driving technology behind https://asciinema.org/ for example.
- IPython notebook executes programs through a pty(4) for color output.
- Travis CI uses a pty(4) so test runners produce colorized output.
Finally, the traditional Unix expect(1) by Don Libes uses a pty(4) to allow "programmed dialogue with interactive programs". The remainder of this article will use pexpect, a variant of expect(1) authored by Noah Spurrier.
The rainmaker
The telnet host rainmaker.wunderground.com offers weather reports and various other data, looked up by major U.S. airport code. We can use telnet(1) and summarize our session as follows:
- send return
- send sjc (airport code) and return
- send return
- send X and return
Using pipes, we could script this with timed input alone, allowing sufficient time to elapse for each prompt to appear:
    (sleep 2
     echo
     sleep 1
     echo sjc
     sleep 1
     echo
     sleep 1
     echo X
    ) | telnet rainmaker.wunderground.com
By using pexpect to wait for a prompt before sending our input, we see a marked improvement in efficiency and fault tolerance. Our script would then read as follows:
    import pexpect

    def main(airport_code):
        output = ''
        telnet = pexpect.spawn('telnet rainmaker.wunderground.com',
                               encoding='latin1', timeout=4)
        telnet.expect('Press Return to continue:')
        telnet.sendline('')
        telnet.expect('enter 3 letter forecast city code')
        telnet.sendline(airport_code)
        while telnet.expect(['X to exit:',
                             'Press Return for menu:',
                             'Selection:']) != 2:
            output += telnet.before
            telnet.sendline('')
        output += telnet.before
        telnet.sendline('X')
        telnet.expect(pexpect.EOF)
        telnet.close()
        print(output.strip())

    if __name__ == '__main__':
        import sys
        main(airport_code=sys.argv[1])
Closing thoughts
A REPL is a particularly interesting target. The SageMath project uses pexpect to bundle a great variety of mathematics software by driving each program's REPL interface, bypassing the need to link with software written in other programming languages.
Software and language suites providing a shell or REPL may be functionally tested using pexpect, and this is where the library serves its purpose best. We can now write automated tests for the python interactive shell, for example.
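To make the idea concrete without any third-party dependency, here is a minimal sketch of such a test built on the standard pty module alone, assuming a POSIX system. pexpect wraps this wait-for-prompt pattern in a far more complete interface. The helpers read_until and ask_repl are hypothetical names of ours. As further assumptions, the child is started with -q and -S and with PYTHON_BASIC_REPL set, so that it prints no banner, loads no readline support and, on newer Python versions, uses the plain prompt rather than the full-screen REPL.

```python
import os
import pty
import select
import sys
import time


def read_until(fd, marker, timeout=10.0):
    """Read from fd until marker appears; return everything read so far."""
    deadline = time.monotonic() + timeout
    buf = b""
    while marker not in buf:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("timed out waiting for %r" % marker)
        ready, _, _ = select.select([fd], [], [], remaining)
        if ready:
            data = os.read(fd, 1024)
            if not data:
                raise EOFError("child closed the terminal")
            buf += data
    return buf


def ask_repl(statement):
    """Run one statement in an interactive Python on a pty; return its output."""
    # Assumption: these settings keep the prompt plain and predictable.
    env = dict(os.environ, TERM="dumb", PYTHON_BASIC_REPL="1")
    pid, master_fd = pty.fork()
    if pid == 0:
        # Child: becomes an interactive interpreter attached to the pty.
        try:
            os.execve(sys.executable, [sys.executable, "-q", "-S"], env)
        finally:
            os._exit(127)  # only reached if exec failed
    try:
        read_until(master_fd, b">>> ")          # wait for the first prompt
        os.write(master_fd, statement.encode() + b"\n")
        reply = read_until(master_fd, b">>> ")  # output ends at the next prompt
        os.write(master_fd, b"exit()\n")
    finally:
        os.close(master_fd)  # hanging up the terminal also ends the child
        os.waitpid(pid, 0)
    # The reply includes the terminal's echo of the statement itself.
    return reply.decode(errors="replace")


if __name__ == "__main__":
    print(ask_repl('print("result=%d" % (6 * 7))'))
```

Unlike the timed-input approach, nothing here sleeps for a fixed interval: each step proceeds as soon as the prompt appears and fails with a timeout if it never does.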
In many industries where technology systems migrate slowly, it may become very useful to automate commercial or black-box software systems that provide only a shell, such as mainframes or embedded control devices. With the technique of terminal automation, we may now provide a sensible REST API to such legacy systems!