Due: Tuesday, April, 30th, 2002
Read Ken Thompson's article ...
char s[] = { '\t', '0', '\n', '}', ';", '\n', '\n', '/', '*', '\n', (213 lines deleted) 0 }; /* The string s is a * representation of the body * of this program from '0' * to the end. */ main(){ int i; printf("char\ts{ } = {\n"); for(i=0; s[i]; i++) printf("\t%d,\n",s[i]); printf("%s",s); }
Compile and run this program. You will have to figure out what goes in the 213 deleted lines. If you want, you can leave the comment out of the string s and out of the program. Explain the output. By itself, this program does not pose any sort of security threat. Explain in your own words what this program has to do with Thompson's Trojan Horse attack.
A vulnerability in the Bourne Shell, discovered in late 2000, allowed arbitrary writing to files. The problem involves insecure creation of files in the /tmp directory. When using redirection, files are created in the /tmp directory without first checking for existence of the file. This enables a symbolic link attack that may corrupt any file that is writable by the owner of the redirecting shell. Here is some example code illustrating the problem. (Redirection <<word causes a command's standard input to come from.standard input, until word appears on a line by itself.)
/tmp# echo 'hello world' > rootfile /tmp# chmod 600 rootfile /tmp# ln -s rootfile sh$$ /tmp# chown -h 666.666 sh$$ /tmp# ls -l rootfile sh$$ -rw------- 1 root root 12 Oct 29 03:55 rootfile lrwxrwxrwx 1 666 666 8 Oct 29 03:56 sh12660 -> rootfile /tmp# cat <<BAR ? FOO ? BAR FOO o world /tmp# ls -l rootfile sh$$ /bin/ls: sh12660: No such file or directory -rw------- 1 root root 12 Oct 29 03:56 rootfile /tmp# cat rootfile FOO o world /tmp#
Explain the contents and relation between the two filenames used before the cat command.
Explain what cat << BAR seems to be doing and say why this is a problem.
Perl is a programming language that was designed for scanning text files and extracting information from them. Perl is often used to write CGI scripts, which may run in privileged mode. For this reason, Perl programmers may be concerned that tricky text input might cause their program to make undesirable system calls.
The Perl implementation performs a set of security checks when it is run in taint mode. Taint checks are designed to make sure that arguments to system calls are not controlled by user input in certain ways. More specifically, command line arguments, environment variables, results of certain system calls, and all file input are marked as ``tainted''. Tainted data may not be used directly or indirectly in any command that invokes a sub-shell, nor in any command that modifies files, directories, or processes. For example, if system(e) causes Perl to pass the string argument e to a shell for parsing and execution, then e must not be tainted.
In general, any variable set to a value derived from tainted data will be considered tainted. However, untainted information can be extracted from a tainted variable using string matching operations. Intuitively, the reason for this is that the programmer is assumed to use string-matching patterns that will avoid security problems. For example, if a Perl command will write to a file, then string matching can be used to make sure that only characters appear in the user-supplied filename. This allows the programmer to protect against embedded shell commands and other security problems that a malicious user might try to place inside a file name.
This problem asks some simple questions about tainting for an example expression language (not Perl, in case you don't know Perl). The expressions are given by the grammar
e ::= var | cst | concat}(e,e) | match}(e,e) var ::= x | y | z | ... cst ::= "symbols" symbols ::= e | a symbols | b symbols | c symbols | ...
In words, an expression is a variable, constant, or expression formed using one of the two string functions,
concat or match. A variable is a single letter like x, y, or z, and a constant is a sequence of symbols enclosed in double quotes.
A string of symbols is either empty (indicated by e in
this grammar) or a symbol followed by a string of symbols.
While only letters a, b, c, ... are listed here, assume that strings can also contain other symbols such as ``.''
and ``*''.
The value of an expression is always a string, but it may be the empty string. The value of concat(e_1,e_2) is the concatenation of the two strings, and the value of match(e_1,e_2) is the substring of e_2 that matches the pattern e_1. When string e_1 is regarded as a pattern, letters match themselves, and other symbols may have special meanings. In particular, "." matches any character and "*" means zero or more occurrences, so "a*" matches any string consisting whose only symbol is a.
The tainting algorithm is an alternative form of evaluation that gives each expression the value taint or untaint.
A variable may be tainted or untainted (depending on how it was set), but a constant is always untainted. The reason
for considering every constant untainted s that a constant is written as part of the program,
and therefore does not come from user input. Since Perl tainting assumes that the programmer writes programs carefully, string
constants written by the programmer are considered untainted. A concatenation involving a tainted
string is tainted, but matching an untainted pattern against any expression gives an untainted value.