Homework 1

CS155, Spring 2002

Due: Tuesday, April, 30th, 2002


Problem 1

Read Ken Thompson's article ...

char s[] = {
	'\t',
	'0',
	'\n',
	'}',
	';",
	'\n',
	'\n',
	'/',
	'*',
	'\n',
	(213 lines deleted)
	0
};

/* The string s is a 
* representation of the body
* of this program from '0'
* to the end.
*/

main(){
   int i;
   printf("char\ts{ } = {\n");
   for(i=0; s[i]; i++)
      printf("\t%d,\n",s[i]);
   printf("%s",s);
}

Compile and run this program. You will have to figure out what goes in the 213 deleted lines. If you want, you can leave the comment out of the string s and out of the program. Explain the output. By itself, this program does not pose any sort of security threat. Explain in your own words what this program has to do with Thompson's Trojan Horse attack. 


Problem 2

A vulnerability in the Bourne Shell, discovered in late 2000, allowed arbitrary writing to files. The problem involves  insecure creation of files in the /tmp directory. When using redirection, files are created in the /tmp directory without first checking for existence of the file. This enables a symbolic link attack that may corrupt any file that is writable by the owner of the redirecting shell. Here is some example code illustrating the problem. (Redirection <<word causes a command's standard input to come from.standard input,  until word appears on a line by itself.)

/tmp# echo 'hello world' > rootfile
/tmp# chmod 600 rootfile
/tmp# ln -s rootfile sh$$
/tmp# chown -h 666.666 sh$$
/tmp# ls -l rootfile sh$$
-rw------- 1 root root 12 Oct 29 03:55 rootfile
lrwxrwxrwx 1 666 666 8 Oct 29 03:56 sh12660 -> rootfile
/tmp# cat <<BAR
? FOO
? BAR
FOO
o world
/tmp# ls -l rootfile sh$$
/bin/ls: sh12660: No such file or directory
-rw------- 1 root root 12 Oct 29 03:56 rootfile
/tmp# cat rootfile
FOO
o world
/tmp#

Explain the contents and relation between the two filenames used before the cat command.

Explain what cat << BAR seems to be doing and say why this is a problem.


Problem 3

Perl is a programming language that was designed for scanning text files and extracting information from them. Perl is often used to write CGI scripts, which may run in privileged mode. For this reason, Perl programmers may be concerned that tricky text input might cause their program to make undesirable system calls.

The Perl implementation performs a set of security checks when it is run in taint mode. Taint checks are designed to make sure that arguments to system calls are not controlled by user input in certain ways. More specifically, command line arguments, environment variables, results of certain system calls, and all file input are marked as ``tainted''. Tainted data may not be used directly or indirectly in any command that invokes a sub-shell, nor in any command that modifies files, directories, or processes. For example, if   system(e)  causes Perl to pass the string argument e to a shell for parsing and execution, then e must not be tainted.

In general, any variable set to a value derived from tainted data will be considered tainted. However, untainted information can be extracted from a tainted variable using string matching operations. Intuitively, the reason for this is that the programmer is assumed to use string-matching patterns that will avoid security problems. For example, if a Perl command will write to a file, then string matching can be used to make sure that only characters appear in the user-supplied filename. This allows the programmer to protect against embedded shell commands and other security problems that a malicious user might try to place inside a file name.

This problem asks some simple questions about tainting for an example expression language (not Perl, in case you don't know Perl). The expressions are given by the grammar

           e ::= var | cst | concat}(e,e) |  match}(e,e)
         var ::=  x |  y | z | ...
        cst  ::= "symbols"
     symbols ::=  e |  a symbols |  b symbols  |  c symbols |  ...

In words, an expression is a variable, constant, or expression formed using one of the two string functions, concat or match. A variable is a single letter like x, y, or z, and a constant is a sequence of symbols enclosed in double quotes. A string of symbols is either empty (indicated by e in this grammar) or a symbol followed by a string of symbols.
While only letters a, b, c, ... are listed here, assume that strings can also contain other symbols such as ``.'' and ``*''.

The value of an expression is always a string, but it may be the empty string. The value of concat(e_1,e_2) is the concatenation of the two strings, and the value of match(e_1,e_2) is the substring of e_2 that matches the pattern e_1. When string e_1 is regarded as a pattern, letters match themselves, and other symbols may have special meanings. In particular, "." matches any character and "*" means zero or more occurrences, so "a*" matches any string consisting whose only symbol is a.

The tainting algorithm is an  alternative  form of evaluation that gives each expression the value taint or untaint.

A variable may be tainted or untainted (depending on how it was set), but a constant is always untainted. The reason for considering every constant untainted s that a constant is written as part of the program, and therefore does not come from user input. Since Perl tainting assumes that the programmer writes programs carefully, string constants written by the programmer are considered untainted. A concatenation involving a tainted string is tainted, but matching an untainted pattern against any expression gives an untainted value.

  1. Assume that x and y are tainted and z is untainted. Which of these expressions are tainted: concat(x,y), concat(x,z), match(x,y), match(x,z), match(z,x) ?
  2. A characteristic of Perl tainting is that any expression can be converted to an untainted expression. Write an expression containing e that will have the same string value as e, but will always be untainted, regardless of whether e is tainted.
  3. Do you think the property of tainting explained in part b is a feature or a bug? Explain briefly.
  4. We can define a more conservative tainting method by distinguishing between constant e_1 and untainted e_1 in match(e_1,e_2).  Suppose we redefine tainting for match to (i) allow a match against a constant pattern c to be untainted if c is not your answer to part b, and (ii) make a match of a non-constant pattern against a tainted string tainted. Can you still convert any tainted string into an untainted one?

Problem 4

Recall that StackGuard defends against buffer overflow by placing a canary at the top of every stack frame. Whenever a function returns the integrity of the canary is verified and the function is allowed to return only if the canary is intact. Give sample C code that is vulnerable to a buffer overflow attack that can modify the return address on the stack without changing the canary.
Hint: think of a buffer overflow that results in a change to a pointer located in the current stack frame.

Problem 5

Recall that modern processors have an on chip data cache (called an L2 cache) used to reduce the number of accesses to main memory. Reading from the L2 cache is about 10 times faster than reading from main memory. Explain how the L2 processor cache can result in a covert channel on the local system. Give a sample short (pseudo) code to exploit this channel.
Hint: use memory access timing information.