
1. #!

1.1 Introduction

Calling a function of the exec family causes a file to be executed as a program. There are two or three separate cases.

If the file is recognized as being in some well-known binary format, the corresponding loaders and linkers are invoked, an executable process image is constructed, and that image is then started.

If the file starts with the two bytes #!, then the program whose name follows the #! is started, with the name of the current file as an argument. This enables one to use interpreted scripts on an equal footing with compiled programs.

If none of these is the case then some silly systems feed the file to /bin/sh. This usually leads to screenfuls of error messages and users praying that in this megabyte of binary garbage no valid shell commands occur. Quite often the terminal is left in some obscure state.
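The three cases can be sketched as a check on the first bytes of the file. This is only an illustration: the names exec_kind and classify_header are hypothetical, and ELF is used here as a single stand-in for "some well-known binary format"; a real kernel knows several formats and does far more than look at a magic number.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for the three cases described above. */
enum exec_kind { EXEC_BINARY, EXEC_SCRIPT, EXEC_UNKNOWN };

/* Classify a file by its first bytes.
   Assumption: only the ELF magic is checked as an example binary format. */
enum exec_kind classify_header(const unsigned char *buf, size_t len)
{
        static const unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };

        assert(buf != NULL || len == 0);
        if (len >= 4 && memcmp(buf, elf_magic, 4) == 0)
                return EXEC_BINARY;
        if (len >= 2 && buf[0] == '#' && buf[1] == '!')
                return EXEC_SCRIPT;
        return EXEC_UNKNOWN;    /* neither: some systems hand this to /bin/sh */
}
```

A caller would read the first few bytes of the file and pass them in; what happens in the EXEC_UNKNOWN case is system dependent.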

1.2 Generic example

On most Unix-like systems, given a script

        #! /path/interpreter -a -b
        stuff

with pathname /scriptpath/script, the invocation

        execl("/scriptpath/script", "args0", "args1", "args2", (char *) 0);

is roughly equivalent to an invocation

        execl("/path/interpreter", "/path/interpreter", "-a -b",
              "/scriptpath/script", "args1", "args2", (char *) 0);

1.3 The parameters of the interpreter invocation

The interpreter is called with a parameter list consisting of four groups of arguments: arg0, argi, argn, args.

The first group, arg0, consists of one argument. For SysVR4, SunOS, Solaris, IRIX, BSDI, BSD/OS, OpenBSD, DU, Unixware, Linux 2.4, FreeBSD this argument is /path/interpreter. For Tru64 (4.0), AIX (4.3, 5.1), Linux 2.2, MacOS X this argument is interpreter. For HP-UX this argument is /scriptpath/script.

The second group, argi, consists of the 0 or 1 or perhaps more arguments to the interpreter found in the #! line. Thus, this group is empty if there is no nonblank text following the interpreter name in the #! line. If there is such nonblank text then for SysVR4, SunOS, Solaris, IRIX, HP-UX, AIX, Unixware, Linux, OpenBSD, NetBSD, Tru64, FreeBSD 6.0 this group consists of precisely one argument, as in the example above where argi consists of the single argument "-a -b". FreeBSD before 6.0, BSD/OS, BSDI, Minix split the text following the interpreter name into zero or more arguments, and hence have an argi consisting of the two arguments "-a", "-b" in the above example. (FreeBSD 4.0 introduced and FreeBSD 6.0 removed the treatment of '#' in arguments as comment.) (Also Plan9 allows several arguments here.) Solaris and Unixware split at spaces, but take at most 1 argument, thus have argi equal to "-a" in the above example.
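The three behaviours for argi described here can be compared side by side with a small sketch. The names argi_policy and build_argi are hypothetical, and the function is only a model of the observed results, not the code of any of the systems listed; it ignores details such as the '#' comment treatment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for the three observed behaviours. */
enum argi_policy {
        ARGI_SINGLE,            /* whole text is one argument */
        ARGI_SPLIT,             /* split at blanks into several arguments */
        ARGI_FIRST_ONLY         /* split at blanks, keep only the first word */
};

/* Build argi from the text following the interpreter name.
   The text is modified in place; out[] receives pointers into it.
   Returns the number of arguments stored (at most max). */
size_t build_argi(char *text, enum argi_policy policy, char **out, size_t max)
{
        size_t n = 0;
        char *p = text;

        assert(text != NULL && out != NULL);
        while (*p == ' ' || *p == '\t')         /* skip leading blanks */
                p++;
        if (*p == '\0' || max == 0)
                return 0;                       /* argi is empty */
        if (policy == ARGI_SINGLE) {
                out[0] = p;
                return 1;
        }
        while (*p != '\0' && n < max) {
                out[n++] = p;
                while (*p != '\0' && *p != ' ' && *p != '\t')
                        p++;
                if (*p != '\0')
                        *p++ = '\0';            /* terminate this word */
                while (*p == ' ' || *p == '\t')
                        p++;
                if (policy == ARGI_FIRST_ONLY)
                        break;
        }
        return n;
}
```

With the text "-a -b" from the generic example, the three policies give "-a -b", then "-a" and "-b", then just "-a".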

The third group, argn, consists of one argument, the name of the script. According to the man pages, SysVR4 and Unixware have args0 here, but tests on Unixware 7.0.1 also show the name of the script.

The fourth group, args, consists of zero or more arguments, namely the arguments after args0 in the invocation of the script.
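Putting the four groups together, the parameter list can be assembled roughly as follows. This is a sketch under assumptions: build_interp_argv is a hypothetical helper, arg0 is taken to be the interpreter path as given, and argi is taken to be at most one argument, which matches only some of the systems listed above.

```c
#include <assert.h>
#include <stddef.h>

/* Assemble arg0, argi, argn, args into out[], followed by a NULL terminator.
   args[] is the argument list of the original invocation, args[0] being args0.
   Pass argi == NULL when the #! line has no interpreter argument.
   Returns the number of arguments stored, or 0 if out[] is too small. */
size_t build_interp_argv(const char *interp, const char *argi,
                         const char *script,
                         const char *const *args, size_t nargs,
                         const char **out, size_t max)
{
        size_t n = 0, i;
        size_t need = 1 + (argi != NULL) + 1 + (nargs > 1 ? nargs - 1 : 0) + 1;

        assert(interp != NULL && script != NULL && out != NULL);
        if (need > max)
                return 0;
        out[n++] = interp;              /* arg0 */
        if (argi != NULL)
                out[n++] = argi;        /* argi: zero or one argument here */
        out[n++] = script;              /* argn: the name of the script */
        for (i = 1; i < nargs; i++)     /* args: everything after args0 */
                out[n++] = args[i];
        out[n] = NULL;
        return n;
}
```

For the generic example this yields /path/interpreter, "-a -b", /scriptpath/script, args1, args2; note that args0 is dropped.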

1.4 The #! line

The syntax is as follows: #! followed by optional whitespace followed by the interpreter pathname optionally followed by (whitespace followed by interpreter arguments) followed by newline.
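The syntax can be followed step by step in a small parser. The struct shebang and the function parse_shebang are hypothetical names chosen for this sketch; it treats space and tab as whitespace and keeps the interpreter arguments as one unsplit string, without applying any length limit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: both pointers refer into the parsed line. */
struct shebang {
        char *interp;   /* interpreter pathname */
        char *args;     /* raw argument text, or NULL if there is none */
};

/* Parse a #! line in place. Returns 0 on success, -1 if the text does not
   start with #! or names no interpreter. */
int parse_shebang(char *line, struct shebang *sb)
{
        char *p;

        assert(line != NULL && sb != NULL);
        if (line[0] != '#' || line[1] != '!')
                return -1;
        line[strcspn(line, "\n")] = '\0';       /* the line ends at newline */
        p = line + 2;
        p += strspn(p, " \t");                  /* optional whitespace */
        if (*p == '\0')
                return -1;
        sb->interp = p;
        p += strcspn(p, " \t");                 /* end of interpreter pathname */
        if (*p == '\0') {
                sb->args = NULL;
                return 0;
        }
        *p++ = '\0';
        p += strspn(p, " \t");                  /* whitespace before arguments */
        sb->args = (*p != '\0') ? p : NULL;
        return 0;
}
```

Trailing whitespace is left in args here; whether it is deleted differs between systems.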

Some systems have a limit on the line length. For example, SunOS 4 truncates it after 32 bytes; HP-UX truncates it after 80 bytes; Linux truncates it after 127 bytes; AIX 5.1 truncates it after 255 bytes. BSD/OS 4.2 does not truncate. At least 128 bytes are accepted on Solaris, IRIX, AIX, OSF/1.

(AIX 4.3 refuses to execute the script if the total length of the #! line exceeds 255 bytes (and then has an exit status of 0 ...). FreeBSD (3.4, 4.2): As AIX 4.3, but with 64 bytes. FreeBSD (5.0): Fails with ENAMETOOLONG if the total length of the #! line exceeds 128 bytes.)

Many systems delete trailing whitespace from the #! line before parsing. (Linux, FreeBSD, BSD/OS, HP-UX, Solaris, SCO OpenServer 5, Unixware 7 delete trailing whitespace. NetBSD, OpenBSD, Tru64, IRIX, AIX leave trailing whitespace in the single argument. AIX (4.3, 5.1) and Tru64 (T4.0) convert a trailing tab to a space.)
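The deletion of trailing whitespace amounts to something like the following sketch. The name trim_trailing_blanks is hypothetical, and the assumption that only space and tab count as whitespace is mine; the systems listed may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Remove trailing spaces and tabs in place; returns the new length.
   A trailing '\r' is deliberately not treated as whitespace here. */
size_t trim_trailing_blanks(char *s)
{
        size_t len;

        assert(s != NULL);
        len = strlen(s);
        while (len > 0 && (s[len - 1] == ' ' || s[len - 1] == '\t'))
                s[--len] = '\0';
        return len;
}
```

On a system that does not trim, an argument text such as "-a " reaches the interpreter with the trailing space included.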

It is rumored that some systems only recognize an executable script when it starts with the four bytes `#! /', probably because the GNU autoconf manual says so, but it seems impossible to find confirmation for this rumor. See also this article by Sven Mascheck.

1.5 Remarks

Clearly, the syntax as sketched precludes an interpreter pathname with embedded whitespace. I do not know about systems that define an escape mechanism here.

Many Unix systems will consider a `\r' in a DOS-type line-ending (`\r\n') part of the interpreter arguments. (HP-UX, Solaris, UnixWare ignore trailing `\r'. AIX, BSD/OS, FreeBSD, Linux, OpenBSD, Tru64 do not.)
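Since the `\r' silently becomes part of the interpreter name or its argument on many systems, it can be worth checking a script for it before installing. A minimal sketch, with first_line_has_cr as a hypothetical helper name:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the first line of buf ends in "\r\n", 0 otherwise.
   buf is the start of the script as a NUL-terminated string. */
int first_line_has_cr(const char *buf)
{
        const char *nl;

        assert(buf != NULL);
        nl = strchr(buf, '\n');
        return nl != NULL && nl > buf && nl[-1] == '\r';
}
```

Only the first line matters for the #! mechanism; later lines with DOS-type endings are the interpreter's problem.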

The details on arg0 and argn vary in cases where a relative pathname is used. Usually argn is precisely the string that was used, but some systems make it "./script" in case "script" was used. Unixware 7 uses /dev/fd/N in case the script is suid and not invoked by root (in order to avoid the race between reading the name and opening the script). The Hurd uses /dev/fd/N in case no pathname of the script is known (for example, because it was execd using fexec()). Usually arg0 is precisely the string that was used, but some systems turn "/path/interpreter" into just "interpreter", while some other systems do just the reverse.

Some systems forbid interpreters that themselves are interpreted scripts.

Additions and corrections are welcome.

Andries Brouwer - aeb@cwi.nl

1.6 Testing

Consider the script

        #! bar -a -b
        date

named foo, and the C program bar.c

        #include <stdio.h>

        /* Print each argument in brackets, all on one line. */
        int main(int argc, char **argv) {
                int i;

                for (i = 0; i < argc; i++)
                        printf(" [%s]", argv[i]);
                printf("\n");
                return 0;
        }

compiled as bar. Do chmod +x foo. Now look at the output of the commands uname -a and foo x y z. Do the same with full pathnames.

Linux 2.0.34: [bar] [-a -b] [./foo] [x] [y] [z]
Linux 2.3.46: [bar] [-a -b] [./foo] [x] [y] [z]

1.7 Acknowledgements

A lot of useful information was provided by Colin Watson.

