SHEBANG!! (or #!)

Today I was looking for a way to make a Python script I wrote more portable. That was after I moved a script, written in the (so they say) incredibly portable Python language, from FreeBSD to Ubuntu, only to find out it didn't work. After a little searching, testing and swearing, I found the root of the problem: I was using the wrong path to the Python interpreter.

As most of you (I hope) already know, to execute a Python program (or Perl, Bash, or any other interpreted language) as a command, you have to make the file executable and put a line like this as its first line (in this example I'm using Bash, the most common one):

#!/bin/bash

Wait a moment, I've always used this without really understanding it. Today I'm going to study it a little.
#! is called a shebang (or, rarely, hashbang), and it's a way to specify the interpreter of a program. After this pair of characters you have to write the full path to the interpreter. The best part: you can pass arguments to the interpreter directly in the program's first line! (Many systems, Linux included, hand everything after the path to the interpreter as one single argument, so it is safest to stick to one.) Like this, to show Perl warnings (I copied it, I don't know how Perl works):

#!/usr/bin/perl -w

Yeah, that's cool! So I can write a Python script and execute it like a command just by putting #!/usr/bin/python at the beginning.

Well... yes, if you are in an Ubuntu environment. Unfortunately, the path isn't the same on every OS (FreeBSD, for example, normally installs Python under /usr/local/bin). Damn, it can even differ on the same OS for different users, depending on where the interpreter was installed! Another problem can be the interpreter version: a Python 2.7 script probably won't run with the 3.3 interpreter. What's worse (well, for my poor script at least), the two versions may well be called by the same name.
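One way to notice such a mismatch early is to let the script report which interpreter it is actually running under and refuse to continue on a version it wasn't written for. This is only a sketch: the helper name require_version and the minimum version (3, 3) are my own assumptions for illustration, not something taken from a real project.

```python
import sys


def require_version(minimum, current=None):
    """Return True if `current` (default: the running interpreter) is at least `minimum`.

    Both values are (major, minor) tuples; `require_version` is a hypothetical helper.
    """
    if current is None:
        current = sys.version_info[:2]
    return tuple(current) >= tuple(minimum)


if __name__ == "__main__":
    # sys.executable is the absolute path of the interpreter that was actually started.
    print("running under:", sys.executable)
    print("version:", ".".join(str(n) for n in sys.version_info[:3]))
    if not require_version((3, 3)):  # (3, 3) is an assumed minimum for this example
        sys.exit("this script needs Python 3.3 or newer")
```

A check like this only helps if the file can at least be parsed by the older interpreter, so it is a safety net and not a cure.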

So, as you can see, a script isn't really portable by default, even with a shebang as its first line. If you're lucky, it might work on other systems of the same (or a similar) distro and a not too distant version. Well, at least on the first try. So, what's the solution?

There's no real solution. It all comes down to managing your systems well. In my naive opinion, one good starting point is to use the same OS on as many machines as possible. Another is to limit the number of different OSs in use. A third is to keep the interpreter versions as uniform as possible across every server and client.
That done, there is a trick with the shebang that increases the portability of your (my) script/program. It involves the Unix command env.

This command can print the list of environment variables, or execute a command in an altered environment without modifying the current one. The part that matters here is that env finds the command by searching the directories listed in the PATH environment variable, so you only have to give it the command's name, not its full path.
The trick consists of calling env by its absolute path (the one shown in the example!!) followed by the name of the interpreter, like this:

#!/usr/bin/env <INTERPRETERNAME>

What is the advantage? Well, wherever the most common interpreters (python, perl, bash, sh, etc.) are installed, their directory will almost always be listed in the PATH environment variable, so env can find them. If it isn't, you can put a symlink to the interpreter in one of the PATH directories, which is much simpler than editing every script's first line.
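To see which interpreter env would pick on a given machine, the same kind of PATH search can be reproduced from Python with shutil.which from the standard library. This is a rough sketch, not a copy of what env does internally; the wrapper name find_interpreter is just something I made up for the example.

```python
import shutil


def find_interpreter(name, path=None):
    """Return the first executable called `name` found in `path`, or None.

    `path` is a string of directories separated by os.pathsep; when it is
    None, shutil.which falls back to the PATH environment variable.
    `find_interpreter` is a hypothetical wrapper used only for this example.
    """
    return shutil.which(name, path=path)


if __name__ == "__main__":
    # Print where each interpreter would be found, or None if it is not on PATH.
    for name in ("python3", "perl", "bash", "sh"):
        print(name, "->", find_interpreter(name))
```

If one of the names comes back as None, that is the case where a symlink in one of the PATH directories would help.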
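WARNING: this way you cannot reliably pass arguments to the real interpreter. On Linux everything after /usr/bin/env is handed over as one single argument, so with #!/usr/bin/env perl -w env looks for a program literally called "perl -w" and the script fails. Other systems split the line differently, so the behaviour isn't portable. Newer versions of env (GNU coreutils 8.30 and later, and FreeBSD) have an -S option to split the arguments, but you can't count on it being there.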
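The argument problem is easier to see with a small sketch that splits a shebang line into the interpreter path and at most one argument. It assumes the Linux behaviour (everything after the interpreter path is a single argument); other systems split the line differently, and the function name parse_shebang is my own invention.

```python
def parse_shebang(first_line):
    """Split a shebang line Linux-style: interpreter path plus at most ONE argument.

    Returns (interpreter, argument) or None if the line is not a shebang.
    `parse_shebang` is a hypothetical helper; it models Linux only.
    """
    if not first_line.startswith("#!"):
        return None
    rest = first_line[2:].strip()
    if not rest:
        return None
    parts = rest.split(None, 1)  # split on the first run of whitespace only
    interpreter = parts[0]
    argument = parts[1] if len(parts) > 1 else None
    return interpreter, argument


if __name__ == "__main__":
    for line in ("#!/bin/bash", "#!/usr/bin/perl -w", "#!/usr/bin/env perl -w"):
        print(line, "->", parse_shebang(line))
```

For the last line the single argument is the whole string perl -w, which is what env then tries to look up as a program name.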

But... what if env isn't at the absolute path indicated? After all, if the interpreter can be in a different directory on different distros, why can't the same be true for env?

You're absolutely right. Some niche systems keep env somewhere else. But all the main distros and Unix flavours and their derivatives (Debian, Fedora, CentOS, FreeBSD, RHEL, Solaris, AIX, SUSE, Gentoo, etc.) have env, or a symlink pointing to the right location, at that path. That covers the great majority of Linux and Unix environments. If you combine this with a really portable language (say Python or Perl) and the same interpreter version almost everywhere, you can use your scripts on every machine you manage with almost no effort.
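If you want to be sure about a specific machine before relying on the trick, a quick check is enough. The sketch below only tests that the path exists (following symlinks) and is executable by the current user; the function name env_is_available is an assumption of mine.

```python
import os


def env_is_available(path="/usr/bin/env"):
    """Return True if `path` is an existing file the current user may execute.

    os.path.isfile follows symlinks, so a symlink to the real env also counts.
    `env_is_available` is a hypothetical helper for this example.
    """
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    print("/usr/bin/env usable:", env_is_available())
```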

So, I personally prefer to use the env command, and I hope to remember this article the first time it doesn't work, so I can easily find a solution. But you should evaluate your own situation and pick a strategy that fits it.

REFERENCES
- Wiki Shebang - Link to the wiki page
- env Wiki page - Link to the wiki env page
- The #! magic, details about the shebang/hash-bang mechanism on various Unix flavours - an old but complete article about the shebang
- Stackexchange - Why is it better to use “#!/usr/bin/env NAME” instead of “#!/path/to/NAME” as my shebang? - An interesting discussion about whether or not to use env
- How does /usr/bin/env know which program to use? - Another interesting discussion
