Tuesday, 29 November 2022

Setting up Rainloop webmail on Linux

I recently discovered that Fastmail does support offline mail reading, without a $60 extra fee!

I have Dovecot/Postfix set up, so I decided to set up Rainloop to provide webmail.

Rainloop is written in PHP and uses IMAP as a back end (it requires no database); it does not have to run on the same host as the IMAP server.

Setup instructions on https://www.rainloop.net/ are woeful, so I figured I'd contribute some better docs to my future-self.

Debian package requirements

apt-get -y install wget php apache2 php7.4-curl php7.4-xml

Install latest version from repository.rainloop.net

mkdir -p /var/www/rainloop
cd /var/www/rainloop
wget -qO- https://repository.rainloop.net/installer.php | php 
chown -R www-data:www-data .
find . -type d -exec chmod 755 {} \;
find . -type f -exec chmod 644 {} \;

This gives you a bunch of scripts in /var/www/rainloop/rainloop/v/1.12.0/ and a data directory /var/www/rainloop/data. Apache/Nginx setup needs care not to expose /var/www/rainloop/data.
The most basic Apache config is as follows.

	<VirtualHost *:80>
		ServerName mail.myco.org
		ServerAdmin webmaster@myco.org
		DocumentRoot /var/www/rainloop
		ErrorLog /var/log/apache2/error.log
		CustomLog /var/log/apache2/access.log combined
	</VirtualHost>

If you are using Debian, create this text file as /etc/apache2/sites-available/rainloop.conf and symlink it into /etc/apache2/sites-enabled/ (a2ensite rainloop does the symlink for you).
Given that /data sits under the document root by default, you should probably remove directory indexing

rm /etc/apache2/mods-enabled/{autoindex.conf,autoindex.load}

and make sure that .htaccess files are observed (AllowOverride must allow it).

The ./data directory is set up when you first access the Rainloop admin GUI at http://myhost/?admin; the default user and password are admin / 12345.

Configuring the IMAP server is as simple as creating an .ini file in /var/www/rainloop/data/_data_/_default_/domains/

e.g. domains/myco.ini

imap_host = "mail.myco.org"
imap_port = 993
imap_secure = "SSL"
imap_short_login = On
smtp_host = "mail.myco.org"
smtp_port = 25
smtp_secure = "TLS"
smtp_short_login = On
smtp_auth = On
smtp_php_mail = Off
white_list = ""

Only a small amount of configuration is needed; the config values are generally self-explanatory.
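If you script your setup, the file can be generated with a heredoc; a sketch (mail.myco.org and the final location are placeholders, adjust for your install):

```shell
# Generate a Rainloop domain config; writes myco.ini to the current
# directory. Copy it into .../data/_data_/_default_/domains/ afterwards.
mail_host="mail.myco.org"   # assumption: your IMAP/SMTP server name
cat > myco.ini <<EOF
imap_host = "$mail_host"
imap_port = 993
imap_secure = "SSL"
imap_short_login = On
smtp_host = "$mail_host"
smtp_port = 25
smtp_secure = "TLS"
smtp_short_login = On
smtp_auth = On
smtp_php_mail = Off
white_list = ""
EOF
```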

The Rainloop administrator GUI defaults to open on the Internet with a well-known username and password!

Change the admin user with

app_ini=$root_fs/var/www/rainloop/data/_data_/_default_/configs/application.ini
   
sed -i -e 's/^admin_login = .*$/admin_login = "xxxadmin"/' $app_ini

Potentially change the URL too

sed -i -e 's/^admin_panel_key = .*$/admin_panel_key = "xxxadmin"/' $app_ini

If you don't need the admin UI, the whole thing can be disabled, which is probably safest.

sed -i -e 's/^allow_admin_panel = .*$/allow_admin_panel = Off/' $app_ini
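The three sed edits can be dry-run against a scratch copy first; the sample lines below mirror what I'd expect in a default application.ini (an assumption, check your file):

```shell
# Dry-run the admin hardening edits on a scratch copy of application.ini.
app_ini=$(mktemp)
cat > "$app_ini" <<'EOF'
admin_login = "admin"
admin_panel_key = "admin"
allow_admin_panel = On
EOF
sed -i -e 's/^admin_login = .*$/admin_login = "xxxadmin"/' "$app_ini"
sed -i -e 's/^admin_panel_key = .*$/admin_panel_key = "xxxadmin"/' "$app_ini"
sed -i -e 's/^allow_admin_panel = .*$/allow_admin_panel = Off/' "$app_ini"
cat "$app_ini"
```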

All that remains is to set up HTTPS in Apache. I have Nginx handling TLS and proxy_pass to Apache.

/var/www/rainloop/data/_data_/_default_/configs/application.ini has other options to play with, including options for company branding and themes.

Rainloop has a mobile friendly UI but does not give you off-line mail reading; I'm using K-9 Mail from F-Droid.

In summary: Dovecot + Postfix + Rainloop gives me a completely free, modern, and feature-rich email stack.

Sunday, 5 April 2020

Removing netplan from Ubuntu

For the last year or more I have been confused by a problem where LXC containers get two IP addresses assigned.

Naturally you want static IPs for server containers defined in lxc/$name/config

lxc.net.0.ipv4.address = 10.0.3.2/24
lxc.net.0.ipv4.gateway = 10.0.3.1

would get two IP addresses on boot: the one statically assigned by lxc and another, randomly assigned one.

I never really understood the root cause.

I fixed this by disabling files here and there, removing the IP config from lxc/$name/config, and then running, in /etc/rc.local:

ip addr add 10.0.3.2/24 scope global dev eth0
ip route add default via 10.0.3.1 dev eth0

Some of my containers use Ubuntu, most use lxinitd; I did not notice that the problem was limited to Ubuntu.

The above solution requires installing iproute2 i.e. /sbin/ip in containers, which, in turn, requires mounting /lib, /lib64 and /usr/lib, and means you cannot run a fully statically compiled container.

This fella has a much simpler solution

https://www.claudiokuenzler.com/blog/938/lxc-container-not-getting-ip-address-netplan

Netplan is the root cause of the problem, and it can be removed.

apt-get remove netplan.io

Now setting the IP address in lxc/$name/config works across all my containers.
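If you want to check a container for the double-address symptom, counting the IPv4 addresses on eth0 is enough; check_single_ip below is a hypothetical helper, fed the output of ip -4 -o addr show:

```shell
# Warn if an interface carries more than one IPv4 address, i.e. the
# symptom described above. Reads `ip -4 -o addr show` output on stdin.
check_single_ip() {
    n=$(grep -c ' inet ')
    if [ "$n" -gt 1 ]; then
        echo "WARN: $n addresses"
    else
        echo "OK: $n address"
    fi
}
# usage inside a container:
#   ip -4 -o addr show dev eth0 | check_single_ip
```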

Sunday, 23 February 2020

Two factor auth with bash

I have a couple of servers that I ssh into from different locations, and I don't always have my ssh keys. I have come up with what I think is a fairly secure 2fa using ssh and bash. RSVP if you see a flaw in this.

Server's /home/myuser/.profile has
if [ "$PASSWORD" != longanddifficulttotypepassword ]
then
  exit 0
fi
Server's /etc/ssh/sshd_config forces use of a bash login shell
Match User myuser
    ForceCommand /bin/bash -l
and allows sending environment variables
AcceptEnv LANG LC_* PASSWORD
When I login I supply PASSWORD as an environment variable, e.g. in ~/.ssh/config containing...
Host home
    User myuser
    SetEnv PASSWORD=longanddifficulttotypepassword
As long as I can remember my long and short passwords when I travel, I can login without SSH keys.

All I have to type to login is ssh home and my short password.

I'd like to build this feature into PAM but I've not yet found a way to pass a second token without user input.
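One possible hardening, my own suggestion rather than part of the scheme above: keep only a sha256 hash on the server, so a leaked .profile doesn't contain the plaintext. A sketch (pass_ok and the hash file path are hypothetical names):

```shell
# One-time setup on the server (not in .profile):
#   printf '%s' 'longanddifficulttotypepassword' | sha256sum | cut -d' ' -f1 > ~/.pass_hash
# .profile then checks the hash of whatever the client sent.
pass_ok() {
    want=$(cat "${PASS_HASH_FILE:-$HOME/.pass_hash}")
    got=$(printf '%s' "$1" | sha256sum | cut -d' ' -f1)
    [ "$got" = "$want" ]
}
# in .profile:  pass_ok "$PASSWORD" || exit 0
```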

If a hacker knows you are doing these things, the security weakens but it does not disappear. Telling everybody that your logins support !@#$%^&*()_+ in usernames makes dictionary attacks harder, even if you don't use those characters. Of course simple 2fa should not replace a functioning first layer.
  • Don't use very weak passwords.
  • PermitRootLogin prohibit-password
  • AllowUsers ...
  • Don't run ssh on port 22.
In your local /etc/ssh/ssh_config you probably want to ensure you don't accidentally send an env variable to a server that isn't your own
Host onlymyserver
    SendEnv LANG LC_* PASSWORD

Tuesday, 14 January 2020

Changing Ubuntu's royale purple

It seems like Ubuntu uses a lot of purple, but it turns out it's only in 5 places.
  • grub - i.e. the boot screens
  • plymouth
  • terminal background 
  • desktop background
  • login greeter background
Presuming you want to change the background to something dark, i.e.  you don't want to invert light and dark tones, the job is pretty easy.

Grub

Create a background image matching your native screen resolution. Save it to /boot/grub/background.png

Edit /etc/default/grub adding

GRUB_BACKGROUND="/boot/grub/background.png"

Run sudo update-grub

Plymouth

Plymouth is the screen that shows after grub while the laptop loads. I remove quiet from the grub boot lines so I get to see the steps of the Linux boot process. I have nothing to change here; if you prefer the Ubuntu logo and loading dots, it's configurable and easy to test changes: https://wiki.ubuntu.com/Plymouth

Terminal

Edit > Preferences > profile > Color (tab)
uncheck "use colors from the system theme"
Change only the background. Set something below #303030

Login Screen

The theme for gdm is written in CSS; the files are in /usr/share/gnome-shell/theme/, and you may have more than one option.

Find existing backgrounds

cd /usr/share/gnome-shell/theme/
grep -r -A4 lockDialogGroup .


The #lockDialogGroup elements define the background.

To just make it black

echo '#lockDialogGroup { background: #000; }' > ubuntu.css

Desktop background

Right mouse click on the desktop and "change background".



Et voila, no purple.

I imagine this is subject to change over time; I have done the same steps on 18.04 and 20.04. Despite the changes in Ubuntu it's not lost its twiddleability.
 

Saturday, 11 January 2020

Ubuntu Linux on the Asus ZenBook 13"


A.K.A Zen flipbook
A.K.A UX362F
 
Short story is Ubuntu works fine.

Install Process

A couple of tricks.

If you turned on the laptop you need to properly shut down; the power button is a soft key. Hold down the power button for ages (like 10 or 20 seconds) to properly hard reset.

The screen should go blank; then start hitting F2 repeatedly.
The ASUS logo shows for what seems like a long time; keep hitting F2 and finally you enter the BIOS.

Insert the pendrive with Ubuntu on it (use USB Disk Creator if you don't have one). I was using 18.04 LTS but I plan to upgrade to 19.10 since the UI is good looking.

Disable Secure Boot

Delete the Windows Boot Option (fear not, it can be recreated)

Add a new boot option, e.g. called usblinux, and find the efi file: <efi> <boot> bootx64.efi

Save & Exit

The system should boot to a terminal UI; select * Try Ubuntu

Ubuntu should load and show you the familiar fancy desktop, with the Install Ubuntu icon. Using Try and Install saves you rebooting and BIOS nonsense if the install fails.

Attempt 1

First time, I tried to install Ubuntu into the C: drive space (the largest partition), leaving the repair partitions, something I have done with other laptops. This initially worked and booted successfully, but somehow during a subsequent boot the BIOS found Windows, Windows found Linux, and before I could stop it, it "repaired" my system by removing the Linux boot option and reinstalling the Windows boot option. Naturally Windows would not boot since I'd wiped the C: data.

Arguably this is virus behaviour!

Attempt 2

Second time, I wiped the main disk's partition table and created a new one. Adios Windows and its viral "repair" features.

I needed to re-create an EFI partition; this is important or the install will fail at the grub stage. Creating the EFI partition is hidden in the "something else" option, for manually creating partitions.

I created a 200MB EFI system partition, 8GB of swap, and the rest ext4 mounted at /.

<rant>So much junk gets put in $HOME these days it's not worth separating out that partition. In particular I code Rust, and rustup installs the whole tool-chain and all libs inside $HOME. The days of having user data in $HOME, it seems, are over.
I create a /data directory and put all my user data there, where helpful automated tools can not find it :) </rant>

Peripherals

Ethernet

This laptop has a decent amount of connectivity options but no RJ45, so a dongle of some sort is needed. The left side has a regular USB3 port and a cheapo Gigabit Ethernet dongle inserted there works fine, no config needed. Plug it in and the network auto-configures.
On the other side are 2 x USB-C ports. I consider USB-C too small and flimsy for everyday use, but I did try a dongle with Gigabit Ethernet, HDMI and 2 x USB3 ports. Everything auto-configured with no effort. USB peripherals connected to the USB-C dongle also auto-configure.

Wifi


Works.

Audio

Works. <rant>Harman Kardon pretty much pwn the entire audio industry these days; they are owned by Samsung and they have a great many sub-brands. Harman Kardon branded stuff is a pretty good guarantee of quality and certainly no gripes on the audio output. N.B. a real mini jack with a proper headphone amp, no noise and good quality sound.</rant>

Keyb


Keyboard function keys all work except the on/off for the back-lighting in 18.04, fixed in later versions. Back-lighting works. You can run the preinstalled script /etc/acpi/asus-keyboard-backlight.sh instead of the button.
The layout in Europe is US style with a small Enter key, which I much prefer, one of the reasons I accepted a Spanish layout. If you can touch type on a US keyb this is a good option when purchasing on the continent.

Mouse

Works. The mouse pad has a keyboard painted on it which does not work. I have not investigated that yet.

Display

<rant>I had to remove an annoying sticker on the screen without damaging the screen. What you might call a bad un-boxing experience. I removed the other little sticker on the front panel too, except the Harman Kardon one, which is printed on.</rant>

Display is sexy: 1920 x 1080 and touch screen works.  I've not owned a touch screen laptop before but from familiarity with phone I have found myself poking at laptop screens before so I suspect it will become a much loved feature.

An on-screen keyboard automatically pops up when needed, like on a phone, but it is a bit patchy; it does not work in Firefox for example. Apparently there are some tools to install to make tablet mode more comfortable.


<rant> N.B. this laptop has Intel graphics so no Optimus woes. I believe those things are fixed in newer Ubuntus, but once bitten twice shy: I will not buy NVidia hardware if I can avoid it.</rant>




All in all, quite happy. I recommend the ASUS ZenBook UX362F if you are looking for a powerful ultra-portable Linux laptop; Intel i7 and 16GB of RAM should be enough for everyone.


Tuesday, 12 June 2018

Installing Ubuntu without systemd

WARNING: don't copy commands on this page blindly, you can easily end up messing up your install. e.g. rm /sbin/init in the host system, and not the container rootfs, would not be nice.

The aim of this is to create an Ubuntu build server with libc, python, nodejs, build-essential, all at exactly the same version as Ubuntu but without systemd, using apt to install everything.

My goal is to run LXC containers with as little RAM overhead as possible.  The container host I'm running on has 500MB of RAM, I'm quite tight with resources.

TL;DR
  1. create lxinitd based lxc container
  2. run the ubuntu lxc template script
  3. revert lxinitd's /sbin/init and /etc/rc.local
  4. add core services to /etc/rc.local replacing systemd inits

Step 1 Create the lxinitd based server.

 

Install lxinitd.

Create a container called "ci".

lxc-create -t lxinitd -n ci -f /etc/lxc/default.conf

At this point I then added some mounts, and I removed the /lib, /lib64 and /usr/lib mounts defined in /var/lib/lxc/ci/config.
In this case I want the container to have its own copy of all libs installed via .debs; by default, lxinitd mounts libs from the container host.

I then edited lxinitd's boot script $rootfs/etc/rc.local. You can have lxc do the networking, but I prefer to do it myself with ip; I'm not using Ubuntu/Debian style /etc/network/interfaces or dhcp.

My $rootfs/etc/rc.local looks like this

#!/bin/lxinitd
respawn /sbin/getty -L tty1 115200 vt100
/sbin/ip addr add $ip_address/24 dev eth0
/sbin/ip route add default via 10.0.3.1


You may want to do something different with your base lxc container.

Step 2 Overlay the Ubuntu container bootstrap


First backup $rootfs/etc/rc.local since debootstrapping will overwrite it.
Remove the $rootfs/sbin/init symlink to lxinitd.

mv $rootfs/etc/rc.local $rootfs/etc/rc.local.tmp
rm $rootfs/sbin/init


Take a copy of /usr/share/lxc/templates/lxc-ubuntu so it can be edited and run independently of lxc.

The main change to make to this script is to permanently disable running systemd services when apt installs applications.
The default lxc-ubuntu template already disables running systemd services, but it enables them again for the final configuration.

echo exit 101 > $rootfs/usr/sbin/policy-rc.d
chmod +x $rootfs/usr/sbin/policy-rc.d


By leaving an executable $rootfs/usr/sbin/policy-rc.d in the filesystem, apt knows not to try to run commands during installations; this is necessary when you are installing in a chroot.
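policy-rc.d is just an executable whose exit status invoke-rc.d consults; 101 means "action forbidden". The two lines above can be sanity-checked in a scratch directory:

```shell
# Recreate the policy-rc.d step in a scratch rootfs and confirm it
# exits 101, the "action forbidden" answer invoke-rc.d looks for.
rootfs=$(mktemp -d)
mkdir -p "$rootfs/usr/sbin"
echo exit 101 > "$rootfs/usr/sbin/policy-rc.d"
chmod +x "$rootfs/usr/sbin/policy-rc.d"
status=0
"$rootfs/usr/sbin/policy-rc.d" || status=$?
echo "policy-rc.d exit status: $status"
```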

Edit the parts of the lxc-ubuntu script which remove policy-rc.d.

With the policy-rc.d changes made, run the script as follows, adding any of the packages you need in the base system. I'm adding sshd here but you don't need to do that.

./lxc-ubuntu.sh --name ci --rootfs /var/lib/lxc/ci/rootfs --path /var/lib/lxc/ci --packages openssh-server

Step 3 Revert the lxinitd init process


The next step is to restore lxinitd's init: first remove /sbin/init (which will be a symlink to systemd), symlink /sbin/init to /bin/lxinitd, then copy back our rc.local.

cd $rootfs/sbin
rm init
ln -s ../bin/lxinitd init
mv $rootfs/etc/rc.local.tmp $rootfs/etc/rc.local


Step 4 Configure systemd services replacements

 

To be honest I thought this stage would be painful, perhaps impossible.  But it was very easy.

The systemd peeps decided to own DNS resolving, which is arguably not the remit of an init system; this container does not have systemd-resolved running.
The fix is trivial: just put a real DNS server in /etc/resolv.conf. This one is the OpenDNS public server.

echo 'nameserver 208.67.222.222' > etc/resolv.conf

Obviously if you have a local DNS or dnsmasq running on the container host or local network, use that.

Ensure that /etc/hostname and /etc/hosts and lxc.utsname report the same string for the local hostname.
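A quick sanity check for that, sketched as a hypothetical helper (not something lxc provides):

```shell
# Check that $rootfs/etc/hostname and /etc/hosts agree on the name.
check_hostname() {
    rootfs=$1 name=$2
    [ "$(cat "$rootfs/etc/hostname")" = "$name" ] || { echo "hostname mismatch"; return 1; }
    grep -qw "$name" "$rootfs/etc/hosts" || { echo "$name missing from hosts"; return 1; }
    echo "ok"
}
# usage:  check_hostname /var/lib/lxc/ci/rootfs ci
```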

Configure syslog, crond and sshd, by adding a couple of lines to /etc/rc.local.

mkdir -p /var/run/sshd

service /var/run/crond.pid /usr/sbin/cron -f
service /var/run/rsyslogd.pid /usr/sbin/rsyslogd
service /var/run/sshd.pid /usr/sbin/sshd -D


I really like the fact that setting up services in lxinitd is a one-liner.

And that was it. Nothing more was needed to replace the init system entirely. systemd is installed, and all its many libs are there on the filesystem, but they don't boot.

lxc-start -n ci

runs lxinitd and boots services in the container and systemd does not get a look-in.

The ps list of this container looks like this.

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0    208     4 ?        Ss   19:42   0:00 /sbin/init
root        28  0.0  0.0  26072  2456 ?        S    19:42   0:00 /usr/sbin/cron -f
root        30  0.0  0.0  65516  5580 ?        S    19:42   0:00 /usr/sbin/sshd -D
syslog      31  0.0  0.0 256400  2752 ?        Ssl  19:42   0:00 /usr/sbin/rsyslogd


i.e. all the things that I'm interested in and nothing else.

Equivalent Ubuntu base container ps list looks like this.

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.7 154448  3752 ?        Ss   May24   0:32 /sbin/init
root        38  0.0  0.8  79480  4420 ?        Ss   May24   0:09 /lib/systemd/systemd-journald
systemd+    42  0.0  0.1  74464   644 ?        Ss   May24   0:07 /lib/systemd/systemd-networkd
message+    66  0.0  0.3  47448  1700 ?        Ss   May24   0:01 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root        69  0.0  0.1  31208   712 ?        Ss   May24   0:04 /usr/sbin/cron -f
systemd+    75  0.0  0.4  65696  2072 ?        Ss   May24   0:08 /lib/systemd/systemd-resolved
root        89  0.0  0.2  72136  1024 ?        Ss   May24   0:00 /usr/sbin/sshd -D
ci          93  0.0  0.1  79980   828 ?        Ss   May24   0:00 /lib/systemd/systemd --user
ci          94  0.0  0.3 103780  1624 ?        S    May24   0:00 (sd-pam)
root        96  0.0  0.0  15876   136 pts/0    Ss+  May24   0:00 /sbin/agetty --noclear --keep-baud pts/0 115200,38400,9600 vt220
root        97  0.0  0.0  15876   132 pts/2    Ss+  May24   0:00 /sbin/agetty --noclear --keep-baud pts/2 115200,38400,9600 vt220
root        98  0.0  0.0  15876   132 pts/1    Ss+  May24   0:00 /sbin/agetty --noclear --keep-baud console 115200,38400,9600 vt220
root        99  0.0  0.0  15876   140 pts/1    Ss+  May24   0:00 /sbin/agetty --noclear --keep-baud pts/1 115200,38400,9600 vt220
root       100  0.0  0.0  15876   136 pts/3    Ss+  May24   0:00 /sbin/agetty --noclear --keep-baud pts/3 115200,38400,9600 vt220
syslog    3619  0.0  0.1 256536   664 ?        Ssl  May24   0:01 /usr/sbin/rsyslogd -n
root     11670  0.0  0.4  65796  2216 ?        Ss   May25   0:03 /lib/systemd/systemd-logind



I don't have dbus, systemd-journald, systemd-networkd, systemd-resolved, systemd-logind, or 5 agetty instances.

On the systemd based lxc container I have ~100 systemd units active; on my lxinitd based container I have

linci> sudo systemctl
Failed to connect to bus: No such file or directory


The next step is to have one instance of rsyslog in the container host system; there is no real need to have it running inside each container.

In a container there is often no need to run cron, sshd or rsyslog, in which case the base container takes only 4k of RAM (RSS 4). Yet this is a full Ubuntu OS and you can

apt install whatever

I add other services to the base container to support my builds: xtomp, nginx, fcgiwrap, ngircd, ii, tsp. All of which are installed with

apt install xxx



and a one-liner in /etc/rc.local.

Clearly lots of things in the official repos are going to be broken. You could not boot Gnome3 off an lxc / lxinitd based container. I doubt such trickery would work with a RedHat sponsored distro.

For an Ubuntu like buildbox this is perfect.


Monday, 26 March 2018

Writing bash scripts

bash and shell scripting seems to be a problem for many people, including capable programmers in other languages.

This guide is not a list of tips and tricks. It's a description of the syntax and a list of things I wish someone had explained to me before I started.

I've had a fair bit of experience with bash over the years; I've hacked at the C code and added features and extensions. None of those extensions will I recommend here. This post is about writing vanilla bash scripts.

When to use bash


From Google's bash style guide: "Shell should only be used for small utilities or simple wrapper scripts."

There is no loss in starting a util with bash and, if it grows up, porting it to a "proper" language.

That said, it is difficult to have to deal with little utils and applications written in esoteric languages, or compiled C, that could have been a shell script.

One of the main reasons to use shell is that it is decomposable.

You run it, it works, and vim will show you how. Copy and paste will let you run parts of it. This is a pretty neat feature and not to be scoffed at.

The main rule for bash code is to keep it simple; that does not mean you cannot achieve great things. I have written a replacement for Jenkins on a raspberrypi in a few lines of bash and, famously, git was originally written as shell scripts.

ToC


  • Syntax / whitespace rules
  • How quotes work
  • Why builtins exist
  • pwd

Syntax

Syntax seems deceptively simple; many folk tear their hair out with quotes, $var, and whitespace because they have not yet assimilated the rules.

A shell script is generally a text file made up of command lines to execute, tokenized by space. Seems obvious, right? But it is not always.

    myprog arg1 arg2
   
e.g.
   
    mkdir -p target/

N.B. this is split by bash into an array [ 'mkdir' ,'-p', 'target/' ]

There are times when bash's behavior seems erratic but it is quite consistent.


Assignments


The statement

    myvar=foo

is tokenized to a _single_ token [ 'myvar=foo' ], which is a command that bash knows how to interpret: the string foo is assigned to a variable called myvar.

If we change the whitespace

    myvar = foo

is tokenized to [ 'myvar', '=', 'foo' ] and bash tries to execute a command called myvar, passing the parameters = and foo.

If you are lucky, myvar does _not_ exist and you get an error; if you are unlucky, myvar exists and is executed.

This tends to be confusing because other languages parse both a=b and a = b as 3 tokens.

The bash rules are quite simple, but different: the command line is tokenized by whitespace.
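You can watch the tokenizing happen with a tiny helper; show_tokens is a hypothetical name, it just prints each argument bash hands it:

```shell
# Print each token received, one per line, wrapped in brackets.
show_tokens() {
    for tok in "$@"; do
        printf '[%s]\n' "$tok"
    done
}

show_tokens mkdir -p target/   # 3 tokens: [mkdir] [-p] [target/]
show_tokens myvar = foo        # 3 tokens: an attempted "spaced" assignment
show_tokens myvar=foo          # 1 token: a real assignment
```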

[


Take, for example, the following test

    [ -z "$myvar" ] && echo missing variable:     myvar

-z means zero length string; this line checks whether myvar is empty and, if so, prints the message missing variable: myvar.

There are two things that confuse people here...

The following does not work

    [-z $myvar] && echo missing variable:     myvar

[ is a command called [ and it _requires_ whitespace to tokenize.
Assuming myvar is missing, [-z $myvar] is tokenized to [ '[-z', ']' ], i.e. bash looks for a program called [-z, passes it the argument ], and you get a cryptic message

    [-z: command not found

Remember that bash tokenizes lines with whitespace and the error messages will not seem so confusing.

echo

echo concatenates all its arguments with 1 space.

    echo missing variable:    myvar

Is tokenized to [ 'echo', 'missing', 'variable:', 'myvar' ] and echo prints it as

    missing variable: myvar
   
N.B. without the extra whitespace before myvar. This seems a bit confusing at first, but your script is not "stripping whitespace": bash is tokenizing and echo is printing what it receives.

If you quote it

    echo 'missing variable:    myvar'
   
The line is tokenized to [ 'echo', 'missing variable:    myvar' ] and now echo prints with the correct whitespace.

    missing variable:    myvar

Concatenating arguments is only a feature of echo; it's not a general practice.

    git commit -m documentation changes

Will submit just "documentation" as the comment (the parameter to -m).

    git commit -m "documentation changes"

Passes the single argument with the text.

Quotes

The rules for quoting strings are different from other languages. The basic rules are simple but can trip you up, and there are exceptions.

The basic rules are

1. everything is a string
2. spaces require quotes
3. single quotes do no escaping or replacements
4. double quotes support escaping and replacements

1 Everything is a string

There are no number data types as with other languages.

"1" and '1' and 1 are all the string "1".  Programs may interpret the strings as numbers, but bash does not.

bc is a command line calculator

    echo 1 + 1 | bc

prints

    2

bash tokenized the command to

    [ 'echo', '1', '+', '1', '|', 'bc' ]

bc converted the string "1 + 1" to number.

2 Spaces require quotes

If you want a space in a string you must quote it.

    "hello world"
   
or single quotes

    'hello world'

Unlike JavaScript and python, " and ' are very different.

3 Single quotes do no replacements or escaping

Single quotes do no changes at all to the string.

single quotes...

    myvar=foo
    echo 'myvar: $myvar'


prints

    myvar: $myvar

double quotes...

    myvar=foo
    echo "myvar: $myvar"


prints

    myvar: foo


In single quotes there is no escaping supported, not even for '; there is no way to escape ' inside single quotes.

4 Double quotes support escaping and replacements

Double quotes do replacement magic and escaping magic.

"Replacement" replaces $var` with the value from an environment variable.

"Escaping" is prefixing some characters with \

  • \"  escapes to just "
  • \$ escapes to just $

so

    myvar=cruel
    echo "hello    \"$myvar\"    world"

   
does escaping and replacing to print

    hello    "cruel"    world

N.B. the ${} syntax has a lot more magic than you might expect.
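A small taste of that magic, all standard parameter expansions:

```shell
file="/var/log/app.log.gz"
unset missing
echo "${file##*/}"       # strip the longest */ prefix  -> app.log.gz
echo "${file%.gz}"       # strip a .gz suffix           -> /var/log/app.log
echo "${missing:-none}"  # default for an unset var     -> none
echo "${#file}"          # length of the string         -> 19
```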

Using the simple rules

When no quotes are supplied, strings are tokenized to multiple strings; escaping and replacements work without quotes.

    myvar=foo
    echo $myvar


It's good practice to quote strings even when it does not really matter, as with echo.

    echo "out:failed:missing_file"


Any string can be quoted or un-quoted; it's possible, but bad practice, to quote programs and arguments that don't have spaces.

The following is valid but looks a bit weird

    "mkdir" "logs"
    "rm" '-rf'

   
N.B. there is nothing special about quoted arguments; quoting "-red" does not change what grep receives

    grep "-red"

Is the same as

    grep -red

grep will interpret the first parameter as options in both cases; it will not search for the string "-red".

echo newlines

Newlines terminate the command line and, as in JavaScript, semicolons (;) are optional.

Unlike most other languages a new line is valid inside a string.

    echo '
        one
       
        two
       
        three
    '


prints the new lines.

    one

    two

    three


You may see here docs in existing scripts

    cat <<EOF
        one
       
        two
       
        three
    EOF


Single and double quotes are often more convenient.

Empty strings

Empty strings can be a bit complicated

    myprog $myvar


is subtly different to

    myprog "$myvar"


"" is an empty string, it is a token. If $myvar is missing it is not an empty string.

The former will not send any arguments; the latter will pass a single argument, the empty string, to myprog.

This can be important if arguments are expected

    myprog -v -m $foo -b $baa


is better written as follows if the variables may be empty.

    myprog -v -m "$foo" -b "$baa"

Breaking the rules

Now you know the rules, and recognize how simple they are, you will be disheartened to know that there are a couple of constructs that break them. Fortunately they behave a bit more "normally".

((

In later versions of bash there is the (( construct. This is not a program called ((, it's proper syntax, and its parsing rules are different.

Inside double parens, the simple whitespace tokenizing rules are broken and things that look like numbers are numbers. The rules are much more complicated: they are C-style constructs, familiar if you write in C-based languages like JavaScript.

Basically, you can do math, if myvar looks like a number

    ((myvar++))

works as "expected".

    ((myvar+=2))

also works as "expected" provided you were not expecting the simpler whitespace token rules.

[[

[[ is a more modern version of [; its rules are subtly different. [[ is designed to be more natural and its use is recommended. Again, [[ is not a command, it is proper syntax.

  • missing vars $nothere generate empty strings.
  • && and || mean "and" and "or" (they do not terminate the command)
  • < and > compare strings (they are not redirects)
  • ~ is not $HOME (it's used in the =~ regexp operator)
  • globs *.* and regexp don't need (so much) escaping

The result is generally code that looks more like other languages, despite breaking the simple rules.
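A few of those behaviours in one sketch:

```shell
unset nothere
s="hello.txt"
[[ -z $nothere ]]                && echo "missing var is just empty, no quotes needed"
[[ $s == *.txt ]]                && echo "glob match without escaping"
[[ $s == *.txt && $s != *.log ]] && echo "&& works inside the brackets"
[[ apple < banana ]]             && echo "< compares strings, no redirect"
[[ $s =~ ^hello ]]               && echo "regexp via =~"
```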

"\ "

Rule 2 "spaces require quotes" is not 100% true, you can escape space if it is outside quotes.

It is usually best to ignore this fact in scripts.

Escaping spaces looks pretty odd in scripts.

    echo hello\ world

Escaping is more useful on the cli with tab completion.

Understanding Builtins

aka command vs programs.

Typically the first token in a command line is a single "command" and all the others are "arguments".

    echo hello world


The "command" is "echo" and the arguments are "hello" and "world".  Bash has a few "builtins" i.e. commands that do not need an external program to run. Builtins are checked first, if one does not exist, bash looks for a program.

There is a builtin called echo, were it not there, bash would look for a program called echo and would probably find /bin/echo.

The builtin echo and the program /bin/echo are similar, but it's important to know they are not the same thing.

Builtin rules

Builtins run inside the bash process so they can change bash's behavior and state.

Programs run in a separate process so they cannot change bash's state.


This is why commands like set and export must be builtins, they change bash's state. echo does not change bash's state, it could be a builtin or a program.

echo exists as a builtin as an optimization, it saves forking a process to print a line of output.

So next time you ask yourself how you can run a "program" that changes a variable in the current shell, you know the answer: you can't! You have to use a builtin.

There are various builtins and ways to fork processes; the following list serves to explain why builtins exist, it's not comprehensive.

cd

cd changes the current directory of the current bash shell, therefore it must be a builtin.

source

source runs a whole script (excluding the shebang) inside the current process without forking, therefore it must be a builtin. The script it runs must be a bash script since it runs inside the current bash process; you can't source a python script. Source is like a dynamic include.

Because source is a builtin it can change bash's state.  So if you want to write a script that changes lots of variables in the current bash instance you should use source.

e.g. you could create a file all-my-vars.sh that has lots of handy variables

    LOG=/var/log/mylog
    BIN=/usr/lib/myprog/bin

To use the variables in the current script or the current shell use source.

    source ./all-my-vars.sh

There is a shortcut for this: the period ".".

    . ./all-my-vars.sh

Careful with the difference between

    ./all-my-vars.sh

and

    . ./all-my-vars.sh

The former "runs the script" in a child process and will not affect the current instance; the latter "sources the script" and will affect the current instance.

export

    myvar=foo

Sets a variable in the current bash shell, therefore it must be a builtin.

    export myvar=foo

Sets the variable in the current bash shell and also tells bash to copy myvar into the environment of any child process it creates from now on.

    myvar=foo myprog

sets the myvar variable in the environment of that single execution of myprog; the current shell is unaffected.
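The three cases side by side, using bash -c as the child process (the variable names are made up):

```shell
myvar=foo
bash -c 'echo "child sees: $myvar"'   # child sees nothing: myvar is not exported

export myvar
bash -c 'echo "child sees: $myvar"'   # prints: child sees: foo

# One-shot environment: only this single command sees othervar
othervar=bar bash -c 'echo "child sees: $othervar"'  # prints: child sees: bar
echo "parent sees: ${othervar:-nothing}"             # prints: parent sees: nothing
```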

exec

exec transmogrifies the bash process into a different program.  It's pretty magic; since it very much changes the state of bash it must be a builtin.

If at the end of a script you write

    exec myprog arg1 arg2

When you run the script, the bash instance "disappears" and is replaced by myprog; if you look in the ps list you will not see bash.

This is handy for launch scripts.  For example, if you want a script ./run-myprog.sh that creates a needed directory and then runs a program called myprog, you could write

    #!/bin/bash

    mkdir -p /tmp/myprog
   
    myprog "$1"


When you run this and look in the ps list you will see:

    me  123   44  bash ./run-myprog.sh arg1
    me  124  123  myprog arg1


Using the exec builtin, the bash process will magically disappear:
   
    #!/bin/bash

    mkdir -p /tmp/myprog
   
    exec myprog "$1"


When you run this and look in the ps list you will just see:

    me  124  123  myprog arg1

If you are using lots of little scripts to set up and launch a daemon, judicious use of source and exec can keep the number of bash instances in the ps list, during and after boot, to a minimum.


Subshells

Subshells are bash instances started by bash; there are a few syntax options.

    bash ./myscript
    (./myscript)


Subshells are new processes, not builtins, so they cannot change the state of the current shell.
You can capture the output of a subshell; there are two syntax options, backticks or the more modern dollar-parens.

    output=`./myscript`
    output=$(./myscript)
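One practical advantage of the dollar-parens form is that it nests without escaping; a small sketch (the path is just an example string, dirname and basename never touch the filesystem):

```shell
# With backticks this nesting would need awkward \` escaping
parent=$(basename "$(dirname "/var/log/syslog")")
echo "$parent"    # prints: log
```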


String manipulation

Linux provides lots of tools for processing strings in pipes, e.g. sed and awk, so it is tempting to use subshells and pipes to change strings

    new_string=`echo "$oldstring" | cut -c5-10 | tr -d '\n'`

For example, if you have a variable my_file containing myprog.dat and you want to create a filename called myprog.bak.

You could solve this with subshells and Linux tools e.g.

    new_name=$(echo "$my_file" | sed -e 's/\.dat$//')
    new_name=${new_name}.bak


There is nothing wrong with this approach, except perhaps performance.  Bash ninjas prefer not to fork processes and will tend to use the builtin string manipulation features of bash.  This is a big subject; the important things to know are that...

  1. bash has very flexible string manipulation
  2. the syntax is screwy

The basic syntax is ${VAR [insert magic]} where the magic has operators such as %, #, /, and :

Substitution is the easiest to remember since it uses / / similar to sed (though the pattern is a glob, not a regexp).

    new_name=${my_file/.dat/.bak}


Escaping patterns is complicated, and the substitution above is not anchored to the end of the string. In this case trim is more reliable.

    new_name=${my_file%.dat}.bak

I would recommend commenting such magic; % is not a commonly recognized trim operator.
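For reference, a few of the common operators side by side (on the my_file example from above):

```shell
my_file=myprog.dat

echo "${my_file%.dat}"       # % trims shortest suffix match   -> myprog
echo "${my_file#myprog}"     # '#' trims shortest prefix match -> .dat
echo "${my_file/.dat/.bak}"  # / replaces first match          -> myprog.bak
echo "${my_file:0:6}"        # :offset:length substring        -> myprog
echo "${#my_file}"           # length of the string            -> 10
```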

The performance benefits are often negligible, but kudos on StackOverflow is priceless.

If you find yourself doing a lot of string manipulation in one particular script, it is an indicator you should port it to a proper language.


PWD concerns


Every unix process has a current working directory (PWD), bash included, and bash changes it more often than most programs.

Usually the same command behaves the same way on the command line as in a script.

    mkdir -p target/

typed on the cli and run from a script does the same thing.

Where the target dir is created depends on your current directory, so it's best to always use absolute paths in scripts.
Or add

    cd "$(dirname "$0")"

at the start so that all commands run relative to the directory where the script is located.

e.g. if you have a script called /home/me/workspace/init.sh

    #!/bin/bash
    cd "$(dirname "$0")"
    mkdir -p target/


and you run it as ./init.sh or from your home directory as workspace/init.sh, it will have the same result.

Subshells can have a different PWD and do not affect the parent shell.  This can be useful to temporarily change directory.
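A quick way to convince yourself the parent shell is unaffected:

```shell
start=$PWD
( cd / )        # change directory inside a subshell
echo "$PWD"     # unchanged -- still prints the original directory
```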

You could save the PWD and cd back to it after your change (use your own variable name; bash's cd maintains a variable called OLDPWD itself, so avoid that name)...

    old_pwd=$PWD
    cd /var/log/$myprogname
    cat $myprogname.log
    cd "$old_pwd"
    unset old_pwd

   
This is neater with subshells

    ( cd /var/log/$myprogname; cat $myprogname.log )

Handy for executing programs that are dependent on the PWD.

    ( cd src/c; make )


Or running a series of commands in a different directory
   
    (
        cd /tmp
        touch foo
        touch baa
        touch quxx
    )


Now what?

That was not really too much to digest, was it? I get called over to help with tricky bash issues quite a lot, and 9 times out of 10 the problem is caused by one of these issues.  Hence this doc.

Once you know the basics, I highly recommend reading man bash through once.  It is dull, but you will find a ton of tricks in there.

For sane coding conventions I would recommend Google's shell style guide.
Google comes down hard on the tabs vs spaces issue. That makes sense because you really want to be able to copy-paste shell code, and tabs introduce a serious risk when you do.  You really should avoid tabs in new code.

For sane design guide, KISS cannot be overstated.