Tuesday 12 June 2018

Installing Ubuntu without systemd

WARNING: don't copy the commands on this page blindly, you can easily end up messing up your install. e.g. running rm /sbin/init in the host system instead of the container rootfs would not be nice.

The aim of this is to create an Ubuntu build server with libc, python, nodejs, build-essential, all at exactly the same versions as Ubuntu, but without systemd, using apt to install everything.

My goal is to run LXC containers with as little RAM overhead as possible.  The container host I'm running on has 500MB of RAM, so I'm quite tight on resources.

TL;DR
  1. create lxinitd based lxc container
  2. run the ubuntu lxc template script
  3. revert lxinitd's /sbin/init and /etc/rc.local
  4. add core services to /etc/rc.local replacing systemd inits

Step 1 Create the lxinitd based server.

 

Install lxinitd.

Create a container called "ci".

lxc-create -t lxinitd -n ci -f /etc/lxc/default.conf

At this point I added some extra mounts, and removed the /lib, /lib64 and /usr/lib mounts defined in /var/lib/lxc/ci/config.
By default, lxinitd mounts libs from the container host; in this case I want the container to have its own copy of all libs, installed via .debs.

I then edited lxinitd's boot script $rootfs/etc/rc.local. You can have lxc do the networking, but I prefer to do it myself with ip; I'm not using Ubuntu/Debian style /etc/network/interfaces or dhcp.

My $rootfs/etc/rc.local looks like this

#!/bin/lxinitd
respawn /sbin/getty -L tty1 115200 vt100
/sbin/ip addr add $ip_address/24 dev eth0
/sbin/ip route add default via 10.0.3.1


You may want to do something different with your base lxc container.

Step 2 Overlay the Ubuntu container bootstrap


First backup $rootfs/etc/rc.local since debootstrapping will overwrite it.
Remove the $rootfs/sbin/init symlink to lxinitd.

mv $rootfs/etc/rc.local $rootfs/etc/rc.local.tmp
rm $rootfs/sbin/init


Take a copy of /usr/share/lxc/templates/lxc-ubuntu so it can be edited and run independently of lxc.

The main change to make to this script is to permanently disable running systemd services when apt installs packages.
The default lxc-ubuntu template already disables running systemd services, but it enables them again for the final configuration.

echo exit 101 > $rootfs/usr/sbin/policy-rc.d
chmod +x $rootfs/usr/sbin/policy-rc.d


Leaving an executable $rootfs/usr/sbin/policy-rc.d in the filesystem tells apt not to try to run services during installations; this is necessary when you are installing in a chroot.

Edit the parts of the lxc-ubuntu script which remove policy-rc.d.

With the policy-rc.d changes made, run the script as follows, adding any packages you need in the base system. I'm adding sshd here but you don't need to do that.

./lxc-ubuntu.sh --name ci --rootfs /var/lib/lxc/ci/rootfs --path /var/lib/lxc/ci --packages openssh-server

Step 3 Revert the lxinitd init process


The next step is to restore lxinitd's init: first remove /sbin/init (which will be a symlink to /bin/systemd), symlink /sbin/init to /bin/lxinitd, then copy back our rc.local.

cd $rootfs/sbin
rm init
ln -s ../bin/lxinitd init
mv $rootfs/etc/rc.local.tmp $rootfs/etc/rc.local


Step 4 Configure systemd services replacements

 

To be honest I thought this stage would be painful, perhaps impossible.  But it was very easy.

systemd peeps decided to own DNS resolution, which is arguably not the remit of an init system; this container does not have systemd-resolved running.
The fix is trivial, just put a real DNS server in /etc/resolv.conf. This one is the OpenDNS public server.

echo 'nameserver 208.67.222.222' > $rootfs/etc/resolv.conf

Obviously if you have a local DNS or dnsmasq running on the container host or local network, use that.

Ensure that /etc/hostname and /etc/hosts and lxc.utsname report the same string for the local hostname.
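For example, a sketch of keeping the name consistent (using a scratch directory here; on the real system $rootfs would be /var/lib/lxc/ci/rootfs, and the same name must appear as lxc.utsname in /var/lib/lxc/ci/config):

```shell
rootfs=$(mktemp -d)   # stand-in for the real container rootfs
name=ci

mkdir -p $rootfs/etc
echo $name > $rootfs/etc/hostname
printf '127.0.0.1 localhost\n127.0.1.1 %s\n' $name > $rootfs/etc/hosts
```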

Configure syslog, crond and sshd, by adding a couple of lines to /etc/rc.local.

mkdir -p /var/run/sshd

service /var/run/crond.pid /usr/sbin/cron -f
service /var/run/rsyslogd.pid /usr/sbin/rsyslogd
service /var/run/sshd.pid /usr/sbin/sshd -D


I really like the fact that setting up services in lxinitd is a one-liner.

And that was it. Nothing more was needed to replace the init system entirely.  systemd is installed and all its many libs are there on the filesystem, but none of it runs.

lxc-start -n ci

runs lxinitd and boots services in the container and systemd does not get a look-in.

The ps list of this container looks like this.

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0    208     4 ?        Ss   19:42   0:00 /sbin/init
root        28  0.0  0.0  26072  2456 ?        S    19:42   0:00 /usr/sbin/cron -f
root        30  0.0  0.0  65516  5580 ?        S    19:42   0:00 /usr/sbin/sshd -D
syslog      31  0.0  0.0 256400  2752 ?        Ssl  19:42   0:00 /usr/sbin/rsyslogd


i.e. all the things I'm interested in and nothing else.

The ps list of an equivalent Ubuntu base container looks like this.

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.7 154448  3752 ?        Ss   May24   0:32 /sbin/init
root        38  0.0  0.8  79480  4420 ?        Ss   May24   0:09 /lib/systemd/systemd-journald
systemd+    42  0.0  0.1  74464   644 ?        Ss   May24   0:07 /lib/systemd/systemd-networkd
message+    66  0.0  0.3  47448  1700 ?        Ss   May24   0:01 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root        69  0.0  0.1  31208   712 ?        Ss   May24   0:04 /usr/sbin/cron -f
systemd+    75  0.0  0.4  65696  2072 ?        Ss   May24   0:08 /lib/systemd/systemd-resolved
root        89  0.0  0.2  72136  1024 ?        Ss   May24   0:00 /usr/sbin/sshd -D
ci          93  0.0  0.1  79980   828 ?        Ss   May24   0:00 /lib/systemd/systemd --user
ci          94  0.0  0.3 103780  1624 ?        S    May24   0:00 (sd-pam)
root        96  0.0  0.0  15876   136 pts/0    Ss+  May24   0:00 /sbin/agetty --noclear --keep-baud pts/0 115200,38400,9600 vt220
root        97  0.0  0.0  15876   132 pts/2    Ss+  May24   0:00 /sbin/agetty --noclear --keep-baud pts/2 115200,38400,9600 vt220
root        98  0.0  0.0  15876   132 pts/1    Ss+  May24   0:00 /sbin/agetty --noclear --keep-baud console 115200,38400,9600 vt220
root        99  0.0  0.0  15876   140 pts/1    Ss+  May24   0:00 /sbin/agetty --noclear --keep-baud pts/1 115200,38400,9600 vt220
root       100  0.0  0.0  15876   136 pts/3    Ss+  May24   0:00 /sbin/agetty --noclear --keep-baud pts/3 115200,38400,9600 vt220
syslog    3619  0.0  0.1 256536   664 ?        Ssl  May24   0:01 /usr/sbin/rsyslogd -n
root     11670  0.0  0.4  65796  2216 ?        Ss   May25   0:03 /lib/systemd/systemd-logind



I don't have dbus, systemd-journald, systemd-networkd, systemd-resolved, systemd-logind, or 5 agetty instances.

On the systemd based lxc container I have ~100 systemd units active; on my lxinitd based container I have

linci> sudo systemctl
Failed to connect to bus: No such file or directory


The next step is to have one instance of rsyslog in the container host system; there is no real need to have it running inside each container.

In a container there is often no need to run cron, sshd or rsyslog, in which case the base container takes only 4k of RAM (RSS 4), yet this is a full Ubuntu OS and you can

apt install whatever

I add other services to the base container to support my builds: xtomp, nginx, fcgiwrap, ngircd, ii, tsp.  All of which are installed with

apt install xxx



and a one-liner in /etc/rc.local.

Clearly lots of things in the official repos are going to be broken.  You could not boot Gnome3 off an lxc / lxinitd based container.  I doubt such trickery would work with a RedHat sponsored distro.

For an Ubuntu like buildbox this is perfect.


Monday 26 March 2018

Writing bash scripts

bash and shell scripting seems to be a problem for many people, including capable programmers in other languages.

This guide is not a list of tips and tricks. It's a description of the syntax and a list of things I wish someone had explained to me before I started.

I've had a fair bit of experience with bash over the years, hacked at the C code, and added features and extensions.  I won't recommend any of those extensions here. This post is about writing vanilla bash scripts.

When to use bash


From Google's bash style guide: "Shell should only be used for small utilities or simple wrapper scripts."

There is no loss in starting a util with bash and, if it grows up, porting it to a "proper" language.

That said, it is painful to deal with little utils and applications written in esoteric languages, or compiled C, that could have been a shell script.

One of the main reasons to use shell is that it is decomposable.

You run it, it works, and reading it in vim will tell you how. Copy-paste will let you run parts of it.  This is a pretty neat feature and not to be scoffed at.

The main rule for bash code is to keep it simple; that does not mean you cannot achieve great things.  I have written a replacement for Jenkins on a Raspberry Pi as a few lines of bash and, famously, git was originally written as shell scripts.

ToC


  • Syntax / whitespace rules
  • How quotes work
  • Why builtins exist
  • pwd

Syntax

Syntax seems deceptively simple; many folk tear their hair out over quotes, $var, and whitespace, because they have not yet assimilated the rules.

A shell script is generally a text file made up of command lines to execute, tokenized by whitespace. Seems obvious, right? But it is not always.

    myprog arg1 arg2
   
e.g.
   
    mkdir -p target/

N.B. this is split by bash into an array [ 'mkdir', '-p', 'target/' ]

There are times when bash's behavior seems erratic but it is quite consistent.


Assignments


The statement

    myvar=foo

is tokenized to a _single_ token [ 'myvar=foo' ], which is a command that bash knows how to interpret: the string foo is assigned to a variable called myvar.

If we change the whitespace

    myvar = foo

is tokenized to [ 'myvar', '=', 'foo' ] and bash tries to execute a command called myvar, passing the parameters = and foo.

If you are lucky, myvar does _not_ exist and you get an error; if you are unlucky, a command called myvar exists and is executed.

This tends to be confusing because other languages parse both

    a=b

and

    a = b

as 3 tokens.

bash's rules are quite simple, but different: the command line is tokenized by whitespace.

[


Take, for example, the following test

    [ -z "$myvar" ] && echo missing variable:     myvar

-z means zero length string. This line checks whether myvar is empty and, if so, prints the message  missing variable: myvar.

There are two things that confuse people here...

The following does not work

    [-z $myvar] && echo missing variable:     myvar

[ is a command called [. It _requires_ whitespace to tokenize.
Assuming myvar is missing, [-z $myvar]  is tokenized to [ '[-z', ']' ], i.e. bash looks for a program called  [-z, passes it the argument ], and you get a cryptic message

    [-z: command not found

Remember that bash tokenizes lines with whitespace and the error messages will not seem so confusing.

echo

echo concatenates all its arguments with 1 space.

    echo missing variable:    myvar

Is tokenized to [ 'echo', 'missing', 'variable:', 'myvar' ] and echo prints it as

    missing variable: myvar
   
N.B. without the extra whitespace before myvar. This seems a bit confusing at first, but your script is not "stripping whitespace": bash is tokenizing and echo is printing what it receives.

If you quote it

    echo 'missing variable:    myvar'
   
The command line is tokenized to [ 'echo', 'missing variable:    myvar' ] and now echo prints with the correct whitespace.

    missing variable:    myvar

Concatenating arguments is only a feature of echo, it's not a general practice.

    git commit -m documentation changes

Will submit just "documentation" as the comment (the parameter to -m).

    git commit -m "documentation changes"

Passes the single argument with the text.

Quotes

The rules for quoting strings are different from other languages. The basic rules for quoting are simple but can lead to confusion, and there are confusing exceptions.

The basic rules are

1. everything is a string
2. spaces require quotes
3. single quotes do no escaping or replacements
4. double quotes support escaping and replacements

1 Everything is a string

There are no number data types as with other languages.

"1" and '1' and 1 are all the string "1".  Programs may interpret the strings as numbers, but bash does not.

bc is a command line calculator

    echo 1 + 1 | bc

prints

    2

bash tokenized the command to

    [ 'echo', '1', '+', '1', '|', 'bc' ]

bc converted the string "1 + 1" to numbers.

2 Spaces require quotes

If you want a space in a string you must quote it.

    "hello world"
   
or single quotes

    'hello world'

Unlike JavaScript and python, " and ' are very different.

3 Single quotes do no replacements or escaping

Single quotes do no changes at all to the string.

single quotes...

    myvar=foo
    echo 'myvar: $myvar'


prints

    myvar: $myvar

double quotes...

    myvar=foo
    echo "myvar: $myvar"


prints

    myvar: foo


In single quotes no escaping is supported, not even for '; there is no way to escape ' inside single quotes.

4 Double quotes support escaping and replacements

Double quotes do replacement magic and escaping magic.

"Replacement" replaces $var` with the value from an environment variable.

"Escaping" is prefixing some characters with \

  • \"  escapes to just "
  • \$ escapes to just $

so

    myvar=cruel
    echo "hello    \"$myvar\"    world"

   
does escaping and replacing to print

    hello    "cruel"    world

N.B. ${} syntax has a lot more magic than you might expect. Details follow.

Using the simple rules

When no quotes are supplied, strings are tokenized into multiple strings; escaping and replacements work without quotes.

    myvar=foo
    echo $myvar


It's good practice to quote strings with spaces even when it does not really matter, as with echo.

    echo "out:failed:missing_file"


Any string can be quoted or un-quoted; it's possible, but bad practice, to quote programs and arguments that don't have spaces.

The following is valid but looks a bit weird

    "mkdir" "logs"
    "rm" '-rf'

   
N.B. quoting does not make an argument special; the leading - is just a convention that programs interpret themselves

    grep "-red"

Is the same as

    grep -red

grep will interpret the first parameter as options because it starts with "-"; it will not search for the string "-red".
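If you really do want to search for a string that starts with "-", you have to tell grep where the options end. A sketch with a throwaway file:

```shell
printf -- '-red\nblue\n' > /tmp/colours

grep -- -red /tmp/colours    # -- ends option parsing: matches the line -red
grep -e -red /tmp/colours    # or pass the pattern explicitly with -e
```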

echo newlines

Newlines terminate the command line and, as in JavaScript, semicolons (;) are optional.

Unlike most other languages a new line is valid inside a string.

    echo '
        one
       
        two
       
        three
    '


prints the new lines.

    one

    two

    three


You may see here docs in existing scripts

    cat <<EOF
        one
       
        two
       
        three
    EOF


Single and double quotes are often more convenient.

Empty strings

Empty string can be a bit complicated

    myprog $myvar


is subtly different to

    myprog "$myvar"


"" is an empty string, it is a token. If $myvar is missing it is not an empty string.

The former will pass a single argument, the empty string, to myprog the later will not send any arguments.

This can be important if arguments are expected

    myprog -v -m $foo -b $baa


Is better written as follows if the variables may be empty.

    myprog -v -m "$foo" -b "$baa"

Breaking the rules

Now you know the rules, and recognize how simple they are, you will be disheartened to learn that there are a couple of constructs that break them.  Fortunately they behave a bit more "normally".

((

In later versions of bash there is the (( construct. This is not a program called ((, it's proper syntax, and its parsing rules are different.

Inside double parens, the simple whitespace tokenizing rules are broken and things that look like numbers are numbers.  The rules are much more complicated, C-style constructs, but they are familiar if you write in C based languages like JavaScript.

Basically, you can do math, if myvar looks like a number

    ((myvar++))

works as "expected".

    ((myvar+=2))

also works as "expected" provided you were not expecting the simpler whitespace token rules.
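Putting the two together, with a concrete starting value (myvar is just an example name):

```shell
myvar=5
((myvar++))    # C-style post-increment: myvar is now 6
((myvar+=2))   # compound assignment: myvar is now 8
echo $myvar    # prints 8
```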

[[

[[ is a more modern version of [ and its rules are subtly different. [[ is designed to be more natural and its use is recommended. Again, [[ is not a command, it is proper syntax.

  • missing vars $nothere generate empty strings
  • && and || mean "and" and "or" (they do not terminate the command)
  • < and > mean less than and greater than (not redirects)
  • =~ does regexp matching (~ is not simply $HOME here)
  • globs *.* and regexps don't need (so much) escaping

The result is generally code that looks more like other languages, despite breaking the simple rules.
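A quick sketch of those [[ behaviours (the values here are made up):

```shell
unset nothere
[[ -z $nothere ]] && echo "missing var is an empty string, no quotes needed"
[[ "apple" < "banana" ]] && echo "lexicographic less-than, not a redirect"
[[ "file.dat" == *.dat ]] && echo "glob match without escaping"
[[ "v1.2" =~ ^v[0-9]+\.[0-9]+$ ]] && echo "regexp match with =~"
```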

"\ "

Rule 2 "spaces require quotes" is not 100% true, you can escape space if it is outside quotes.

It is usually best to ignore this fact in scripts.

Escaping spaces looks pretty odd in scripts.

    echo hello\ world

Escaping is more useful on the cli with tab completion.

Understanding Builtins

aka commands vs programs.

Typically the first token in a command line is the "command" and all the others are "arguments".

    echo hello world


The "command" is "echo" and the arguments are "hello" and "world".  Bash has a few "builtins" i.e. commands that do not need an external program to run. Builtins are checked first, if one does not exist, bash looks for a program.

There is a builtin called echo, were it not there, bash would look for a program called echo and would probably find /bin/echo.

The builtin echo and the program /bin/echo are similar, but it's important to know they are not the same thing.

Builtin rules

Builtins run inside the bash process so they can change bash's behavior and state.

Programs run in a separate process so they cannot change bash's state.


This is why commands like set and export must be builtins, they change bash's state. echo does not change bash's state, it could be a builtin or a program.

echo exists as a builtin as an optimization, it saves forking a process to print a line of output.

So next time you ask yourself how you can run a "program" that changes a variable in the current shell, you know the answer: you can't! You have to use a builtin.
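A minimal sketch of the rule in action: a child process assigns a variable, and the parent shell is unaffected.

```shell
myvar=original
bash -c 'myvar=changed'   # a separate process: it cannot touch our myvar
echo $myvar               # still prints original
```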

There are various builtins and ways to fork processes; the following list serves to explain why builtins exist, it's not comprehensive.

cd

cd changes the current directory of the current bash shell, therefore it must be a builtin.

source

source runs a whole script (excluding the shebang) inside the current process without forking, therefore it must be a builtin. The script it runs must be a bash script, since it runs inside the current bash process; you can't source a python script. source is like a dynamic include.

Because source is a builtin it can change bash's state.  So if you want to write a script that changes lots of variables in the current bash instance you should use source.

e.g. you could create a file all-my-vars.sh that has lots of handy variables like

    LOG=/var/log/mylog
    BIN=/usr/lib/myprog/bin

etc. To use the variables in the current script or the current shell, use source.

    source ./all-my-vars.sh

There is a shortcut for this: the period ".".

    . ./all-my-vars.sh

Careful with the difference between

    ./all-my-vars.sh

and

    . ./all-my-vars.sh

The former "runs the script" and will not affect the current instance; the latter "sources the script" and will affect the current instance.

export

    myvar=foo

Changes the environment of the current bash shell, therefore it must be a builtin.

    export myvar=foo

Changes the environment of the current bash shell, it also tells bash to set the myvar variable in any subshell it creates from now on.

    myvar=foo myprog

sets the myvar variable just for the single execution of the myprog command.
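A sketch showing the effect of export (myvar is hypothetical): a child shell only sees the variable once it is exported.

```shell
unset myvar   # start clean so myvar carries no export attribute
myvar=foo
before=$(bash -c 'echo "[$myvar]"')   # child does not see myvar: prints []
export myvar
after=$(bash -c 'echo "[$myvar]"')    # child sees it now: prints [foo]
echo "$before $after"
```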

exec

exec transmogrifies the bash process into a different program.  It's pretty magic; since it very much changes the state of bash, it must be a builtin.

If at the end of a script you write

    exec myprog arg1 arg2

When you run the script, the bash instance will "disappear", replaced by myprog; when you look in the ps list you will not see bash.

This is handy for launch scripts.  For example, if you want to write a script ./run-myprog.sh that creates a needed directory and then runs a program called myprog, you could write

    #!/bin/bash

    mkdir -p /tmp/myprog
   
    myprog $1


When you run this and look in the ps list you will see.

    me  123   44  bash ./run-myprog.sh arg1
    me  124  123  myprog arg1


Using the exec builtin, the bash process magically disappears
   
    #!/bin/bash

    mkdir -p /tmp/myprog
   
    exec myprog arg1


When you run this and look in the ps list you will just see.

    me  124  123  myprog arg1

If you are using lots of little scripts to set up launching a daemon, judicious use of source and exec can keep the number of bash instances in the ps list, during and after boot, to a minimum.


Subshells

Subshells are bash instances started by bash; there are a few syntax options.

    bash ./myscript
    (./myscript)


Subshells are new processes, not builtins; they cannot change the current process.
You can capture the output of a subshell; there are two syntax options, backticks or the more modern dollar parens.

    output=`./myscript`
    output=$(./myscript)


String manipulation

Linux provides lots of tools for processing strings in pipes, e.g. sed and awk, and it is tempting to use subshells and pipes to change strings

    new_string=`echo $oldstring | cut -c5-10 | tr -d '\n'`

For example, if you have a variable my_file containing myprog.dat and you want to create a filename called myprog.bak.

You could solve this with subshells and Linux tools e.g.

    new_name=$(echo $my_file | sed -e 's/\.dat$//')
    new_name=${new_name}.bak


There is nothing wrong with this approach except perhaps performance.  Bash ninjas would prefer not to fork processes and will tend to use the builtin string manipulation features of bash.  This is a big subject, the important thing to know is that...

  1. bash has very flexible string manipulation
  2. the syntax is screwy

The basic syntax is ${VAR [insert magic]} where the magic has operators such as %, #, /, and :

Substitution is the easiest to remember since its / / syntax is similar to sed (though the pattern is a glob, not a true regexp).

    new_name=${my_file/.dat/.bak}


Escaping the pattern is complicated. In this case trim is more reliable.

    new_name=${my_file%.dat}.bak

I would recommend commenting such magic; % is not a commonly known trim operator.

The performance benefits are often negligible, but the kudos on StackOverflow is priceless.

If you find yourself using string manipulation a lot in one particular script, it is an indicator you should port to a proper language.


PWD concerns


Every unix process has a current working directory, the PWD, bash included; bash changes directory more often than most programs.

Usually the same command works on the command line, as in a script.

    mkdir -p target/

typed on the cli and in a script has the same effect.

Where the target dir is created depends on your current directory, so it's best to always use absolute paths in scripts.
Or add

    cd $(dirname $0)

at the start so that all commands run from the directory where the script is located.

e.g. if you have a script called /home/me/workspace/init.sh

    #!/bin/bash
    cd $(dirname $0)
    mkdir -p target/


Whether you run it as ./init.sh or from home as workspace/init.sh, the result is the same.

Subshells can have a different PWD and do not affect the parent shell.  This can be useful to temporarily change directory.

You could save the PWD and cd back to it after your change...
   
    OLDPWD=$PWD
    cd /var/log/$myprogname
    cat $myprogname.log
    cd $OLDPWD
    unset OLDPWD

   
This is neater with subshells

    ( cd /var/log/$myprogname; cat $myprogname.log )

Handy for executing programs that are dependent on the PWD.

    ( cd src/c; make )


Or running a series of commands in a different directory
   
    (
        cd /tmp
        touch foo
        touch baa
        touch quxx
    )


Now what?

That was not really too much to digest, was it? I get called over to help with tricky bash issues quite a lot, and 9 times out of 10 the problem is caused by one of these issues.  Hence writing this doc.

Once you know the basics, I highly recommend you read man bash through once.  It is dull, but you will find a ton of tricks in there.

For sane coding conventions I would recommend Google's shell style guide.
Google comes down hard on the tabs vs spaces issue. It makes sense, because you really want to be able to copy-paste, and tabs introduce a serious risk if you do that.  You really should avoid tabs in new code.

For sane design guide, KISS cannot be overstated.




Saturday 13 January 2018

Migrating from Perforce to Bazaar


I've been a big fan of Perforce for ages.

Recently Perforce changed the licensing from 20 free users to 5. This broke my backups.  Time to switch to something FOSS.
I decided to give Bazaar a crack; on paper it ticks all the boxes.

Below is how I exported from Perforce (with history) into a new Bazaar repo and set up triggers/hooks to run builds after code is submitted.

Export repos from Perforce


Due to the nature of my repo layout, I need to export each of my projects individually.  This gives me the chance to migrate one project at a time.

It took a while to decipher the docs for exporting from P4 to git fast-import format, it's a bit broken, but the commands below enabled an export and import including change history.

sudo aptitude install bzr
sudo aptitude install perforce-p4python python-fastimport
mkdir -p ~/.bazaar/plugins
cd ~/.bazaar/plugins
bzr branch lp:bzrp4
bzr branch lp:bzr-fastimport
# rename the repo
mv bzr-fastimport fastimport


Then you need to set your normal P4_xx environment and p4 login
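The usual client settings look something like this (all values here are placeholders for your own):

```shell
export P4PORT=perforce.example.com:1666   # your Perforce server
export P4USER=me                          # your Perforce user name
export P4CLIENT=my-workspace              # your client/workspace name

# then authenticate:
# p4 login
```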

N.B. the documented bzr fast-export-from-p4 does not seem to work.

With that setup, the following got me a local bzr repo (cairoruler is a C project I decided to test with)

~/.bazaar/plugins/bzrp4/git_p4.py export //depot/cairoruler@all > cairoruler.fi
bzr fast-import cairoruler.fi cairoruler.bzr


This generates a directory ./cairoruler.bzr/p4/master.remote.  Careful: ./cairoruler.bzr looks like a repo (it has .bzr), but it's ./cairoruler.bzr/p4/master.remote that has the newly imported workspace.

The p4 export code is based on the git export tool, hence the odd name git_p4.py.

So far this is only working on my local HD.

Setting up the bzr server

I created a full OS in a Linux container; any Linux VM or baremetal server will do.
LXC gives me a playground like a VM, but faster to build.

lxc-create -t ubuntu -n bzf -f lxc-config.conf

Then install bzr in the container with just

apt-get install bzr

In theory that's all you need to do to use bzr over SSH.

There are other server mechanisms but, be warned, bzr default protocols are not secure.
bzr serve --directory=/srv/bzr/repo
should not be exposed on the Internet.  You need to use SSH or do your own SSH tunneling.

Since I'm running a LXC container with ssh used for other stuff I have to NAT a port (9922) on my LXC Host to port 22 on the LXC Guest.

iptables -t nat -A PREROUTING -p tcp -i eth0 -d $PUBLIC_IP_ADDRESS --dport 9922 -j DNAT --to $BZR_IP_ADDRESS:22
iptables -A FORWARD -p tcp -d $BZR_IP_ADDRESS --dport 22 -j ACCEPT


Thus bzr URLs look like this.

bzr co bzr+ssh://me@myhost:9922/home/bzr/myrepo myrepo


Uploading to the server

Bazaar uses normal Linux user auth, so the first thing you'll want to do is create users and groups for uploads and share SSH keys.
This does not give you any fine grained read/write control over each project; too bad.

There is no special command for setting a default remote location in bzr, just push

bzr push bzr+ssh://me@myhost:9922/home/bzr/myrepo/cairoruler

Porting Triggers to Hooks

Perforce triggers are one-liners that run a command when you submit code.

Bzr has no triggers but it has Hooks.  It's all pretty hacky I'm afraid.

You have to write hooks as Python libraries and install them in your bzr deployment.  N.B. they are not part of backup/restore.

Most hooks run on the client.
In order to write a hook that runs on the server to build code when it's submitted, you have to write a single global post_change_branch_tip(). This is the same hook for all branches and submits, so the first thing the code has to do is work out which branch has changed.
When the hook is called it is passed a ChangeBranchTipParams parameter, which is very poorly documented.
To debug you can't use Python's print; if you do you mess up the bzr protocol, so you need to write debug output to a file.

After a couple of hours of hacking I got this

/usr/lib/python2.7/dist-packages/bzrlib/plugins/linci_hook.py



"""A linci hook.
"""

from bzrlib import branch
import subprocess
import datetime

#
# params == bzrlib.branch.ChangeBranchTipParams
#   params.branch    The branch being changed. An object of type bzrlib.branch.BzrBranch7
#   params.branch.base is a "URL" filtered-139959297573392:///var/bzr/root/cairoruler/ from which we can extract the url
#   params.old_revno    Revision number before the change.
#   params.new_revno    Revision number after the change.
#   params.old_revid    Tip revision id before the change.
#   params.new_revid    revision id after the change
#
def post_change_branch_tip(params):
    #
    # post_change_branch_tip is called under various circumstances, only fire if the rev no has changed.
    #
    if params.old_revno < params.new_revno:
        file = open("/var/log/bzr-comitlog", "a")
        file.write("%s %s\n" % (datetime.datetime.now(), params.branch.base))
        path = params.branch.base.split('/')
        branch_name = path[len(path) - 1]
        if not branch_name:
            branch_name = path[len(path) - 2]
        if not branch_name:
            file.close()
            return
        if branch_name in ['cairoruler']:
            file.write("%s executing: ci.sh %s\n" % (datetime.datetime.now(), branch_name))
            file.close()
            subprocess.call('/home/bzr/bin/ci.sh ' + branch_name, shell=True)
        else:
            file.close()

#
# register hook, 3rd parameter is what is shown by `bzr hooks`
#
branch.Branch.hooks.install_named_hook('post_change_branch_tip', post_change_branch_tip, 'linci post change hook')


Install this with

python /usr/lib/python2.7/dist-packages/bzrlib/plugins/linci_hook.py

Remember the script runs with the Linux permissions of the user that did the submit.

So now

bzr push


Sends the latest changes, and my linci server builds them and publishes a .deb as it did with Perforce.

Now I need a Sunday without a hangover to port the rest of my projects.  It should be a scriptable process.


References:
https://launchpad.net/bzr-fastimport

http://doc.bazaar.canonical.com/bzr-0.12/server.htm
https://launchpad.net/bzrp4
http://people.canonical.com/~mwh/bzrlibapi/bzrlib.branch.html

Plus some help from folks on #bzr channel on irc.freenode.net