Brother HL-2270DW on Linux

The Brother HL-2270DW is one of the best budget laser printers around, with third-party consumables readily available on eBay.

The printer has a network port on the back of it, which is great news for GNU/Linux users, because networked printers tend to speak standard protocols.

The fastest way to get this printer working without Windows is to plug it into the network and log in to its web interface. The default settings have DHCP enabled. Here are a few pieces of cryptic voodoo magic that helped me:

  1. To reset the print server settings, hold “GO” on startup, then let go and press it 6 times.
  2. The default admin login is admin / access
  3. On my printer (Firmware 1.10), printing network settings by holding “GO” for 10 seconds caused the network card (Brother NC-7800w) activity lights to go off, requiring a reboot.

On Debian, there is no “Brother HL-2270DW” CUPS driver in the list, but I found the following driver to work fine (and allow duplex):

Brother HL-2170W Foomatic/hpijs-pcl5e (recommended)

To use this driver, you will need to install the HP Linux Printing and Imaging printer driver:

apt-get install printer-driver-hpijs

Crash course in handling web traffic spikes

So yesterday, there was a small earthquake in Melbourne. Within a few minutes, the Geoscience Australia web-page was delivering 503 Errors due to the load.

Why websites crash

Websites stop working under heavy load because the server doesn’t have enough resources to process everything. This is usually one of:

  1. Bandwidth saturation, indicated by timeouts and super slow load times.
  2. Webserver or cache overload, indicated by refused connections or server (5XX) errors.
  3. Database server overload, indicated by server errors.

A simple database-driven website might process a request like this:

Simple web-site setup

Set up a squid cache

For sites like the example above, a good cache setup is essential. This is another server (or server process), which serves pages that aren’t changing. A page only needs to be generated as often as it changes:

A webserver behind a cache to reduce load

squid is a good open source starting point if you are administering a server which struggles under load.
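To make the "only generate a page as often as it changes" idea concrete, here is a toy sketch in shell. It is not how squid decides what is fresh (squid works on HTTP responses and their caching headers). The file names, the render_page stand-in and the checksum comparison are all my own assumptions for illustration.

```shell
#!/bin/sh
# Toy page cache: the "expensive" render runs only when the source changes.
# All names here are hypothetical; a real cache works on HTTP responses.
workdir=$(mktemp -d)
source_file="$workdir/page.src"
cached_file="$workdir/page.html"
cached_sum=""
render_count=0

render_page() {
    # Stand-in for the slow, database-driven page generation.
    render_count=$(( render_count + 1 ))
    sed 's/.*/<p>&<\/p>/' "$source_file" > "$cached_file"
}

serve_page() {
    # Re-render only if the source checksum differs from the cached one.
    current_sum=$(cksum < "$source_file")
    if [ "$current_sum" != "$cached_sum" ]; then
        render_page
        cached_sum=$current_sum
    fi
    cat "$cached_file"
}

echo "First version of the page" > "$source_file"
serve_page > /dev/null    # first request: rendered
serve_page > /dev/null    # served from the cached copy
serve_page > /dev/null    # served from the cached copy
echo "Second version of the page" > "$source_file"
serve_page > /dev/null    # content changed: rendered again
echo "4 requests, $render_count renders"
```

Four requests cost two renders here. Under a traffic spike the ratio is far more lopsided, which is the whole point of putting a cache in front.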

Round-robin DNS

Without running a hardware load-balancer (read: spending money), you can have clients connect to different servers by using round-robin DNS.

Each time a DNS lookup is issued, a different address can be returned, allowing you to have several caches at work.

Example of Round-Robin DNS
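The rotation itself is easy to picture with a toy simulation. This is only a sketch of the idea, not a DNS server: the three addresses are placeholders from the 192.0.2.0/24 documentation range, and next_server is a made-up helper.

```shell
#!/bin/sh
# Toy model of round-robin answers: each lookup gets the next address in turn.
# The addresses are documentation-range placeholders, not real caches.
servers="192.0.2.10 192.0.2.11 192.0.2.12"
lookup_count=0
chosen=""

next_server() {
    # Split the list into positional parameters, local to this function.
    set -- $servers
    # Skip as many entries as (lookups so far) modulo (number of servers).
    shift $(( lookup_count % $# ))
    chosen=$1
    lookup_count=$(( lookup_count + 1 ))
}

answers=""
for request in 1 2 3 4; do
    next_server
    answers="$answers$chosen "
    echo "lookup $request -> $chosen"
done
```

The fourth lookup wraps around to the first address again, so the load ends up spread roughly evenly across the caches.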

Missile Launcher on Raspberry Pi

This post covers a few setups to experiment with if you have a DreamCheeky USB missile launcher and a Raspberry Pi.

A newer version is being sold on ThinkGeek, but the one I used was:

DreamCheeky USB Missile Launcher

Setup 1: Direct to PC

The launcher comes with some software to let you connect it straight to a computer. Of course, USB can only go 5 metres, which is not much fun for cubicle warfare:

USB Missile Launcher setup with PC

I included this setup because it is the easiest way for Debian/Ubuntu users to test that they can use this driver, which is needed for the other setups.

Setup 2: Networked with Raspberry Pi

So for this setup, you need a Raspberry Pi Model B. They look like this:

Raspberry Pi Model B

With the Pi running Raspbian, upgrade to Debian Jessie, then install the build dependencies and compile the code:

apt-get install git libusb-1.0-0-dev libncurses-dev gcc g++
git clone --recursive https://github.com/mike42/missile
cd missile
make

You can then place the pi anywhere with network and power:

USB Missile launcher setup with Raspberry Pi

To operate the launcher remotely, use SSH to log in, and run missile/bin/keyboard-ctl.

Setup 3: Wireless with Raspberry Pi and Battery

Of course, network and power can be provided with a power bank and wifi adapter:

Power Bank for mobile phone
USB WiFi Adapter

The wifi adapter will take some work to set up (see Debian Wiki), so I won’t document that here. You will need a power bank that has enough power for the Raspberry Pi with launcher and wifi. Mine had to be close to fully charged to work.

An obligatory diagram of this setup:

USB Missile launcher setup with Raspberry Pi (WiFi and Battery)

Wrap-up

The reason this helps with cubicle warfare is simple: The launcher, Raspberry Pi and battery can be fitted into a tissue box or other small space. Proof:

Box interior

On a desk you would see this as:

Box exterior
Box open

And a quick demo for completeness:

In the above, the Pi is connected to DC power, because the battery didn’t have enough juice to power the unit.

Enabling graphical boot on Debian GNU/Linux

Unlike most desktop Linuxes around today, Debian’s default boot screen is still text:

Debian's text-mode booting

I imagine that this is because there is no distinction between “Desktop” and “Server” editions in the Debian world (see tasksel), so a text-mode boot will work on every type of installation.

Luckily, if you want a graphical boot screen, you can simply apt-get install a package called plymouth and configure it according to these instructions.

The result looks more suited to a desktop PC (screen capture from here):

Debian's plymouth boot screen

Plymouth install notes

There is a comment in /etc/default/grub which suggests checking supported graphics modes, which is a Good Idea(TM):

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480
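As far as I can tell, the relevant part of /etc/default/grub ends up looking something like the fragment below. Treat it as a sketch: the 1024x768 value is only a placeholder for whichever mode your card reports, and your existing kernel command line may carry other options that you should keep.

```shell
# /etc/default/grub (fragment; this file uses shell variable syntax)

# Assumption: 1024x768 is a placeholder, pick a mode that vbeinfo lists.
GRUB_GFXMODE=1024x768

# "splash" asks for the graphical boot screen, "quiet" hides most kernel messages.
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
```

After editing the file, run update-grub as root so that the change reaches the generated boot configuration.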

The theme for “wheezy” was called Joy, so if you have desktop-base installed, you should run:

/usr/sbin/plymouth-set-default-theme joy

I tried to get this working in a virtual machine to get an actual screen capture, but on KVM this appears to be quite tricky, due to emulated graphics.

How to graph ASX data with gnuplot

This post is not about economics; it’s about scripting. People who follow stocks love to see historic prices. Here I’ll show you how to get historic ASX data and do a simple plot with the wonderful open-source tool gnuplot.

Getting the data

I couldn’t figure out who runs it, but this site offers .zip files containing basic daily data, updated each weekend. The archives have CSV files in them:

ASX source data

To make these useful, I joined them together and imported them into sqlite. On Debian this is in the sqlite3 package.

To turn the .zip files into a sqlite file:

  1. Download the files for the time period you need, and put them in a folder called “data”.
  2. Save the script below as “import.sh” and run it.
#!/bin/sh

# Unzip all the data files and leave the text files in the "txt" folder.
rm -f asx-historic.db
rm -Rf txt
mkdir -p txt
for i in data/*;
do
	echo -n "Extracting $i .. "
	unzip -q "$i" -d txt
	echo "done"
done
mv txt/*/*.txt txt/
find . -empty -delete

# Combine the text files
echo -n "Combining files .. "
cat txt/*.txt > txt/asx-historic.csv
echo "done"

# Import the text files into an sqlite db
echo -n "Creating database .. "
sqlite3 asx-historic.db -batch <<EOF
create table price (code CHAR(3), date DATE, open DECIMAL(10,3), close DECIMAL(10,3), low DECIMAL(10,3), high DECIMAL(10,3), vol);
.separator ,
.import txt/asx-historic.csv price
EOF
echo "done"

After running import.sh, the data is in a file called “asx-historic.db”. When new data comes out, add the new .zip files to the “data” folder and re-run the script; it rebuilds the database from scratch each time.

Querying a sqlite database

That file is a database, so you can query it with SQL like so:

mike@mikebox$ sqlite3 asx-historic.db
SQLite version 3.8.1 2013-10-17 12:57:35
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select date, close from price where code='ASX' order by date;
20130603|37.68
20130604|37.1
20130605|36.64
20130606|36.4
...

Graphing closing prices

Line graphs in gnuplot are very simple. Save this file as line.gnuplot:

set terminal pdf
set output fout
set key left
plot fin using 2 w lines title code

Note: “fout” (file out), “fin” (file in) and “code” are variables, which are set on the gnuplot command line.

This bash script, saved as line.sh, lists closing prices for a code, saves them to a .dat file under a folder called “plot”, and then plots them:

#!/bin/bash
mkdir -p plot
# $'\t' is bash syntax for a literal tab character
sqlite3 -separator $'\t' asx-historic.db "select date, close from price where code='$1' order by date;" > "plot/$1.dat"
gnuplot -e "code='$1'" -e "fin='plot/$1.dat'" -e "fout='plot/$1.pdf'" line.gnuplot

An example usage would be:

./line.sh CSL

Which (given a few months of data) looked like this:

ASX chart example

File list

If you follow this from start-to-finish, then you should have the following files:

  • data/
    • (Lots of zip files)
  • plot/
    • CSL.dat
    • CSL.pdf
  • txt/
    • (Lots of text files)
  • asx-historic.db
  • import.sh
  • line.sh
  • line.gnuplot

Writing in Ancient Egyptian with HieroTeX

When I started learning Ancient Egyptian, I wanted to be able to type hieroglyphs alongside regular text, for printing translations. There is a package for the typesetting system LaTeX which does this, called “HieroTeX”. It took me a while to figure out how to use it, but the results are top-notch:

Example of HieroTex output

Because I’ve installed this on quite a few computers, I’m writing up this blog post to make it easier for other GNU/Linux users who are trying to figure it out.

Installation

This is tricky, because:

  • There is no Debian package! Uh oh.
  • Debian is phasing out tetex in favour of texlive
  • The variable.mk file needs to be edited for the install to work (diff to apply / how to apply it). This is because the default installation target is the user’s home directory.

I put together this script, hierotex-install-3.5.sh, which will get a working HieroTeX install on any recent version of Debian.

#!/bin/sh
# Script to download and install HieroTeX on a Debian computer.
# Use at your own risk.
#
# Some packages you should install first:
# 	apt-get install texlive make gcc

# Get and extract the files
wget -c "http://webperso.iut.univ-paris8.fr/~rosmord/archives/HieroTeX-3.5.tgz" && 
tar xvzf "HieroTeX-3.5.tgz" && 
cd HieroTeX && 
wget -c "http://webperso.iut.univ-paris8.fr/~rosmord/archives/HieroType1-3.1.4.tgz" && 
tar xvzf "HieroType1-3.1.4.tgz"

# Patch variable.mk to install for the whole system
wget http://mike42.me/blog/files/variable.patch && 
patch variable.mk < variable.patch

# Run the installer
sudo make tetex-install

Note: This page is great, but the variable.mk suggested for Debian/Ubuntu does not include the documentation folder, which will cause the installer to crash. It also suggests using tetex, which will not exist in future Debian releases! This is probably fine if you are on a .rpm-flavoured distro.

How to use

Firstly, you will need to know a little bit about the LaTeX typesetting system. See wikibooks.

HieroTeX accepts markup in Manuel de Codage format, which you will either need to learn, or get a tool which helps you mark up text in it. This Linux for Egyptologists page has some excellent suggestions.

The block of LaTeX code below is from my tex-examples repo, and was used to generate the image of Tutankhamun’s cartouche above.

\documentclass[a4paper]{article}
\usepackage{hiero}
\usepackage{egypto}
\begin{document}
	\section*{Egyptian hieroglyph example}

	\begin{hieroglyph}zA ra < i-mn:n-t-G43-t-S34 HqA-iwn-Sma >\end{hieroglyph} \\
	{\em Tutankhamun Hekaiunushema} \\
	Living Image of Amun, ruler of Upper Heliopolis
\end{document}

To build the file, you need to filter it through the sesh command. Something like this would work:

cat hierotex-example.tex | sesh > hierotex-example-2.tex
latex hierotex-example-2.tex

The actual example uses a Makefile to do this.

Update May 2016: The original website for HieroTeX has gone offline, but is available via the Internet Archive: webperso.iut.univ-paris8.fr/~rosmord/archives/

Transforming between SQL dialects

I recently found myself Googling for some data voodoo. I have a web app which I want to work with locally, and the RDBMS requirements are a little bit incompatible.

Unfortunately, neither this impressive sed script nor this eloquent mix of sed, ruby and perl could do this with <100 syntax errors in the output, so I had to get creative. Here is what I have learned:

  • Simplify the problem. I decided to convert the structure to SQLite manually, as it is not likely to change. The parts you will need to convert often (thousands of INSERT statements) are the parts which are more important to have a script for. The extra mysqldump options for getting the data only, without nasty `backticks` were:
    --compatible=ansi --skip-extended-insert --compact --no-create-info
  • Use sed to fix the escaping. MySQL escapes single quotes as \' and double quotes as \", but SQLite expects '' and a plain ". This one-liner made the conversion:
    sed -e "s/\\\\'/''/g" -e 's/\\"/"/g' db.sql > db.sqlite

The resulting file could have the structure cat’d on to the start and imported into SQLite.
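To check the escaping fix on a single line before trusting it with a whole dump, something like this sketch works. The INSERT statement is a made-up sample, and convert_escapes is just a name I picked for the two sed expressions, which look for a literal backslash followed by a quote.

```shell
#!/bin/sh
# Rewrite MySQL-style backslash escapes into the form SQLite expects.
convert_escapes() {
    # 1st expression: backslash + single quote -> two single quotes
    # 2nd expression: backslash + double quote -> plain double quote
    sed -e "s/\\\\'/''/g" -e 's/\\"/"/g'
}

# Hypothetical line, shaped like mysqldump output with the options above.
sample=$(cat <<'EOF'
INSERT INTO "note" VALUES (1,'it\'s a \"quoted\" word');
EOF
)

converted=$(printf '%s\n' "$sample" | convert_escapes)
printf '%s\n' "$converted"
```

The identifier quotes around "note" are left alone, because they have no backslash in front of them.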

How to liberate your myki data

myki logo

myki is the public transport ticketing system in Melbourne. If you register your myki, you can view the usage history online. Unfortunately, you are limited to paging through HTML, or downloading a PDF.

This post will show you how to get your myki history into a CSV file on a GNU/Linux computer, so that you can analyse it with your favourite spreadsheet/database program.

Get your data as PDFs

Firstly, you need to register your myki, log in, and export your history. The web interface seemed to give you the right data if you chose blocks of 1 month.

Export myki data for each month

Once you do this, organise these into a folder filled with statements.

A folder filled with myki statements

You need the pdftotext utility to continue. In Debian, this is in the poppler-utils package.

The manual steps below run you through how to extract the data, and at the bottom of the screen there are some scripts I’ve put together to do this automatically.

Manual steps to extract your data

These steps are basically a crash course in "scraping" PDF files.

To convert all of the PDFs to text, run:

for i in *.pdf; do pdftotext -layout -nopgbrk "$i"; done

This preserves the line-based layout. The next step is to filter out the lines which don’t contain data. Each line we’re interested in begins with a date and time, followed by the words “Touch On”, “Touch Off”, or “Top Up”:

18/08/2013 13:41:20   T...

We can filter all of the text files using grep, and a regex to match this:

cat *.txt | grep "^[0-3][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9] *T"

The output looks like:
Filtered output, showing data
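If you want to try the pattern without a real statement to hand, a sketch like this does the job. The sample text is invented, and only loosely shaped like what pdftotext produced for me.

```shell
#!/bin/sh
# Same regex as above: a date, a time, some spaces, then a word starting with T.
pattern='^[0-3][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9] *T'

# Hypothetical extract: two data lines surrounded by page furniture.
sample_text=$(cat <<'EOF'
Transaction history
18/08/2013 13:41:20   Touch On     Train   1/2
18/08/2013 14:05:02   Touch Off    Train   1/2
Page 1 of 3
EOF
)

# Print the matching lines, then count them.
printf '%s\n' "$sample_text" | grep "$pattern"
match_count=$(printf '%s\n' "$sample_text" | grep -c "$pattern")
echo "$match_count data lines kept"
```

Only the two lines that start with a date and time survive; the heading and the page footer are dropped.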

So what are we looking at?

  1. One row per line
  2. Fields delimited by multiple spaces

To collapse every double-space into a tab, we use unexpand. Then, to collapse duplicate tabs, we use tr:

cat filtered-data.txt | unexpand -t 2 | tr -s '\t'
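Here is the same pipeline run on one invented line, counting the tabs that come out the other end. It assumes the unexpand from GNU coreutils, where -t also converts blanks in the middle of a line.

```shell
#!/bin/sh
# Hypothetical line, shaped like the filtered pdftotext output.
line='18/08/2013 13:41:20   Touch On     Train   1/2'

# Runs of spaces become tabs; repeated tabs are then squeezed into one.
tabbed=$(printf '%s\n' "$line" | unexpand -t 2 | tr -s '\t')

# Count the tabs that are left: one per gap between fields.
tab_count=$(printf '%s' "$tabbed" | tr -cd '\t' | wc -c)
echo "$tab_count tabs, so $(( tab_count + 1 )) fields"
```

Single spaces, like the one between the date and the time or inside “Touch On”, are left alone, so those stay together as one field. Some fields can keep a stray leading space, which is why the next step trims them.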

Finally, some fields need to be quoted, and tabs need to be converted to CSV. The PHP script below will do that step.

Scripts to get your data

myki2csv.sh is a script which performs the above manual steps:

#!/bin/bash
# Convert myki history from PDF to CSV
#	(c) Michael Billington < michael.billington@gmail.com >
#	MIT Licence
hash pdftotext || exit 1
hash unexpand || exit 1
pdftotext -layout -nopgbrk "$1" - | \
	grep "^[0-3][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9] *T" | \
	unexpand -t2 | \
	tr -s '\t' | \
	./tab2csv.php > "${1%.pdf}.csv"

tab2csv.php is called at the end of the above script, to turn the result into a well-formed CSV file:

#!/usr/bin/env php
<?php
/* Generate well-formed CSV from dodgy tab-delimited data
	(c) Michael Billington < michael.billington@gmail.com >
	MIT Licence */
$in = fopen("php://stdin", "r");
$out = fopen("php://stdout", "w");
while($line = fgets($in)) {
	$a = explode("\t", $line);
	foreach($a as $key => $value) {
		$a[$key]=trim($value);
		/* Quote out ",", and escape "" */
		if(!(strpos($value, "\"") === false &&
				strpos($value, ",") === false)) {
			$a[$key] = "\"".str_replace("\"", "\"\"", $a[$key])."\"";
		}
	}
	$line = implode(",", $a) . "\r\n";
	fwrite($out, $line);
}
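
For anyone without PHP installed, roughly the same quoting rule can be sketched with awk from a shell script. This is not the script I used; tab2csv here is a hypothetical shell function, and the sample row is made up.

```shell
#!/bin/sh
# Sketch of the tab-to-CSV step in awk: trim each field, and quote any
# field containing a comma or a double quote (doubling the quotes inside).
tab2csv() {
    awk -F '\t' '{
        line = ""
        for (i = 1; i <= NF; i++) {
            field = $i
            gsub(/^[ \t\r]+|[ \t\r]+$/, "", field)   # trim whitespace
            if (field ~ /[",]/) {
                gsub(/"/, "\"\"", field)             # escape " as ""
                field = "\"" field "\""
            }
            line = line (i > 1 ? "," : "") field
        }
        printf "%s\r\n", line                        # CRLF line endings
    }'
}

# Hypothetical row with a comma and quotes in the last field.
csv=$(printf '18/08/2013 13:41:20\tTouch On\tTrain\t1/2\tStation, Platform "1"\n' | tab2csv)
printf '%s\n' "$csv"
```

Only the last field gets wrapped in quotes, since it is the only one containing a comma or a double quote.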

Invocation

Call script on a single foo.pdf to get foo.csv:

./myki2csv.sh foo.pdf

Convert all PDFs to CSV and then join them:

for i in *.pdf; do ./myki2csv.sh "$i"; done
tac *.csv > my-myki-data.csv

Importing into LibreOffice

The first field must be marked as a DD/MM/YYYY date, and the “zones” need to be marked as text (so that “1/2” isn’t treated as a fraction!).

These are my import settings:

Options to import the myki data into LibreOffice

Happy data analysis!

Update 2013-09-18: The -nopgbrk option was added to the above instructions, to prevent page break characters causing grep to skip one valid line per page

Update 2014-05-04: The code for the above, as well as this follow-up post are now available on github.

QJoyPad coolness

I got a USB SNES-controller imitation from the Internet a while back for controlling a missile launcher, and recently decided to re-purpose it for controlling a GNU/Linux computer. After all, VLC is great, but plugging in a keyboard is not so great!

The gamepad is apparently a USB joystick in disguise. From lsusb:

Bus 003 Device 004: ID 0079:0011 DragonRise Inc. Gamepad

The only packaged program for doing this in Debian was joy2key. It was too cryptic for me to figure out in <5 minutes, so I tossed it. Google turned up xjoypad, jkeys and jscal as suggestions, but QJoyPad looked the most promising, and is as simple as a program should be.

To compile it, you need the Qt development libraries, and an X library called libxtst-dev.

The profile in the screenshot (called “VLC”) controls the mouse, pauses, adjusts volume, and toggles fullscreen. It works well enough for media and web browsing, as long as you don’t need to type anything!

Bugs noticed:

  • Can’t set a button to do Ctrl+<key>, only the key on its own.

Making an XKCD-style password generator in C++

I’m learning C++ at the moment, and I don’t find long tutorials or studying the standard template library particularly fun.

Making this type of password-generator is not new, but it is a nice practical exercise to start out in any language.

1. Get a list of common English words

Googling “common English words” yielded this list, purporting to contain 5,000 words. Unfortunately it contains almost 1,000 duplicates and numerous non-words! Wiktionary has a much higher-quality list of words compiled from Project Gutenberg, but the markup looks a bit like this:

==== 1 - 1000 ====
===== 1 - 100 =====
[[the]] = 56271872
[[of]] = 33950064
[[and]] = 29944184
[[to]] = 25956096
[[in]] = 17420636
[[I]] = 11764797  

Noting the wikilinks surrounding each word, I put together this PHP script to extract the link destinations and called it get-wikilinks.php:

#!/usr/bin/php
<?php
/* Return list of wikilinked words from input text */
$text = explode("[[", file_get_contents("php://stdin"));
foreach($text as $link) {
	$rbrace = strpos($link, "]]");
	if(!$rbrace === false) {
		/* Also escape on [[foo|bar]] links */
		$pipe = strpos($link, "|");
		if(!$pipe === false && $pipe < $rbrace) {
			$rbrace = $pipe;
		}
		$word = trim(substr($link, 0, $rbrace))."\n";
		if(strpos($word, "'") === false && !is_numeric(substr($word, 0, 1))) {
			/* Leave out words with apostrophes or starting with numbers */
			echo $word;
		}
	}
}

The output of this script is much more workable:

$ chmod +x get-wikilinks.php
$ cat wikt.txt | ./get-wikilinks.php
the
of
and
to
in
I

Using sort and uniq makes a top-notch list of common words, ready for an app to digest:

$ cat wikt.txt | ./get-wikilinks.php | sort | uniq > wordlist.txt
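
The sort is doing real work there: uniq only collapses duplicates that sit on adjacent lines. A small sketch with a made-up word list shows the difference, along with the sort -u shortcut.

```shell
#!/bin/sh
# Made-up list with repeats that are never next to each other.
words=$(printf '%s\n' the of and the to of the)

# uniq on its own only removes adjacent duplicates, so nothing is dropped.
adjacent_only=$(printf '%s\n' "$words" | uniq | wc -l)

# Sorting first puts the duplicates side by side.
deduplicated=$(printf '%s\n' "$words" | sort | uniq | wc -l)

# sort -u does both steps in one command.
with_flag=$(printf '%s\n' "$words" | sort -u | wc -l)

echo "uniq alone:" $adjacent_only "sort | uniq:" $deduplicated "sort -u:" $with_flag
```

Seven lines go in; uniq alone leaves all seven, while both sorted variants leave the four distinct words.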

2. Write some C++

There are two problems being solved here:

  • Reading a file into memory
    • An ifstream is used to access the file, and getline() will return false when EOF has been reached
    • Each line is loaded into a vector (roughly the same type of container as an ArrayList in Java), which is resized dynamically and accessed like an array.
  • Choosing random numbers
    • These are seeded from a random_device, being more cross-platform than reading from a file like /dev/urandom.
    • Note that the <random> header is new in C++11.
pw.cpp
#include <fstream>
#include <vector>
#include <string>
#include <iostream>
#include <random>
#include <cstdlib>

using namespace std;

int main(int argc, char* argv[]) {
    const char* fname = "wordlist.txt";

    /* Parse command-line arguments */
    int max = 1;
    if(argc == 2) {
        max = atoi(argv[1]);
    }

    /* Open word list file */
    ifstream input;
    input.open(fname);
    if(input.fail()) {
        cerr << "ERROR: Failed to open " << fname << endl;
        return EXIT_FAILURE;
    }

    /* Read to end and load words */
    vector<string> wordList;
    string line;
    while(getline(input, line)) {
        wordList.push_back(line);
    }

    /* Seed from random device */
    random_device rd;
    default_random_engine gen;
    gen.seed(rd());
    uniform_int_distribution<int> dist(0, wordList.size() - 1);

    /* Output as many passwords as required */
    const int pwLen = 4;
    int i, j;
    for(i = 0; i < max; i++) {
        for(j = 0; j < pwLen; j++) {
            cout << wordList[dist(gen)] << ((j != pwLen - 1) ? " " : "");
        }
        cout << endl;
    }

    return 0;
}
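
As a sanity check on the C++ version, roughly the same selection can be sketched in shell. This assumes a reasonably recent GNU coreutils (for shuf -r), and the tiny word list is a stand-in for wordlist.txt; make_password is a name I made up.

```shell
#!/bin/sh
# Stand-in word list; the real one would be wordlist.txt from step 1.
wordlist=$(mktemp)
printf '%s\n' correct horse battery staple purple monkey dishwasher > "$wordlist"

make_password() {
    # -r samples with replacement, like the C++ loop; -n 4 picks four words.
    # paste then joins the four lines into one space-separated line.
    shuf -r -n 4 --random-source=/dev/urandom "$1" | paste -s -d ' ' -
}

password=$(make_password "$wordlist")
echo "$password"
```

Unlike the C++ program, this reads /dev/urandom directly, so it is less portable, but it is handy for checking that the word list behaves.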

3. Compile

Lots of projects in compiled languages have a Makefile, so that you can compile them without having to type all the compiler options manually.

Makefiles are a bit heavy to learn properly, but for a project this tiny, something simple is fine:

default:
	g++ pw.cpp -o pw -std=c++11

clean:
	rm -f pw

Now we can compile and run the generator:

make
./pw

The output looks like this for ./pw 30 ("generate 30 passwords"):