Introducing BigInteger

Most ordinary-scale programs don’t use very large numbers, but today I’ll show you how to use Java to do some more serious calculating, with the Fibonacci numbers as an example:

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, …

The numbers in this sequence always get larger, so you don’t need to be a software engineer to see that int and long wont be big enough. Just to highlight the problem, here is a naive approach to listing the Fibonacci numbers:

class FibonacciLong {
	static long solutions = 0;

	public static void main(String[] args) {
		long num1 = 0;
		long num2 = 1;
		long tmp;
		solution(num1);
		solution(num2);
		while(true) {
			tmp = num1 + num2;
			solution(tmp);
			num1 = num2;
			num2 = tmp;
		}
	}

	static void solution(long number) {
		solutions++;
		System.out.println(solutions + "t" + number);
	}
}

The output of this program overflows after the 93rd term. Java only uses signed integer types, so we end up with a negative number:

1	0
2	1
3	1
4	2
5	3
6	5
....
92	4660046610375530309
93	7540113804746346429
94	-6246583658587674878

To find larger values, we can use the BigInteger class, under java.math. They are constructed like this:

BigInteger num1 = new BigInteger("0");

They don’t really work like the primitive types. They are manipulated with methods like .add() and .subtract(), rather than the operators that we are accustomed to: + - * / %.

The below code has been written with BigInteger instead of long. It will return the nth term of the Fibonacci sequence, or enumerate solutions until the cows come home.

import java.math.BigInteger;

class FibonacciBigInteger {
	static long solutions = 0;
	static long find = 0;

	public static void main(String[] args) {
		if(args.length >= 1) {
			find = Long.parseLong(args[0]);
		}
		BigInteger num1 = new BigInteger("0");
		BigInteger num2 = new BigInteger("1");
		BigInteger tmp;
		solution(num1);
		solution(num2);

		while(true) {
			tmp = num1.add(num2);
			solution(tmp);
			num1 = num2;
			num2 = tmp;
		}
	}

	static void solution(BigInteger number) {
		solutions++;
		if(find == 0 || solutions == find) {
			System.out.println(number.toString());
			if(find != 0) { System.exit(0); }
		}
	}
}

The good news is that this program is now capable of working with numbers too large to say aloud. So today is a good day to start using Unix’s wc program to appreciate how big these numbers actually are:

mike~$ java FibonacciBigInteger 420000 | wc -m
87776

Subtract one character because there is an end-of-line in the output— and this means that the 420,000th Fibonacci number is 8775 digits long. We have discovered a new fact!

Note: There is still one variable in the program above that will overflow eventually, so you should also convert that to a BigInteger if you are hoping to calculate past about the nine quintillionth Fibonacci number (which you would be crazy to do with this method).

Finding prime numbers

It’s not so hard to calculate small-ish primes. The below method is called a prime sieve, and can be done manually:

Take two, the smallest prime number.
We know that no multiples of two are prime (4, 6, 8, 10, 12 etc), so we can cross them off our list and just print two.
Then we go to 3. 3 is prime. But 6, 9, 12 and 15 can’t be, so we print 3 and cross those multiples off.
4 has been marked as not prime, so we skip it.
5 is prime, so we cross off 10, 15, 20.
so on and so forth

Computers are much better at this than humans though, so I wrote this algorithm up in java. You can find all primes up to n fairly quickly (for small values of n).

class Primes {
	public static void main(String[] args) {
		int searchMax = 16777216;
		if(args.length == 1) {
			/* Allow over-riding the maximum */
			searchMax = Integer.parseInt(args[0]);
		}
		boolean[] isprime = new boolean[searchMax];

		int i, tmp, count = 0;

		/* Initialise */
		isprime[0] = isprime[1] = false;
		for(i = 2; i < searchMax; i++) {
			isprime[i] = true;
		}

		/* Main loop */
		for(i = 2; i < searchMax; i++) {
			if(isprime[i]) {
				System.out.println(i);
				count++;
				tmp = i * 2;
				while(tmp < searchMax) {
					isprime[tmp] = false;
					tmp += i;
				}
			}
		}
		System.err.println("Found " + count + " primes between 1 and " + searchMax + " non-inclusive");
	}
}

Of course, the limitation of this is that java isn’t terribly good at handling larger arrays, so I wouldn’t expect more than a few million primes from this code.

Update: The above code was tested for finding the first 7 million prime numbers.

Converting Numbers To Words in PHP

This is a straightforward coding task. I’m working on some maths code in PHP, and need a function to output “twenty-five” for 25, “fifteen” for 15, etc. A quick google search pulled up a neat little PEAR package which can do this.

The results weren’t as flash as I’d hoped though. We ended up with this:

894: eight hundred ninety-four

So it turns out that the PEAR class doesn’t print commas or the word ‘and’ in its numbers. We will be feeding our numbers to festival, and also using them for maths questions. Those pesky ands and commas are a must for this project, so this is not good enough:

9539: nine thousand five hundred thirty-nine

Instead, I need:

9539: nine thousand, five hundred and thirty-nine

I found some commented out code, and tried my own modifications, but it wasn’t working right, so I scrapped the PEAR class and started from scratch, using Wikipedia to populate the lists:

The results were perfect. It took about 300 lines to replace the class, and it handles ordinal numbers too. (‘1st’ = ‘first’, ‘100th = one hundredth’, etc). I ditched the currency feature.

To download the replacement class, click here.

It’s simple to use, just express ridiculous numbers or long decimals as strings to avoid errors. See this example for features:

<?php
include("Numbers_Words.php");
 // one
echo Numbers_Words::toWords(1); newline();

 // two
echo Numbers_Words::toWords(2); newline();

 // twenty-five
echo Numbers_Words::toWords(25); newline();

 // one thousand
echo Numbers_Words::toWords(1000); newline();

 // one thousand and one
echo Numbers_Words::toWords(1001); newline();

 // one hundred thousand and one
echo Numbers_Words::toWords(100001); newline();

 // one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine
echo Numbers_Words::toWords("123 456 789"); newline();

 // thirty-six point nine seven
echo Numbers_Words::toWords(36.97); newline();

 /* nine novemvigintillion, eight hundred and seventy-two octovigintillion, three hundred and fourty-eight septemvigintillion,
	nine hundred and seventy-two sesvigintillion, four hundred and ninety-two quinquavigintillion,
	three hundred and eighty-four quattuorvigintillion, nine hundred and two tresvigintillion,
	three hundred and eighty-four duovigintillion, two hundred and ninety unvigintillion, three hundred and eighty-four vigintillion,
	two hundred and ninety novemdecillion, three hundred and fourty-two octodecillion, five hundred and sixty-three septendecillion,
	four hundred and seventy-five sexdecillion, six hundred and thirty-four quindecillion, eight hundred and fifty-seven quattuordecillion,
	four hundred and fifty-seven tredecillion, three hundred and fourty-nine duodecillion, eight hundred and fifty-seven undecillion,
	two hundred and thirty-four decillion, five hundred and twenty-three nonillion, five hundred and thirty-four octillion,
	eight hundred and fifty-three septillion, two hundred and ninety sextillion, four hundred and seventy-eight quintillion,
	two hundred and ninety quadrillion, three hundred and fourty-seven trillion, two hundred and thirty-eight billion,
	nine hundred and fourty-six million, five hundred and thirty-eight thousand, four hundred and seventy-six */
echo Numbers_Words::toWords("9872348972492384902384290384290342563475634857457349857234523534853290478290347238946538476"); newline();

 // seventeenth
echo Numbers_Words::toWords("17th"); newline();

 // eight hundred and sixty-third
echo Numbers_Words::toWords("863rd"); newline();

 // negative seventy-eight point four
echo Numbers_Words::toWords("-78.4"); newline();

function newline() {
	echo "<br />n";
}
?>

Using correct strings makes synthetic voices much less annoying, and nobody can complain about bad maths questions. :)

Loading OEIS integer sequences

To show that interpreted languages can be fast when used well, I’m posting this example.

Take the On-line Encyclopedia of Integer Sequences database, which is a collection of integer sequences. That means lists of numbers. You can get a file from http://oeis.org/stripped.gz, which contains the first few numbers of each sequence. Today’s file extracts to about 38MB.

Now I need to do lookups in this file for a program I’m writing, and that program is in PHP. We want to know some sequences based on their A-number, like this:

$primes = oeis("A000040");
$fibonacci = oeis("A000045");

If we’re smart about it, then even a large file can be parsed in fractions of a second. Here’s how we do it:

/* This code needs an extract of the OEIS database to operate.
	I got it from http://oeis.org/stripped.gz
	Just extract that to this folder for lookups */
function oeis($number) {
	/* Return an array of values based on a sequence's OEIS number */
	$number = strtoupper($number);
	$fp = fopen("stripped", "r");
	while($ln = fgets($fp)) { /* Find this sequence and break the loop */
		if(substr($ln, 0, strlen($number) + 1) == $number." ") {
			$res = $ln;
			break;
		}
	}
	fclose($fp);
	/* Exit if we haven't got anything */
	if(!isset($res)) { return false; }
	$rv = explode(" ", $ln); /* Split lines into left and right of space */
	$ln = trim($rv[1]);
	$ln = substr($ln, 1, strlen($ln)-2); /* Slices off extra commas on sides */
	$rv = explode(",", $ln); /* Split by commas */
	return $rv;
}

Note that we don’t use explode() until after we have found the line we need, and also note that file_get_contents() is not used at all. (Multi-megabyte strings will bog you down in any language).