I’m learning C++ at the moment, and I don’t find long tutorials or studying the standard template library particularly fun.
Writing this type of password generator is nothing new, but it makes a nice practical exercise when starting out in any language.
1. Get a list of common English words
Googling “common English words” yielded this list, purporting to contain 5,000 words. Unfortunately it contains almost 1,000 duplicates and numerous non-words! Wiktionary has a much higher-quality list of words compiled from Project Gutenberg, but the markup looks a bit like this:
==== 1 - 1000 ====
===== 1 - 100 =====
[[the]] = 56271872
[[of]] = 33950064
[[and]] = 29944184
[[to]] = 25956096
[[in]] = 17420636
[[I]] = 11764797
Noting the wikilinks surrounding each word, I put together this PHP script to extract the link destinations and called it get-wikilinks.php:
#!/usr/bin/php
<?php
/* Return list of wikilinked words from input text */
$text = explode("[[", file_get_contents("php://stdin"));
foreach($text as $link) {
    /* Only chunks containing a closing "]]" are links */
    $rbrace = strpos($link, "]]");
    if($rbrace !== false) {
        /* Also cut [[foo|bar]] links off at the pipe */
        $pipe = strpos($link, "|");
        if($pipe !== false && $pipe < $rbrace) {
            $rbrace = $pipe;
        }
        $word = trim(substr($link, 0, $rbrace));
        /* Leave out empty words, words with apostrophes or starting with numbers */
        if($word !== "" && strpos($word, "'") === false && !is_numeric(substr($word, 0, 1))) {
            echo $word."\n";
        }
    }
}
The output of this script is much more workable:
$ chmod +x get-wikilinks.php
$ cat wikt.txt | ./get-wikilinks.php
the
of
and
to
in
I
Using sort and uniq makes a top-notch list of common words, ready for an app to digest:
$ cat wikt.txt | ./get-wikilinks.php | sort | uniq > wordlist.txt
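Since the point of the exercise is C++ practice, the same extract, sort and de-duplicate steps could also be sketched in C++. This is a rough equivalent rather than the script used above: extractWikilinks is a hypothetical helper name, it does not trim whitespace the way the PHP version does, and it sorts in plain byte order, which may differ from what sort produces under your locale.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the PHP script): collect the targets of
// [[wikilinks]] in `text`, skip words that contain an apostrophe or start
// with a digit, then sort and de-duplicate them like `sort | uniq`.
std::vector<std::string> extractWikilinks(const std::string& text) {
    std::vector<std::string> words;
    std::size_t pos = 0;
    while ((pos = text.find("[[", pos)) != std::string::npos) {
        pos += 2;  // step past the opening brackets
        std::size_t end = text.find("]]", pos);
        if (end == std::string::npos) {
            break;  // unterminated link: nothing more to extract
        }
        std::string word = text.substr(pos, end - pos);
        // For [[target|label]] links keep only the target.
        std::size_t pipe = word.find('|');
        if (pipe != std::string::npos) {
            word.erase(pipe);
        }
        bool startsWithDigit =
            !word.empty() && std::isdigit(static_cast<unsigned char>(word[0]));
        if (!word.empty() && word.find('\'') == std::string::npos && !startsWithDigit) {
            words.push_back(word);
        }
        pos = end + 2;  // continue after the closing brackets
    }
    // Byte-order sort, then drop adjacent duplicates.
    std::sort(words.begin(), words.end());
    words.erase(std::unique(words.begin(), words.end()), words.end());
    return words;
}
```

Reading the whole of wikt.txt into a std::string and passing it to this function would give roughly the contents of wordlist.txt, one vector element per line.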
2. Write some C++
There are two problems being solved here:
- Reading a file into memory
  - An ifstream is used to access the file. getline() returns the stream itself, which evaluates to false once a read fails at EOF.
  - Each line is appended to a vector (roughly the same kind of container as an ArrayList in Java), which resizes dynamically and is accessed like an array.
- Choosing random numbers
  - The generator is seeded from a random_device, which is more portable than reading from a file like /dev/urandom.
  - Note that the <random> header is new in C++11.
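To look at the two ideas in isolation, here is a minimal sketch that reads from an in-memory stream instead of a file. The helper names readLines, pickWord and sampleWords are made up for illustration and do not appear in pw.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <random>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read every line of a stream into a vector.
// getline() returns the stream itself; in a boolean context the stream
// is false once a read fails, which is what ends the loop at EOF.
std::vector<std::string> readLines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}

// Hypothetical helper: pick one word uniformly at random.
// uniform_int_distribution is inclusive at both ends, hence size() - 1.
const std::string& pickWord(const std::vector<std::string>& words,
                            std::default_random_engine& gen) {
    assert(!words.empty());  // size() - 1 would wrap around on an empty list
    std::uniform_int_distribution<std::size_t> dist(0, words.size() - 1);
    return words[dist(gen)];
}

// Usage with an in-memory stream standing in for wordlist.txt.
std::vector<std::string> sampleWords() {
    std::istringstream in("the\nof\nand\n");
    return readLines(in);
}
```

Because readLines takes any std::istream, the same function works unchanged with an ifstream opened on wordlist.txt. A generator seeded with std::random_device, as in the full program, can be passed straight to pickWord.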
pw.cpp
#include <fstream>
#include <vector>
#include <string>
#include <iostream>
#include <random>
#include <cstdlib>
using namespace std;

int main(int argc, char* argv[]) {
    const char* fname = "wordlist.txt";

    /* Parse command-line arguments: optional number of passwords */
    int max = 1;
    if(argc == 2) {
        max = atoi(argv[1]);
    }

    /* Open word list file */
    ifstream input;
    input.open(fname);
    if(input.fail()) {
        cerr << "ERROR: Failed to open " << fname << endl;
        return EXIT_FAILURE;
    }

    /* Read to end and load words */
    vector<string> wordList;
    string line;
    while(getline(input, line)) {
        wordList.push_back(line);
    }
    if(wordList.empty()) {
        /* Stops wordList.size() - 1 from wrapping around below */
        cerr << "ERROR: No words found in " << fname << endl;
        return EXIT_FAILURE;
    }

    /* Seed from random device */
    random_device rd;
    default_random_engine gen;
    gen.seed(rd());
    /* Both bounds are inclusive, so the last valid index is size() - 1 */
    uniform_int_distribution<size_t> dist(0, wordList.size() - 1);

    /* Output as many passwords as required */
    const int pwLen = 4;
    for(int i = 0; i < max; i++) {
        for(int j = 0; j < pwLen; j++) {
            cout << wordList[dist(gen)] << ((j != pwLen - 1) ? " " : "");
        }
        cout << endl;
    }
    return 0;
}
3. Compile
Lots of projects in compiled languages have a Makefile, so that you can compile them without having to type all the compiler options manually.
Makefiles are a bit heavy to learn properly, but for a project this tiny, something simple is fine:
default:
	g++ pw.cpp -o pw -std=c++11
clean:
	rm -f pw
Now we can compile and run the generator:
make
./pw
The output looks like this for ./pw 30 ("generate 30 passwords"):