I have written previously about how to generate receipts for printers which understand ESC/POS. Today, I thought I would write about the opposite process.
Unlike PostScript, the ESC/POS binary language is not commonly understood by software. I wrote a few utilities last year to help change that, called escpos-tools.
In this post, I’ll step through an example ESC/POS binary file that an escpos-tools user sent to me, and show how we can turn it back into a usable format. The tools we are using are esc2text and escimages (both from escpos-tools), ImageMagick’s convert, and tesseract.
You might need this sort of process if you need to email a copy of your receipts, or to archive them for audit use.
Printing the file
Binary print files are generated by printer drivers. I can feed this one back to my printer like this:
cat receipt.bin > /dev/usb/lp0
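The same raw copy can be done from a script. This is a minimal sketch that mirrors the cat command above; the function name send_to_printer is hypothetical, and it assumes the device path is writable by the current user.

```python
import shutil


def send_to_printer(bin_path: str, device: str = "/dev/usb/lp0") -> None:
    """Copy the raw bytes of bin_path to the device, like `cat file > device`.

    Hypothetical helper: the default device path is the one used in this post
    and may differ on other systems.
    """
    # Binary mode on both sides, so the ESC/POS bytes are passed through untouched.
    with open(bin_path, "rb") as src, open(device, "wb") as dst:
        shutil.copyfileobj(src, dst)
```

Calling send_to_printer("receipt.bin") should behave like the shell command, as long as the printer is attached at that path.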
My Epson TM-T20 receipt printer understands ESC/POS, and prints this out:
Installing escpos-tools
escpos-tools is not packaged yet, so you need git and composer (from the PHP ecosystem) to use it.
$ git clone https://github.com/receipt-print-hq/escpos-tools
$ cd escpos-tools
$ composer install
Inspecting the file
The receipt contains text, so the first thing you should try is esc2text, which works like this:
$ php esc2text.php receipt.bin
In this case, I got no output, so I switched to -v to show the commands being found.
$ php esc2text.php receipt.bin -v
[DEBUG] SetRelativeVerticalPrintPositionCmd
[DEBUG] GraphicsDataCmd
[DEBUG] GraphicsDataCmd
[DEBUG] SetRelativeVerticalPrintPositionCmd
...
This indicates that there is no text being sent to the receipt, only images. We know from the print-out that the images contain text, so we need a few more utilities.
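For a rough cross-check without escpos-tools, you can scan the raw bytes for graphics command headers. The sketch below assumes that GraphicsDataCmd corresponds to the ESC/POS commands GS ( L and GS 8 L; the function name is hypothetical. It is a naive byte search rather than a real parser, so it can also match bytes inside image data and should only be read as an approximate count.

```python
# Byte prefixes of the two ESC/POS "graphics data" command forms (assumed to be
# what esc2text reports as GraphicsDataCmd).
GS_PAREN_L = b"\x1d(L"  # GS ( L
GS_8_L = b"\x1d8L"      # GS 8 L


def count_graphics_headers(data: bytes) -> int:
    """Count byte sequences that look like graphics command headers.

    Naive scan: the same bytes can occur inside image payloads, so the result
    is an estimate, not an exact command count.
    """
    return data.count(GS_PAREN_L) + data.count(GS_8_L)
```

To try it on the file from this post, read it in binary mode, for example count_graphics_headers(open("receipt.bin", "rb").read()). A non-zero count alongside empty esc2text output is consistent with a receipt that was sent as images.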
Recovering images from the receipt
To extract the images, use escimages. It runs like this:
$ php escimages.php --file receipt.bin
[ Image 1: 576x56 ]
[ Image 2: 576x56 ]
[ Image 3: 576x56 ]
[ Image 4: 576x56 ]
[ Image 5: 576x56 ]
[ Image 6: 576x56 ]
[ Image 7: 576x56 ]
[ Image 8: 576x52 ]
This gave us 8 narrow images:
Using ImageMagick’s convert command, these can be combined into one image like this:
convert -append receipt-*.png -density 70 -units PixelsPerInch receipt.png
The result is now the same as what our printer would output:
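As a sanity check on the combined image, the sizes reported by escimages can be summed: stacking images vertically gives the widest width and the total of the heights. This is a small sketch that parses the output shown earlier; the function name stacked_size is hypothetical.

```python
import re

# Matches lines such as "[ Image 1: 576x56 ]" from the escimages output.
IMAGE_LINE = re.compile(r"\[ Image (\d+): (\d+)x(\d+) \]")


def stacked_size(report: str) -> tuple[int, int]:
    """Return (width, height) of the listed images stacked top to bottom."""
    sizes = [(int(w), int(h)) for _, w, h in IMAGE_LINE.findall(report)]
    if not sizes:
        raise ValueError("no image lines found")
    width = max(w for w, _ in sizes)    # the widest image sets the canvas width
    height = sum(h for _, h in sizes)   # heights add up when stacking vertically
    return width, height


report = """\
[ Image 1: 576x56 ]
[ Image 2: 576x56 ]
[ Image 3: 576x56 ]
[ Image 4: 576x56 ]
[ Image 5: 576x56 ]
[ Image 6: 576x56 ]
[ Image 7: 576x56 ]
[ Image 8: 576x52 ]
"""
print(stacked_size(report))  # (576, 444)
```

If receipt.png does not come out at 576x444 pixels, one of the image files was probably missed or included twice by the receipt-*.png glob.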
Recovering text from the receipt
Lastly, tesseract is an open source OCR engine which can recover text from images. This image is a lossless copy of what we sent to the printer, which is an “easy” input for OCR.
$ tesseract receipt.png -
Estimating resolution as 279
Test Receipt for USB Printer 1
Mar 17, 2018
10:12 PM
Ticket: 01
Item $0,00
Total $0.00
This quality of output is fairly accurate for an untrained reader.
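If the recovered text is going to be archived or searched, light post-processing can tidy up common misreads. The sketch below assumes that "$0,00" in the output above is a misread of "$0.00", which should be checked against the print-out. The function name is hypothetical, and the rule only makes sense for receipts that use a dot as the decimal separator.

```python
import re

# A "$", some digits, a comma, then exactly two digits that are not followed by
# another digit. Thousands separators such as "$1,000" are left alone.
COMMA_DECIMAL = re.compile(r"(\$\d+),(\d{2})(?!\d)")


def fix_comma_decimals(text: str) -> str:
    """Replace a comma that was probably misread from a decimal point."""
    return COMMA_DECIMAL.sub(r"\1.\2", text)


print(fix_comma_decimals("Item $0,00"))  # Item $0.00
```

A rule this narrow is easy to review by hand, which matters if the text is kept for audit use.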
Conclusion
The escpos-tools family of utilities gives some visibility into the contents of ESC/POS binary files.
If you have a use case which requires working with this type of file, then I would encourage you to consider contributing code or example files to the project, so that the utilities can be improved over time.