For one of our projects at Web Ascender we’ve had the interesting challenge of communicating with low-level hardware from our high-level Ruby on Rails application. The application needed to read binary data from UDP packets, convert the data to Ruby numbers, dates and strings, and then send a response. While one might normally reach for a lower-level language like C or C++ for this, we were pleasantly surprised by the wide array of built-in tools Ruby provides for working with bits and binary data. Here I cover working with different bases, bitwise and shift operators, and packing and unpacking binary data. Note that the examples below use Ruby 2.1.1.
Working with different bases
For constants or just playing around in IRB, Ruby supports binary and hexadecimal literals through the 0b and 0x prefixes, respectively. In IRB, this also makes for a quick way to get the integer’s decimal form:
>> 0b1010
=> 10
>> 0x4a2f81
=> 4861825
For programmatic conversion, there’s the to_i method of the String class, which attempts to interpret the string as an integer of the given base.
>> '0b1010'.to_i(2)
=> 10
>> '10111010'.to_i(2)
=> 186
>> '1af842'.to_i(16)
=> 1767490
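The word "attempts" matters here: String#to_i never raises on bad input. The sketch below contrasts it with the stricter Kernel#Integer. The helper name parse_hex is hypothetical and only used for illustration.

```ruby
# String#to_i is lenient: it parses as many leading characters as are valid
# for the base and ignores the rest, returning 0 when nothing is valid.
p '1012'.to_i(2)   # "2" is not a binary digit, so only "101" is parsed => 5
p 'zz'.to_i(16)    # no valid hex digits at all => 0

# Kernel#Integer is strict: it raises ArgumentError on invalid input.
# parse_hex is a hypothetical helper that turns that error into nil.
def parse_hex(str)
  Integer(str, 16)
rescue ArgumentError
  nil
end

p parse_hex('1af842')   # => 1767490
p parse_hex('zz')       # => nil
```

When the input comes from an untrusted source such as a network packet, the strict form is probably the safer default, since a silent 0 is hard to tell apart from a real zero.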
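It’s often useful to convert integers back to strings for display. By default, Ruby will show you the decimal form, but hexadecimal or binary is often easier to read when working with raw bytes. This is where Fixnum#to_s comes in (Integer#to_s as of Ruby 2.4, which merged Fixnum into Integer).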
>> 24.to_s(16)
=> "18"
>> 0b101010.to_s(16)
=> "2a"
>> 42.to_s(2)
=> "101010"
>> rand(10**8..10**9).to_s(36)
=> "fd95me"
As the last example shows, the method supports bases up to 36, making it useful for generating random codes of a given length.
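To make the "given length" part explicit, here is a minimal sketch that picks the random range so the result always has the requested number of base-36 digits. The method name random_code is an assumption for illustration, not something from our application.

```ruby
# Hypothetical helper: returns a random base-36 string of exactly `length`
# characters by drawing from the range of integers with that many digits.
def random_code(length)
  low  = 36**(length - 1)   # smallest integer with `length` base-36 digits
  high = 36**length - 1     # largest integer with `length` base-36 digits
  rand(low..high).to_s(36)
end

puts random_code(6)   # six characters from 0-9 and a-z
```

Because the lower bound excludes leading zeros, the first character is never "0". Kernel#rand is also not cryptographically secure, so for anything security-sensitive the SecureRandom module would be the better source.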
Bitwise and shift operators
Ruby’s bitwise operators (&, |, ^ and ~) and shift operators (<< and >>) will be familiar to anyone who has worked with languages from the C family.
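As a quick refresher, the following sketch applies each operator to two small values. The variable names are arbitrary and chosen only for this example.

```ruby
a = 0b1100   # 12
b = 0b1010   # 10

and_result = a & b    # 0b1000 = 8:  bits set in both operands
or_result  = a | b    # 0b1110 = 14: bits set in either operand
xor_result = a ^ b    # 0b0110 = 6:  bits set in exactly one operand
not_result = ~a       # -13: Ruby integers are not fixed-width, so ~a == -(a + 1)
left       = a << 2   # 48: shift left by two bits (multiply by 4)
right      = a >> 2   # 3:  shift right by two bits (integer divide by 4)

p [and_result, or_result, xor_result, not_result, left, right]
# => [8, 14, 6, -13, 48, 3]
```

The one surprise for C programmers is ~, which yields a negative number rather than a fixed-width complement. Masking the result, for example ~a & 0xff, gives the familiar 8-bit value.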
They also open up a world of C tricks, like this one, which tests an integer’s parity by checking whether the least significant bit (LSB) is 1:
>> 1 & 1 # odd
=> 1
>> 2 & 1 # even
=> 0
Of course, some tricks are best left to C, and here the odd? method would be much more readable. For us, the bitwise operators turned out to be particularly helpful when unpacking dates. As an example, a date can be packed into a 16-bit integer like so:
15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
 M  M  M  M  D  D  D  D  D  Y  Y  Y  Y  Y  Y  Y
This allots enough space for all 12 months (4 bits = 16 values), all 31 possible days (5 bits = 32 values) and the years 1900-2027, stored as an offset from 1900 (7 bits = 128 values). In real life, you’ll naturally want to support a larger year range. See Y2K and Y2038.
April 6th, 2014 could then be represented like this:
0100 00110 1110010  (= 17266)
   4     6     114
To extract the day of the month from this, we can use a bitmask together with the & and >> operators, like so:
>> date = 0b0100001101110010
=> 17266
>> day_mask = 0b0000111110000000
=> 3968
>> (date & day_mask).to_s(2)
=> "1100000000"
>> day = (date & day_mask) >> 7
=> 6
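The same approach extends to the other two fields. Below is a minimal sketch that decodes and encodes the whole date using the bit layout described earlier. The constant and method names are assumptions made for this example, and it shifts first and masks second, which is equivalent to the mask-then-shift form shown above but lets each mask be written without trailing zeros.

```ruby
# Layout: MMMM DDDDD YYYYYYY, with the year stored as an offset from 1900.
MONTH_SHIFT = 12
DAY_SHIFT   = 7
MONTH_MASK  = 0b1111      # 4 bits
DAY_MASK    = 0b11111     # 5 bits
YEAR_MASK   = 0b1111111   # 7 bits
BASE_YEAR   = 1900

# Hypothetical helper: splits a 16-bit packed date into [year, month, day].
def unpack_date(packed)
  month = (packed >> MONTH_SHIFT) & MONTH_MASK
  day   = (packed >> DAY_SHIFT) & DAY_MASK
  year  = (packed & YEAR_MASK) + BASE_YEAR
  [year, month, day]
end

# Hypothetical helper: the inverse, combining the fields with shifts and ORs.
def pack_date(year, month, day)
  (month << MONTH_SHIFT) | (day << DAY_SHIFT) | (year - BASE_YEAR)
end

p unpack_date(17266)      # => [2014, 4, 6]
p pack_date(2014, 4, 6)   # => 17266
```

Note that pack_date does no range checking, so a year outside 1900-2027 would silently spill into the day bits. A real implementation would want to validate its inputs.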
Packing and unpacking
Last up, there are String#unpack and Array#pack. Although Ruby provides no way to directly access memory or control the way numbers and strings are represented, these two methods are provided to generate and parse binary sequences with that sort of control. In our case, that meant packing and unpacking data sent over a UDP socket.
String#unpack decodes a string, which may contain binary data, according to a template and returns an array of values. See Ruby’s documentation for the list of template directives: several integer and float sizes, in big- or little-endian byte order, are available. In the example below, a string containing three 8-bit integers is decoded:
>> "\xff\x00\x2a".unpack('CCC')
=> [255, 0, 42]
>> "\xff\x00\x2a".unpack('C2')
=> [255, 0]
>> "\xff\x00\x2a".unpack('C*')
=> [255, 0, 42]
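Byte order starts to matter as soon as a value spans more than one byte. The sketch below decodes the two bytes of the packed date from earlier with both 16-bit directives. That the bytes arrive most significant byte first is an assumption for illustration; a real device may well use the other order.

```ruby
# Two bytes holding the packed date 17266 (0x4372), most significant byte first.
# The byte order is an assumption here; check the device's protocol.
bytes = "\x43\x72"

big_endian    = bytes.unpack('n').first   # 'n': 16-bit unsigned, big-endian (network order)
little_endian = bytes.unpack('v').first   # 'v': 16-bit unsigned, little-endian

p big_endian      # => 17266
p little_endian   # => 29251

# Directives can be mixed: one 8-bit integer followed by one 16-bit big-endian integer.
p "\x01\x43\x72".unpack('Cn')   # => [1, 17266]
```

Choosing the wrong directive does not raise an error. It simply yields a different number, so it pays to test against a known value.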
Array#pack, when called with the same template, is the reverse of String#unpack, useful for generating a binary sequence to send over the wire:
>> [255, 0, 42].pack('C*')
=> "\xff\x00\x2a"
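Putting the two methods together, here is a rough sketch of a round trip for a small message. The message layout and the method names build_message and parse_message are invented for this example and are not the protocol our hardware actually uses.

```ruby
# Hypothetical message layout (an assumption for illustration):
#   1 byte   message type
#   2 bytes  packed date, big-endian
#   1 byte   payload length
#   n bytes  payload
def build_message(type, packed_date, payload)
  # 'C' = 8-bit unsigned, 'n' = 16-bit unsigned big-endian, 'a*' = raw string
  [type, packed_date, payload.bytesize, payload].pack('CnCa*')
end

def parse_message(data)
  type, packed_date, length = data.unpack('CnC')
  payload = data.byteslice(4, length)   # the payload starts after the 4-byte header
  [type, packed_date, payload]
end

message = build_message(1, 17266, 'OK')
p message.bytes            # => [1, 67, 114, 2, 79, 75]
p parse_message(message)   # => [1, 17266, "OK"]
```

A string built this way is the kind of value that can be handed to a socket for sending, and parse_message would be applied to whatever the socket receives in return.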