Moving Forward

Homepage of Andrew Robinson

Archive for the ‘Uncategorized’ Category

Even PayPal doesn’t Handle Edge Cases Properly

without comments

While working on web-based systems, one of the most frustrating aspects of development is catching all the edge cases. It’s not enough to simply anticipate the common case, or design around the expected use case. In the wild world of the web, user input is messy and dirty and requires validation, and the code must have defined return paths for all possible error conditions. On the backend, if the database server goes down, your application should fail gracefully with an appropriate error message to the end user while trying to reestablish a connection to the server, and monitoring machinery should notify the administrator of the outage and perhaps take preliminary actions to try to restore service. On the frontend, no matter what user input is received, it should be validated against what is expected for that field, escaped properly to avoid cross-site scripting or code injection vulnerabilities, and formatted in a uniform manner for storage and display.

To this end, a great deal of the code written will be dealing with edge cases, and in my opinion one of the best measures of the maturity of a web-based development effort is how well it handles odd or malformed input. One of the best examples of a situation where this is especially evident is how well a service handles e-mail addresses. RFC 2822 gives the formal definition for what is allowed in an e-mail address, and the brave have even implemented this described standard in a machine-readable fashion, creating a beautiful, legible regular expression:

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

Fortunately, most e-mail addresses only exist in a subset of this standard, containing mostly alphanumeric identifiers with the occasional period thrown in, and many services limit e-mail addresses to this subset. Many e-mail services also offer a feature that allows sub-addressing of an e-mail address with a plus or hyphen operator. The most widely used case of this is Gmail, which allows a plus sign and any identifier to be appended to the local part of your e-mail address. For example, if my address was joe.smith@gmail.com I could also use joe.smith+work@gmail.com to direct e-mail to my inbox, and optionally do some filtering on the To address field using that tag.
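To make the sub-addressing idea concrete, here is a minimal Python 3 sketch that splits a tagged address back into its base address and tag. The function name split_subaddress is purely illustrative, and the logic assumes the simple "local+tag@domain" shape described above rather than the full RFC 2822 grammar.

```python
def split_subaddress(address, separator="+"):
    """Split 'local+tag@domain' into (base_address, tag).

    The tag is None when the local part contains no separator.
    """
    # Split on the last '@' so the domain is isolated first.
    local, _, domain = address.rpartition("@")
    # Everything after the first separator in the local part is the tag.
    base, sep, tag = local.partition(separator)
    return base + "@" + domain, (tag if sep else None)


print(split_subaddress("joe.smith+work@gmail.com"))
print(split_subaddress("joe.smith@gmail.com"))
```

A service that wants to detect aliased sign-ups could compare the base address, while still delivering mail to the full tagged address.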

The plus sign is an interesting case that’s often handled poorly. I’ve seen behaviors ranging from rejecting the perfectly valid e-mail address (which is almost justifiable, since users aliasing their address with a sub-address are more than likely trying to create multiple accounts), to allowing the account to be created but never allowing one to log into it. My most recent experience with this odd behavior has come from PayPal.

I don’t typically consider PayPal a company that is on top of things, technologically speaking. Their service is slow, with requests often timing out, and their APIs are terrible (see Stripe for an example of payment APIs done right), but above all else I generally expect their services to be correct in their behavior.

For unknown reasons my previous PayPal account was placed on hold pending confirmation of my address. This confirmation process can only be done via home phone or by shipping some sort of post card to my residence. I don’t really care enough to complete this process, and almost every bank account and credit card attached to the previous account has expired or been closed, so I’d rather just register a new account. I decided to take a shot in the dark and try the e-mail sub-addressing trick, swiftly added a ‘+1’ to my e-mail address, and registered again. Surprisingly, PayPal allowed this sketchy behavior and merrily sent me on my way. However, when they sent me the activation e-mail things didn’t go quite so well:

It turns out that the activation link contains the e-mail address I registered with as part of the query string, and the plus sign in it was not percent-encoded. It’s standard practice when parsing a query string to substitute plus signs with spaces, which PayPal does appropriately, and that in turn breaks the URL.
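The failure is easy to reproduce with Python 3’s standard urllib.parse module. This is only a sketch of the general mechanism, not PayPal’s actual code: the parameter name email and the address are made up for illustration.

```python
from urllib.parse import parse_qs, urlencode

email = "joe.smith+1@gmail.com"

# Naive link: the address is pasted into the query string unescaped.
naive_query = "email=" + email
# A standards-following parser decodes '+' as a space.
print(parse_qs(naive_query)["email"][0])    # joe.smith 1@gmail.com

# Escaped link: urlencode percent-encodes '+' as %2B (and '@' as %40).
escaped_query = urlencode({"email": email})
print(escaped_query)                        # email=joe.smith%2B1%40gmail.com
# Now the address survives the round trip intact.
print(parse_qs(escaped_query)["email"][0])  # joe.smith+1@gmail.com
```

The parsing side is behaving correctly in both cases; the bug is on the side that builds the link without escaping the value first.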

Obviously this isn’t a huge security flaw, or really much more than an inconvenience, but it’s a little odd to see a company whose business is performing secure transactions not even escape a query string correctly.

Written by Andrew Robinson

April 1st, 2012 at 7:23 pm

Posted in Uncategorized

A New Term: Goldberg (adj)

without comments

I’ve been reading a lot of academic systems papers lately, and a lot of them are a little frustrating. I think that in this world there are only so many ways one can implement a task scheduler before the law of diminishing returns takes over and the effort spent improving the system starts to look a little silly. Some of the papers seem as if they have been published simply due to pressure to publish.

Realizing this, and reading about the history and origin of some programming languages in an article I found, Research in Programming Languages, I’ve become a little frustrated with these papers. As a good example, half of the languages we use today were designed in a few days by someone outside of the academic world, yet power the majority of the modern web. I’ve read a couple of papers about scheduling algorithms that promise great advances over the threading mechanisms we currently use, but despite their being published years ago we still use the same simple threading mechanisms. I just feel like a lot of these system designs are unnecessary and needlessly complex, for the sake of complexity itself, while not realizing significant performance increases. In this spirit I hereby christen a new adjective to describe some of these systems, in honor of their ridiculous complexity, which is surpassed only by their irrelevancy.

goldberg (adj.)

  1. describing a system that has become so
    large and poorly designed that it can be
    likened to a Rube Goldberg machine, where
    a large number of arbitrary tasks are performed
    in sequential order to achieve an otherwise
    simple goal in a convoluted, unintuitive, and unnecessarily
    complex manner.

Synonyms – academic, research, thesis-work

Written by Andrew Robinson

March 6th, 2012 at 8:13 pm

Posted in Uncategorized

Speech Recognition using Sphinx : Don’t Try This At Home

without comments

As part of my ongoing research I needed a quick and easy way to recognize speech. After seeing how effortless products like Siri are at recognition, I naively thought that the technology had been developing nicely, and that I was a few short clicks away from glorious, well-supported recognition with moderate accuracy. The reality of the situation was not quite this. Carnegie Mellon currently puts out the best open-source speech recognition toolkit, CMUSphinx. It’s great, but poorly documented for the beginner. When I visited the page I had one task I wanted to accomplish: recognize arbitrary English quickly, preferably from within a language like Python. While this is certainly possible with Sphinx, it’s not intuitive.

So many options.. which one to choose?

By far the hardest aspect of using Sphinx is installing it. It seems the authors, in an effort to cut down on support requests, have actively tried to make it unintuitive.

We must install Sphinx, but which one? On the downloads page the maintainer helpfully points out that it’s tough to know which package to install. We have a good half-dozen available to us: SphinxBase; Sphinx1-3, written in C; Sphinx4, which has been rewritten in Java; and PocketSphinx, which seems as if it’s designed for a mobile platform.

Which one of these to install is not obvious. At first Sphinx4 seems like the obvious choice, but because it’s written in Java and relatively new, it has no language bindings for Python and seems very beta-ish.

Looking back, Sphinx3 was written in C and seems decent, so I tried that next. No dice. It’s a mess, and there’s a blurb hidden in a wiki page somewhere noting that it’s for research use only.

Finally, after reading an obscure forum post somewhere it was mentioned that PocketSphinx is actually intended for desktop usage too, and has Python bindings! This makes a lot of sense. After face-palming myself for missing the connection, made obvious by the title, I decided PocketSphinx was the application I needed to install!

Luckily for us, Ubuntu has packages available. Pulling out my apt-get shotgun, a quick command installed everything I needed (and more).

sudo apt-get install sphinx*

Actually Doing Recognition

After installing things, life started looking up. Throwing together a quick Python script, using the documentation found here, buried in the CMUSphinx labyrinth, actually wasn’t too difficult.

You’ll need a test audio file. Raw 16-bit audio, formatted as a binary stream of signed integers, works really well. A freely available utility called sox comes with Ubuntu and will help you convert almost anything into raw audio. I’d also suggest looking into Python Audio Tools for on-the-fly conversions; however, don’t try to use PCMConverter, it’s a pile of garbage.
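If the recording is already an uncompressed WAV file, Python 3’s standard wave module is enough to strip the header and leave headerless PCM. The sketch below is one possible way to do it, not the method used in this post: the function name wav_to_raw is made up, and it assumes the input is already 16-bit mono at whatever sample rate your acoustic model expects (it does no resampling).

```python
import wave


def wav_to_raw(wav_path, raw_path):
    """Copy the PCM samples of a 16-bit mono WAV file into a headerless file.

    Returns the number of samples written.
    """
    with wave.open(wav_path, "rb") as wav:
        # Refuse anything that is not 16-bit (2 bytes per sample) mono.
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono audio")
        frames = wav.readframes(wav.getnframes())
    with open(raw_path, "wb") as raw:
        raw.write(frames)
    return len(frames) // 2
```

Calling wav_to_raw('tmp1.wav', 'tmp1.raw') would produce a file in the shape the decoder example expects; for compressed formats a converter such as sox is still the easier route.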

Just open up a raw binary audio file, and invoke the decoder:

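import pocketsphinx as ps

# Acoustic model directory, language model, and pronunciation dictionary
hmmd = '/usr/share/pocketsphinx/model/hmm/wsj1'
lmd = '/usr/share/pocketsphinx/model/lm/wsj/wlist5o.3e-7.vp.tg.lm.DMP'
dictd = '/usr/share/pocketsphinx/model/lm/wsj/wlist5o.dic'

# Raw audio is binary data, so open the file in binary mode
fRaw1 = open('tmp1.raw', 'rb')

speechRec = ps.Decoder(hmm = hmmd, lm = lmd, dict = dictd)

speechRec.decode_raw(fRaw1)
result = speechRec.get_hyp()

# The first element of the result is the recognized text
print(result[0])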

hmmd, lmd, and dictd point to the acoustic model, language model, and pronunciation dictionary that give the Decoder the sense of the language necessary to decode words. By default PocketSphinx comes with a corpus of general text that works alright. If you’re using Sphinx for domain-specific work I’d highly recommend creating your own dictionary with a limited number of words; you’ll achieve much greater accuracy that way.

And we’re done!

So hopefully by now if you followed these steps loosely you’ll have a working speech recognizer. Playing around with my own voice, I’ve found the accuracy to be alright, but not great. Training it to your voice apparently yields better results. From what I’ve read commercial recognizers are using slightly more advanced algorithms than what Sphinx currently uses, and more community time is needed to bring open-source recognition up to speed with something like Siri.

Written by Andrew Robinson

February 29th, 2012 at 2:47 am

Posted in Uncategorized

Reading AAC Encoded Audio in Python

without comments

Using the freely available Python Audio Tools, decoding an audio stream is pretty simple. Their site doesn’t have a solid example of using the APIs available, so I’ve written a short demonstration that decodes an AAC-encoded audio file. The file I’m decoding was generated using the voice recording application on an iPod touch, although this should work for almost any audio file supported by Python Audio Tools.

import struct

import audiotools as at

print('Opening input data audio stream... decoding')
# Create an AudioFile object out of an input file
aF = at.open('inputData.m4a')
pcmAf = aF.to_pcm()

# We'll store the data in a list, although this algorithm is suitable for
# passing the data to a second stage for online processing.
rawData = []

while True:
    # This file is set up with 2 audio channels, sampled at 44.1kHz.
    # We'll read 256 bytes of raw data at a time,
    # or 256 / 2 channels / 2 bytes per sample = 64 frames.
    # Since our data is only mono, we discard one of the channels.
    frame = pcmAf.read(256)
    for i in range(0, frame.frames):
        # to_bytes(True, True) yields big-endian, signed bytes
        byteArray = frame.channel(0).frame(i).to_bytes(True, True)
        # '>h' matches that layout; unpack returns a 1-tuple
        pcmVal = struct.unpack('>h', byteArray)[0]
        rawData.append(pcmVal)
    if frame.frames < 64:
        # Smaller frame numbers indicate the end of file has
        # been reached.
        print('End of file found. Breaking.')
        break

Make sure you’ve installed Python Audio Tools first; it’s freely available from the project page. So far I’ve only had success using these libraries on Linux. Mac OS support seems doable but would take some effort to do properly.

Written by Andrew Robinson

February 27th, 2012 at 1:29 am

Posted in Uncategorized

Generating SVN Statistics

without comments

Recently I became very interested in generating some statistics from an SVN repo. In our research group we have a repository for all the in-progress papers, which are written in LaTeX, and doing some rudimentary reporting on the number of committed lines by author sounded like a fun way to gamify the process of writing. You can see below one of the highlights of this reporting. As would be expected of a graduate student research lab, a large number of commits happen late at night, with a large void during business hours.

I found a great tool to generate some statistics from SVN repos, appropriately called StatSVN. It’s decent out of the box, but lacks some customizability and automation.

The way it works by default is you invoke it as shown below, and it uses a generated output file from SVN, along with the path to a checked out local repo, to generate a pile of HTML reports and figures tallying various commit statistics. It automatically invokes subversion, and requests the diffs between commits, storing data in a local cache file.

java -jar statsvn.jar papers/logfile.log papers -include "**/*.tex" -config-file config.txt

This works pretty well, but to really create some fun statistics we need to work a little harder. I wanted to filter out some of the larger bulk-commits that don’t accurately reflect actual work, and I wanted to customize the generated report. Naturally I fired up vim and started writing some Python…

Filtering Out Certain Revisions

The first problem was that this repository is pretty new, and a lot of the first commits involved setting up templates and doing other administrative tasks. I want to collect statistics on who produced the most content, not who can push the metaphorical broom hardest in cleaning up templates and moving directories around, so I needed a method to filter out certain commits. The way StatSVN works is by first parsing an exported svn log file, containing a list of commits. What I found is that by simply removing the associated log entry for a commit StatSVN will simply ignore it.

A Sample Log Entry from the SVN Log

<logentry revision="172">
<author>androbin</author>
<date>2012-02-15T19:06:10.225746Z</date>
<paths>
<path kind="file" action="M">/papers/mobicom12-audio/tex/design.tex</path>
</paths>
<msg>Fixed broken paper by updating design.tex</msg>
</logentry>

Python Code to Perform an Update and Generate the Log

import os

print('Updating SVN repo')
os.system('cd papers; svn up')

print('Running XML export from SVN repo')
os.system('cd papers; svn log -v --xml > logfile.log')

Before removing anything, we update the repository, which I’ve checked out into a directory called papers/, and generate a fresh log file. Next, using lxml, we load the log file and an exclude list, and perform the deletion.

Removing Revisions from Statistics based on Number

import lxml.etree as le

# Read one revision number per line, skipping blank lines
with open('exclude-list.txt', 'r') as f:
    listToExclude = [line.strip() for line in f if line.strip()]

print('Exclude list: ' + str(listToExclude))

doc = le.parse('papers/logfile.log')
for pat in listToExclude:
    # Drop every <logentry> whose revision attribute matches
    for elt in doc.findall("logentry[@revision='" + pat + "']"):
        print('Removing element...')
        elt.getparent().remove(elt)

print('Writing file back to disk...')
# tostring() returns bytes, so write in binary mode
with open('papers/logfile.log', 'wb') as f:
    f.write(le.tostring(doc))

exclude-list.txt simply consists of revision numbers, separated by newlines.
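The exclude list could also be seeded automatically instead of by hand. As a rough sketch, written for Python 3 with the standard xml.etree module, the helper below flags revisions that touched an unusually large number of paths, which is one plausible signature of a bulk commit. The function name bulk_revisions and the threshold of 20 paths are assumptions for illustration, not something StatSVN or this script provides.

```python
import xml.etree.ElementTree as ET


def bulk_revisions(log_xml, max_paths=20):
    """Return revision numbers whose commit touched more than max_paths paths.

    log_xml is the text produced by `svn log -v --xml`.
    """
    root = ET.fromstring(log_xml)
    return [entry.get("revision")
            for entry in root.iter("logentry")
            # Each changed file or directory is a <path> under <paths>.
            if len(entry.findall("paths/path")) > max_paths]
```

The returned revision numbers can be written one per line to exclude-list.txt and then reviewed by hand before running the filter, since a large commit is not always administrative work.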

After we’ve modified the logfile we invoke the statistics generation program manually.

Invoking StatSVN

print('Invoking graph generation software...')
os.system('java -jar statsvn.jar papers/logfile.log papers -include "**/*.tex" -config-file config.txt')

Of interest here is the fact that we’ve passed it a configuration file. I’ve identified three key graphs I’d like to include in my final report, and resizing them to fit the spaces I’ve allocated for them is a little challenging, so I’ve used StatSVN’s ability to specify a config file to resize them and pump up the plot lineStroke to make it a little more readable.

StatSVN Config File

chart.loc_per_author.lineStroke=4
chart.loc_per_author.width=600
chart.loc_per_author.height=300

chart.activity_time.width=600
chart.activity_day.width=600
chart.activity_time.height=370
chart.activity_day.height=408

Making an Aggregate Report

So now I’ve filtered out all the commits I don’t care about, but I’m not that happy with the default reports. My goal is to load these stats on a display-case monitor, and none of the default reports are attractive enough, or contain the right information, to make the cut. The approach I decided to take here was to use BeautifulSoup to extract the information I wanted from each of the reports, and then composite it into one report using a template file. This works really well in practice: since the report software’s format won’t change, BeautifulSoup has no problems selecting the elements of interest.

HTML Template for the Final Report

<html>
<head>
<title>Group Dangerzone Paper Log</title>
<link rel="stylesheet" href="ocss.css" type="text/css">
</head>

<body>
<h1>Dangerzone Paper Commit Log</h1>
<table width="100%">
<tr>
    <td valign="top" width="70%">
    <table width="100%">
        <tr>
            <td valign="top">[A]</td>
            <td align="right"><img src="loc_per_author.png" /></td>
        </tr>
    </table>
    <br><br><br>
    <table width="100%">
        <tr>
            <td valign="top"><img src="activity_time.png" /></td>
            <td><img src="activity_day.png" /></td>
        </tr>
    </table>
    <h2>Commit Message Tag Cloud</h2>
    [T]
    </td>
    <td>
        [C]
    </td>
</tr>
</table>
</body>
</html>

In the template shown above we use the placeholders [C], [T], and [A] for the commit log, tag cloud, and list of author contributions by percentage, respectively. The Python script below extracts those elements from the generated reports and pushes them into the template before writing it to output.html.

Making a Pretty Report


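# BeautifulSoup 4; with the older version 3 use
# "from BeautifulSoup import BeautifulSoup" instead
from bs4 import BeautifulSoup

print('Generating output HTML...')

def getSoup(fileName):
    with open(fileName, 'r') as f:
        return BeautifulSoup(f.read())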

template = ''
with open('template.html', 'r') as f:
    template = f.read()

developers = getSoup('developers.html')
index = getSoup('index.html')
clog = getSoup('commitlog.html')

authorTable = developers.html.body.table
template = template.replace('[A]', str(authorTable))

tagCloud = index.html.body.findAll('div')[2].p
template = template.replace('[T]', str(tagCloud))

commitList = clog.html.body.findAll('dl')[1]
for i in range(24,len(commitList.contents)):
    commitList.contents[len(commitList.contents) - 1].extract()
template = template.replace('[C]', str(commitList))

with open('output.html', 'w') as f:
    f.write(template)

The End Result

This whole script is saved in a file, set to run with a cron job every half-hour, and a line is added to the template file to cause the browser to refresh the page every so often. The finished product is shown below.

Written by Andrew Robinson

February 16th, 2012 at 6:10 am

Posted in Uncategorized

Effectively Running a Ubuntu VM on a Macbook

without comments

I use a Macbook Pro for my day-to-day tasks, but quite frequently find myself needing a Linux environment to compile and test code. Like many others I use VMware Fusion to run a virtual machine, booting the latest version of Ubuntu. When I started this practice I was worried about the battery life of my Macbook decreasing significantly, so I started looking for some hacks to make things a little more efficient, and just to make my life a little easier.

Disable X.org and Use SSH Exclusively

This step netted me the biggest single improvement in battery life. Since I almost exclusively SSH into my VM, it doesn’t make sense to keep the graphical login prompt running. To kill the display manager, LightDM in new versions of Ubuntu, simply adding ‘manual’ to the end of its upstart config file will work. Upstart is a relatively recent innovation in the world of Linux, replacing the traditional rc.d startup procedure and properly parallelizing the startup of independent services.

sudo sh -c 'echo manual >> /etc/init/lightdm.conf'

Assign a Static NAT IP Address

For network settings I’ve found that the best possible configuration is using the NAT routing, and statically assigning an IP address. Many corporate networks (and coffee shops!) won’t allow your computer to obtain two DHCP leases, so running NAT is the easiest solution to keep things working no matter where you are.

To tell the VMware DHCP server to assign a static address we modify /Library/Application Support/VMware Fusion/vmnet8/dhcpd.conf.

Add a line corresponding to the name of your VM to statically assign an address. In this case my VM is simply named Ubuntu.

# Assign a static IP to ubuntu VM
host Ubuntu {
	hardware ethernet 00:0c:29:49:d1:2c;
	fixed-address 192.168.39.5;
}

You’ll also need to record the MAC address given to your VM, easily available by running ifconfig inside the VM. After setting up a static IP address I add an entry to the /etc/hosts file on my Mac to give my VM an easy hostname for SSHing and testing network services.

192.168.39.5    vm-ubuntu

Set Up Key-Based SSH Authentication

This one is mostly for convenience, but I’d recommend setting up public-key login for SSH. It’s tedious to have to repeatedly type your passphrase when logging into your VM. This is really super easy, so I’d recommend doing it.

Generate a DSA keypair if you haven’t already (make SURE you use a passphrase)

ssh-keygen -t dsa

Copy the DSA public key to your Ubuntu Machine

scp ~/.ssh/id_dsa.pub vm-ubuntu:.ssh/authorized_keys2

This guide has more details. Setting up SSH keys has been covered a million times, so I won’t elaborate much further.

Kill the VMware Window Once Booted Up

Previous versions of VMware Fusion supported this as an additional menu item that could be enabled, but in later versions the feature has been removed. Once your Ubuntu VM is up and running there’s not much of a reason to keep the window in your dock, so I’d recommend killing the viewer process.

The easiest way to do this is to hold Control+Option, right-click the VMware icon in your dock, and go to Force Quit. The VM will happily continue running in the background, even without the window.

At this point, to further conserve battery life, I’d recommend installing gfxCardStatus and forcing your Macbook into integrated mode, unless you need discrete mode for 3D graphics and rendering. The vmx-server process will force it into discrete mode, as will a number of other processes (I’m looking at you, Google Chrome), and forcing integrated mode will easily net another 20-30 minutes of runtime.

Following these steps, I can achieve decent battery life with my Macbook Pro. Typically when I’m just reading books in Kindle Reader, or doing light web browsing I’ll see 4-5 hour battery life, with it quickly dropping down to 2-3 hours once I start doing heavy work again such as compiling and working on test cases.

Written by Andrew Robinson

January 4th, 2012 at 3:08 am

Posted in Uncategorized

New Year’s Resolutions

without comments

It’s that time of year again. At the beginning of every year I like to set a few goals for myself in the programming world; some would call them New Year’s resolutions. Last year I set out to read a few of the classic books in computer science such as The Mythical Man-Month by Fred Brooks, Code Complete by Steve McConnell, and Showstopper, the book about David Cutler and the Windows NT kernel development saga. I also wanted to further my understanding of algorithms by finishing up the latter chapters of my favorite algorithms book, known to many as the CLR. Finally I wanted to get a firm handle on a new language or two.

Luckily I managed to accomplish all these goals and more during 2011. I picked up some really solid experience in both Python and Java during the year, implementing some large projects using both languages, read a lot of core computer science books, and wrote a lot of code. I ventured into solving non-trivial problems with the fundamental concepts behind current AI research, and learned a fair deal about networking by exploring the 6LoWPAN stack.

With all that under my belt, I’d like to outline my list of goals for the new year.

  • Learn and Practice Common Design Patterns – I want to ensure my code has a structure to it that makes it easy to reuse, extend, and understand. Design patterns represent solutions that have withstood the test of time, and are a familiar language to all developers. While I’m familiar with the usage of common patterns such as factories and listeners, I’d like to expand outwards and practice the implementation of more diverse patterns.
  • Become Competent in Functional Programming – Many believe that in order to effectively make use of parallel computing architectures we will need to rely more and more on functional programming. This style of programming also has benefits when it comes to proving correctness, and is claimed to reduce the number of lines of code required to achieve a solution. Towards the end of the year I started working with Haskell. I’d like to implement at least one large project in a functional language this year, and examine how functional languages can be applied to embedded systems, where most operations are typically IO related and many resource constraints exist.
  • Further my Understanding of Compilers – Fundamental to programming are the tools we use to translate code into machine instructions. I’m planning on developing my understanding of compilers further during the year.
  • Publish More Often – While I produced a fair amount of content in 2011, I’d like to ensure that I give even more back. There’s so much in the world of computer science to write about and discover. In 2012 I hope to finish off more write-ups and go the last mile often required to turn exploratory research into publications, both blog posts and research papers.

Written by Andrew Robinson

January 2nd, 2012 at 6:12 pm

Posted in Uncategorized

Hacking the Keurig B40 Coffee Maker – Part 3 – Conclusion

with 2 comments

It works! The system is reasonably secure, and certainly fun to use.

Be sure to check out part 1 and 2 to see details of how the system was built:
Part One – The Hardware
Part Two – The Software

Future improvements could include a better GUI and more integration with other services in our lab. The system is written in such a way that it’s reasonably modular, so doing something like tacking a Twitter account on it would be trivial at this point.

Written by Andrew Robinson

December 31st, 2011 at 6:34 am

Posted in Uncategorized

Hacking the Keurig B40 Coffee Maker – Part 2 – Software

with 3 comments

Continuing our long journey, the next step in hacking the Keurig B40 is writing the software that will drive this secure coffee maker of the future. I’ll start with the Arduino code, and work our way up the stack.

Writing the Arduino code required overcoming a few annoyances. I needed to simulate a user-button press, which is an event that typically lasts for a few hundred milliseconds, and detect if the system was ready to brew, which isn’t indicated by a single LED, but rather the blinking of an LED. These are pretty simple challenges to overcome.

Sending Compact Status Data

The Arduino code uses a bit-struct to pack the values, which will be unpacked by Python on the host side of things. Here’s what that bit structure looks like:

union send_t {
  struct {
   byte  isOn : 1;
   byte  isEightOzButtonPressed : 1;
   byte  isTenOzButtonPressed : 1;
   byte  isWarmedUp : 1;
   byte  isNeedWater : 1;
   byte  isReadyToBrew : 1;
  };
  byte value;
};

This is one of my favorite bit tricks in C. Unions can be used all over the place to avoid unnecessary casting. In this case, I’ve used a bit-packed anonymous struct, which is really useful when writing types to set bit-masks, and unioned it with a byte to allow us to get and set the entire raw value at once. While we could have used bit-shifting to get the same result, this has the advantage of being 100% readable.
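On the host side, the same byte has to be split back into its flags. The Python 3 sketch below shows one way that could look. It assumes the compiler places the first declared bit-field in the least-significant bit, which is how avr-gcc lays them out but is implementation-defined in C, so it is worth verifying on your toolchain. The names STATUS_FLAGS and unpack_status are illustrative only.

```python
# Field names in declaration order; bit 0 is assumed to be the first field.
STATUS_FLAGS = (
    "isOn",
    "isEightOzButtonPressed",
    "isTenOzButtonPressed",
    "isWarmedUp",
    "isNeedWater",
    "isReadyToBrew",
)


def unpack_status(value):
    """Map each flag name to the bit at its position, lowest bit first."""
    return {name: bool((value >> bit) & 1)
            for bit, name in enumerate(STATUS_FLAGS)}


# 0b001001 has bits 0 and 3 set: powered on and warmed up.
status = unpack_status(0b001001)
print(status["isOn"], status["isWarmedUp"], status["isNeedWater"])  # True True False
```

Keeping the names in a single tuple means the host and the firmware only have to agree on field order, not on individual mask constants.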

Reading Coffee Maker Status

Next we create a function to handle reading the inputs, and outputting this data over the UART channel:

send_t toSend;

int outputCounter = 0;

// Counts toggles, > 3 = READY TO BREW.
int toggleCounter = 0;

// We could use outputCounter to keep track of this value,
// but breaking it out into a separate variable makes it
// a little clearer what exactly is going on.
int toggleTimeCounter = 0;

// The last value, used to determine toggle.
byte lastValue = 0;

void handleOutput() {
  // We want to catch these events no matter what, so we simply
  // OR them with every poll, that way if a button press was detected
  // at any point it's recorded.
  toSend.isEightOzButtonPressed =
    toSend.isEightOzButtonPressed | ((PIND >> 4) & 1);
  toSend.isTenOzButtonPressed =
    toSend.isTenOzButtonPressed | ((PIND >> 5) & 1);

  // We've detected a toggle.
  if(lastValue != ((PIND >> 6) & 1)) {
    // If it's been 500ms - 550ms it's probably the light blinking
    // to symbolize that brewing is ready.
    toggleCounter++;
    toggleTimeCounter = 0;
  }

  // If we've been waiting more than 600ms for a toggle let's give up.
  if(toggleTimeCounter > 600) {
     toggleCounter = 0;
  }

  lastValue = ((PIND >> 6) & 1);

  if(outputCounter % 500 == 0) {
    toSend.isOn = ((PIND >> 3) & 1) ^ 1;
    toSend.isNeedWater = ((PIND >> 7) & 1) ^ 1;
    if(toggleCounter > 3) {
      toSend.isReadyToBrew = 1;
    }

    toSend.isWarmedUp = (lastValue == 0) || toSend.isReadyToBrew || toggleCounter > 0;
    Serial.write(toSend.value);
    toSend.value = 0;
  }
  outputCounter++;
  toggleTimeCounter++;
}

I’ve gone with an event-loop style loop here, so everything happens on counter intervals. In this block we measure the time between transitions of the brew button, read the state of the buttons on the front to record any user actions, and finally read the other fairly static parameters and send them along the wire.

I read user input in a sticky manner: every 500ms a UART packet is generated, and if the user pressed one of the brew buttons at any point during that interval the press is recorded in that frame.

In this code I don’t use the digitalRead and digitalWrite functions. I don’t feel that the thin wrappers the Arduino code base provides are really that useful. Having direct access to the ports on the AVR really isn’t a grand leap.

Handling Inbound Commands

What’s left is handling commands sent from the host PC to the Arduino. This is also a pretty simple process.

unsigned int commandCounter = 0;

void handleInput() {
  // Here we handle button presses:
  // we simulate a ~200ms press asynchronously
  // to simulate user input.

  // If commandCounter == 0 we look for UART data,
  // else we block on this specific command and increment
  // up to 200 (~200ms)

  // If a lot of commands were sent at once this would
  // potentially overflow and drop some commands, but
  // I don't think that's a realistic situation.
  if(commandCounter > 0) {
    commandCounter++;
    if(commandCounter == 200) {
      // Set PORTB into high-z state, since the buttons
      // use an analog resistor network we don't want
      // stray current messing with the PIC on the coffee
      // maker
      DDRB = 0;
      // Disable internal pullup resistors
      PORTB = 0;
      // Reset commandCounter
      commandCounter = 0;
    }
  } else if(Serial.available() > 0) {
    switch(Serial.read()) {
      case 'p':
        // Press the power button (Pin 8)
        PORTB = (1<<0);
        DDRB = (1<<0);
        commandCounter = 1;
      break;
      case 't':
        // Press the 10oz brew button (Pin 9)
        PORTB = (1<<1);
        DDRB = (1<<1);
        commandCounter = 1;
      break;
      case 'e':
        // Press the 8oz brew button (Pin 10)
        PORTB = (1<<2);
        DDRB = (1<<2);
        commandCounter = 1;
      break;
    }
  }
}

We read bytes as they come down the wire, relying on the serial buffer to hold any pending button press commands while the current ones are executing. The code will toggle the button for approximately 200ms.

We’ve designated three commands, ‘p’ to power the unit on or off, ‘t’ to perform a 10oz brew, and ‘e’ to perform an 8oz brew. The commands are single ASCII characters so no delimiting or framing was required.
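On the host side, sending one of these commands could look roughly like the sketch below. The press_button helper and the button names in COMMANDS are hypothetical; only the command bytes and the ~200ms press time come from the Arduino code. The port argument is assumed to be anything with a write method that accepts bytes, such as an open pySerial port.

```python
import time

# Command bytes taken from the Arduino switch statement above.
COMMANDS = {
    'power': b'p',
    'brew_10oz': b't',
    'brew_8oz': b'e',
}

PRESS_SECONDS = 0.2  # the Arduino holds each simulated press for ~200ms


def press_button(port, name, settle=PRESS_SECONDS):
    """Write one command byte to `port`, then wait for the press to finish.

    `port` is anything with a write(bytes) method, such as the pySerial
    object opened for the coffee maker.
    """
    if name not in COMMANDS:
        raise ValueError('unknown button: %r' % (name,))
    port.write(COMMANDS[name])
    time.sleep(settle)
```

Waiting out the press on the host is optional, since the Arduino's serial buffer holds pending bytes, but it keeps a burst of commands from piling up in that buffer.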

Of interest is the fact that we deliberately keep all the buttons in a high impedance state. Because the Keurig reads buttons using a resistive network, pushing or pulling stray current into it could easily cause the coffee maker to fail to register button presses. We do this by setting the pins to inputs when they are not being actively driven, and disabling the internal pull-ups.

Finally we have the often-found Arduino sketch components, setup and loop:

void setup() {
  Serial.begin(9600);
}

void loop() {
  handleInput();
  handleOutput();
  delay(1);
}

That covers the Arduino code. Next we hook this up to a Python interface to handle reading status, writing commands, taking a photo of the user, and handling RFID events.

Reading data from the RFID Reader and Coffee Maker

Reading serial data in Python is surprisingly easy; with the pySerial module it really is as simple as it can get. First we open two serial ports at once:

rfidSer = serial.Serial('/dev/ttyUSB0', 9600, timeout=1)
coffeeSer = serial.Serial('/dev/ttyUSB1', 9600, timeout=1)

Next we parse data from both the ports.

For the RFID Reader:

	if rfidSer.inWaiting() > 1:
		data = rfidSer.readline().strip()[1:-2]
		if data in authorizedUsers:
			handleUserAuth(authorizedUsers[data])
		else:
			handleAccessDenied()
		time.sleep(1)
		rfidSer.flushInput()

I wait a second and flush the buffer after reading the line. The reader sends ASCII data but puts start-of-text and end-of-text characters at the beginning and end that Python doesn’t parse naturally. Just clearing the buffer is sufficient to get rid of them.
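Here is a sketch of what the strip and slice are doing to one line from the reader, written for Python 3. The frame layout is an assumption on my part, taken from my reading of the ID-12 datasheet and not stated anywhere in this post: a start byte (0x02), 10 ASCII hex digits of tag data, 2 ASCII hex digits of checksum, CR, LF, and a stop byte (0x03). The checksum rule (XOR of the five data bytes) is likewise assumed, and parse_tag is a hypothetical helper.

```python
STX = 0x02  # "start of text" marker that opens each frame


def parse_tag(line):
    """Return the 10-character tag ID from one line off the reader, or None.

    Assumed frame layout (not stated in this post): STX, 10 ASCII hex
    digits of tag data, 2 ASCII hex digits of checksum, CR, LF, ETX.
    readline() stops at LF, so the ETX byte is left behind in the buffer.
    """
    body = line.strip()  # drops CR/LF but not the STX byte
    if len(body) != 13 or body[0] != STX:
        return None
    tag, checksum = body[1:-2], body[-2:]  # same slice as the code above
    try:
        data = bytes.fromhex(tag.decode('ascii'))
        expected = int(checksum.decode('ascii'), 16)
    except ValueError:
        return None
    # Assumed checksum rule: XOR of the five data bytes.
    actual = 0
    for b in data:
        actual ^= b
    return tag.decode('ascii') if actual == expected else None
```

If that layout is right, the stop byte left over after readline is the stray data that the flush clears out.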

For reference, the RFID reader is the ID-12 sold by SparkFun.

For the coffee maker:

	if coffeeSer.inWaiting() > 0:
		# Read the new status, detect differences, and fire
		# events accordingly.
		incomingByte = ord(coffeeSer.read())
		newCoffeeMakerStatus = {
			'isOn' : ((incomingByte >> 0) & 1) == 1,
			'isEightOzButtonPressed' : ((incomingByte >> 1) & 1) == 1,
			'isTenOzButtonPressed' : ((incomingByte >> 2) & 1) == 1,
			'isWarmedUp' : ((incomingByte >> 3) & 1) == 1,
			'isNeedWater' : ((incomingByte >> 4) & 1) == 1,
			'isReadyToBrew' : ((incomingByte >> 5) & 1) == 1
		}

This simply looks like a reversal of the C++ Arduino code I wrote earlier. I use a Python dict to keep track of the coffee maker state, and then test for differences after receiving each byte.
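The decode-then-diff step can be sketched as follows. The bit positions are the ones in the dict literal above; decode_status and changed_fields are hypothetical helper names, not functions from my script.

```python
# Bit positions taken from the dict literal above.
STATUS_BITS = {
    'isOn': 0,
    'isEightOzButtonPressed': 1,
    'isTenOzButtonPressed': 2,
    'isWarmedUp': 3,
    'isNeedWater': 4,
    'isReadyToBrew': 5,
}


def decode_status(byte):
    """Expand one status byte into a dict of booleans."""
    return {name: bool((byte >> bit) & 1) for name, bit in STATUS_BITS.items()}


def changed_fields(old, new):
    """Return {name: new_value} for every field that differs between two statuses."""
    return {name: value for name, value in new.items() if old.get(name) != value}
```

Each key returned by changed_fields is a candidate event: for example, isEightOzButtonPressed flipping to True means a user just asked for an 8oz brew.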

Taking a Photo

Last, but certainly not least, I wanted the Python script to take a photo of the coffee user. Luckily there’s a command-line tool available for this, streamer. We simply take a photo, save it to a directory, and add it to the top of a running HTML document.

def takePhoto():
	fileName = '-'.join((str(time.time()), currentUser))
	# os.system runs /bin/sh, which may not understand bash's '&>' redirect.
	os.system('streamer -q -o cphotos/%s.jpeg > /dev/null 2>&1' % fileName)
	# Write the new entry first, then append the old list so the newest photo is on top.
	f = open('userListAppend.html', 'w')
	f.write("""<table><tr><td align="center"><img width="330" src='cphotos/""" + fileName + """.jpeg'></td></tr><tr><td><b>User: </b> """ + currentUser + """<br><b>Date:</b> """ + str(datetime.datetime.now()) + """</td></tr></table>""")
	f.close()
	# '&&' makes cp wait for cat to finish; a single '&' would run them concurrently.
	os.system('cat userList.html >> userListAppend.html && cp userListAppend.html userList.html')
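The step that puts the new entry on top of the list could also be done in Python itself rather than through cat and cp. A minimal sketch, written for Python 3 and assuming the list file is small enough to read into memory; prepend_entry is a hypothetical helper, not part of my script:

```python
import os


def prepend_entry(list_path, entry_html):
    """Put `entry_html` at the top of the file at `list_path`, newest first."""
    old = ''
    if os.path.exists(list_path):
        with open(list_path) as f:
            old = f.read()
    tmp_path = list_path + '.tmp'
    with open(tmp_path, 'w') as f:
        f.write(entry_html + old)
    # Swap the finished file into place in one step.
    os.replace(tmp_path, list_path)
```

Writing to a temporary file and then renaming it means a page load never sees a half-written list.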

That’s pretty much it for the software section of this hack. There’s some glue code here and there I can make available upon request, but all the hard parts are laid out here.

In my implementation I went ahead and integrated our coffee maker with an existing system designed to track sensor events. The coffee maker posts to our event server, and the events are displayed on a large screen outside our lab.

Written by Andrew Robinson

December 31st, 2011 at 6:30 am

Posted in Uncategorized

Hacking the Keurig B40 Coffee Maker – Part 1 – Hardware

with 11 comments


The Keurig B40 coffee maker is a marvel of engineering. Inside you will find 2 pumps, 3 solenoids, 4 circuit boards, and an exorbitant number of tubes, sensors, and connectors. As a weekend project I’ve taken one of these beasts apart, documented it, and subsequently modified it to solve the complete lack of physical security and auditing, which I consider a serious flaw in the original design. K-cups are a precious commodity, with costs far exceeding that of traditional bulk-purchased coffee, and their usage must be carefully monitored to control costs. Previously we had no way to attribute usage to individuals, and so no way to ensure that contributions to the coffee fund were fair and proportional to individual consumption. In this day and age it is frightening to imagine that such a common device lacks the physical security interfaces required to properly solve this problem.

To rectify this situation I added an Arduino hardware interface and RFID scanner to track usage of the machine. We had previously distributed RFID devices to all lab members as part of another project to create a keyless entry system, so using RFID was the natural choice. The original brewing interface (push button on the machine) has been secured by disconnecting it and attaching it to the Arduino. Input is received there and processed by the netbook, which performs authentication and auditing, and authorizes brewing.

Disassembly

WARNING: Proceed at your own risk. A coffee maker, by definition, mixes water and electricity, which has generally been considered a bad thing. This coffee maker, in addition to making delicious, bold blends, could easily kill you or permanently damage whatever you might connect to it.

Written by Andrew Robinson

December 27th, 2011 at 5:45 pm

Posted in Uncategorized