<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Moving Forward</title>
	<atom:link href="http://andrewbrobinson.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://andrewbrobinson.com</link>
	<description>Homepage of Andrew Robinson</description>
	<lastBuildDate>Tue, 01 May 2012 18:42:49 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Stages of Exception Handling Maturity</title>
		<link>http://andrewbrobinson.com/2012/05/01/stages-of-exception-handling-maturity/</link>
		<comments>http://andrewbrobinson.com/2012/05/01/stages-of-exception-handling-maturity/#comments</comments>
		<pubDate>Tue, 01 May 2012 18:42:49 +0000</pubDate>
		<dc:creator>Andrew Robinson</dc:creator>
				<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://andrewbrobinson.com/?p=978</guid>
		<description><![CDATA[Exceptions always seemed weird to me. Why would I want to catch an exception, when with proper coding they shouldn&#8217;t occur in the first place? I felt at first like exceptions were designed into the language for someone else to use, and even when I did want to make use of them I found them [...]]]></description>
			<content:encoded><![CDATA[<p>Exceptions always seemed weird to me. Why would I want to catch an exception, when with proper coding they shouldn&#8217;t occur in the first place? I felt at first like exceptions were designed into the language for someone else to use, and even when I did want to make use of them I found them awkward and cumbersome. </p>
<p>After resisting exceptions for a while, I started to come around. It became clear that even if code is perfect (which it rarely is), there is still a pile of network- and system-related errors that can occur. While one way to handle these errors is to add return codes to all functions that happen to touch the outside world, a perhaps cleaner way is to throw an exception. All of a sudden, exceptions became cool.</p>
<p>Knowing that exceptions have a time and place in a program, I started designing all kinds of complex exception handling machinery. My code was destined to handle all exceptions with grace and ease, and easily recover. An exception handler would in turn try another body of code that could throw an exception, which would enter a wait state before retrying and catching another exception and so on. It was quite a mess, and the more I developed these techniques, the more I became acutely aware that this wasn&#8217;t really the right way to do things either. Applications simply can&#8217;t recover from every possible scenario. </p>
<p>A few years and a couple of paradigm shifts later I&#8217;m finally starting to see what the point of catching an exception is, why they are useful, and what to do when one occurs. The bottom line is that the point of catching an exception is to limit damage when something unexpected happens in your code, get back to a known state if possible, or abort if not. Exceptions do not make your code bullet-proof; however, they do a great job of mending wounds and keeping the system up despite unexpected occurrences. Nowadays my exception handling follows a pretty rigid two-phase pattern, restoring state and then restoring flow control, and this works reasonably well. </p>
<h3>Restoring State</h3>
<p>The most immediate goal of an exception handler is to return the application to a known state. This involves performing any necessary maintenance on modified data structures, releasing any unmanaged resources, and deleting any temporary files that were created during the execution of the routine. </p>
<p>I&#8217;ve been doing a lot of server work lately, and this usually consists of another try block to close any open database connections, and an iteration through a list of temporary files. Methods in my program that use intermediate files keep a list of the temporary files they create, and my error handling routine iterates through this list and deletes any files still in existence.</p>
<p>In other cases I end up deleting items from a <code>HashMap</code> structure, where the value of the entry points to an incompletely initialized object. Either way, after this part of the exception routine is done, the application should be in a state identical to before it entered the try block. </p>
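<p>As a rough illustration of this phase, here is a minimal sketch. The class <code>StateRestoringTask</code>, its fields, and the simulated failure are all hypothetical names made up for the example, not code from a real service.</p>

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical task that registers an entry and creates a scratch file.
class StateRestoringTask {
    private final Map<String, Object> registry = new HashMap<String, Object>();
    private final List<File> tempFiles = new ArrayList<File>();

    // Runs the work; on failure, undoes every side effect made so far.
    boolean run(String key, boolean failHalfway) {
        try {
            registry.put(key, new Object());        // entry is not fully set up yet
            File scratch = File.createTempFile("task-", ".tmp");
            tempFiles.add(scratch);                 // remember it for cleanup
            if (failHalfway) {
                throw new IOException("simulated failure");
            }
            return true;
        } catch (IOException e) {
            restoreState(key);
            return false;
        }
    }

    // Returns the object to the state it had before run() was entered.
    private void restoreState(String key) {
        registry.remove(key);                       // drop the half-built entry
        for (File f : tempFiles) {
            if (f.exists() && !f.delete()) {
                f.deleteOnExit();                   // best-effort fallback
            }
        }
        tempFiles.clear();
    }

    boolean isRegistered(String key) { return registry.containsKey(key); }

    int trackedTempFiles() { return tempFiles.size(); }
}
```

<p>The point of the sketch is only the shape: every side effect is recorded as it happens, so the handler can walk the record backwards without guessing what was done.</p>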
<h3>Restoring Flow Control</h3>
<p>After restoring the application&#8217;s state, the next problem is where to transfer flow control to. I&#8217;ve found that exceptions usually break down into two pretty clean categories. </p>
<p><i>Temporary exceptions</i>  occur when servers go down, or network requests fail. They are often transient, and upon retrying might succeed. For these kinds of exceptions I usually implement some sort of retry loop, with a fixed number of retries allotted based on the timing requirements of the function I&#8217;m in. </p>
<p><i>Permanent exceptions</i> involve events that are not likely to &#8216;fix themselves&#8217; if given time. File permissions issues, logic errors, and most other non-IO exceptions fall into this category. When I encounter this kind of exception the show is basically over. If the exception occurs in a reasonably compartmentalized task, as is often the case, the entire program doesn&#8217;t need to cease execution (we did, after all, restore it to a known state), but we do need to abort the current task and present a failure message to the user. </p>
<p>When a temporary exception times out after a certain number of retries it too becomes a permanent exception. </p>
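<p>The retry rule above can be sketched roughly as follows. <code>NetworkTask</code>, <code>PermanentFailureException</code>, and <code>Retrier</code> are hypothetical names, and treating every <code>IOException</code> as temporary is a simplifying assumption.</p>

```java
import java.io.IOException;

// Hypothetical unit of work that may fail with a transient I/O error.
interface NetworkTask {
    String call() throws IOException;
}

// Hypothetical exception marking a failure that retrying will not fix.
class PermanentFailureException extends Exception {
    PermanentFailureException(String message, Throwable cause) {
        super(message, cause);
    }
}

class Retrier {
    // Tries the task up to maxAttempts times, pausing between attempts.
    // Once the attempts run out, the temporary failure becomes permanent.
    static String runWithRetries(NetworkTask task, int maxAttempts, long pauseMillis)
            throws PermanentFailureException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (IOException e) {
                last = e;                           // assumed to be temporary
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;                          // stop retrying when interrupted
                }
            }
        }
        throw new PermanentFailureException("retries exhausted or interrupted", last);
    }
}
```

<p>The attempt count and pause would be chosen from the timing requirements of the calling function; nothing here picks them for you.</p>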
<h3>Exception Hierarchy </h3>
<p>In this model it&#8217;s actually really reasonable to throw in a hierarchical approach to exceptions as well. It&#8217;s common in the Java world to catch an exception, only to throw another one. For example a class named <code>MagicMoneyWorker</code> might throw <code>MagicMoneyWorkerException</code> exceptions as a result of network timeouts when performing remote procedure calls. It has wrapped the more low-level exception with a high level one.</p>
<p>The reasons for doing this fit well with the object oriented approach to programming. The <code>MagicMoneyWorker</code> class probably knows best how to handle a network timeout. It knows how to preserve its inner state, and recover, and probably even has some retry and reconnect logic built into it. By the time it&#8217;s done handling the exception, if an error occurs, it&#8217;s a permanent exception and it means that the task at hand failed. This needs to be handled now by the caller function. </p>
<p>Wrapping exceptions in this manner allows you to keep the primitive exceptions hidden, and preserves the level of abstraction you&#8217;re operating with. If you&#8217;re writing a function that deals with <code>MagicMoneyWorker</code>s on a high level, you don&#8217;t want to have to dive into the bowels of the implementation. It makes much more sense for an exception with a matched level of abstraction to be thrown, and for your function to act on this exception. </p>
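<p>A minimal sketch of the wrapping idea follows. The class names come from the example above, but the constructor, the <code>balance()</code> method, and the fake remote call are assumptions invented for illustration.</p>

```java
import java.net.SocketTimeoutException;

// High-level exception; the (message, cause) constructor is an assumption.
class MagicMoneyWorkerException extends Exception {
    MagicMoneyWorkerException(String message, Throwable cause) {
        super(message, cause);
    }
}

class MagicMoneyWorker {
    private final boolean networkDown;   // stand-in for a real remote endpoint

    MagicMoneyWorker(boolean networkDown) {
        this.networkDown = networkDown;
    }

    // Hypothetical remote procedure call that can time out.
    private int fetchBalanceRemotely() throws SocketTimeoutException {
        if (networkDown) {
            throw new SocketTimeoutException("read timed out");
        }
        return 42;
    }

    // Callers only ever see the high-level exception type.
    int balance() throws MagicMoneyWorkerException {
        try {
            return fetchBalanceRemotely();
        } catch (SocketTimeoutException e) {
            // Keep the low-level cause attached for debugging.
            throw new MagicMoneyWorkerException("could not fetch balance", e);
        }
    }
}
```

<p>Passing the original exception as the cause keeps the stack trace available in logs while the caller still programs against a single exception type.</p>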
<h3>Simply my Personal Approach</h3>
<p>I doubt this approach will work well for all systems or applications, but I&#8217;ve found it works reasonably well for my development lately. Throughout my computer science education there hasn&#8217;t been much focus on how to handle exceptions. I think they are considered more of an engineering problem, because the code we deal with is so algorithmic in nature, and brushed under the rug as something you&#8217;ll pick up over time. Developing a personal framework for how to handle an exception, and giving a rigid procedure, purpose, and ordering to the process has tremendously helped hone my skills in developing real world applications. If you don&#8217;t have a clear idea of how to deal with exceptions in your own projects, I would definitely recommend spending some time to meditate on them and figure out what the goals of catching an exception are for you. </p>
]]></content:encoded>
			<wfw:commentRss>http://andrewbrobinson.com/2012/05/01/stages-of-exception-handling-maturity/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>AudioDAQ: A Low-Power Phone Peripheral Interface</title>
		<link>http://andrewbrobinson.com/2012/04/22/audiodaq-a-low-power-phone-peripheral-interface/</link>
		<comments>http://andrewbrobinson.com/2012/04/22/audiodaq-a-low-power-phone-peripheral-interface/#comments</comments>
		<pubDate>Sun, 22 Apr 2012 06:48:27 +0000</pubDate>
		<dc:creator>Andrew Robinson</dc:creator>
				<category><![CDATA[Android]]></category>
		<category><![CDATA[Research]]></category>

		<guid isPermaLink="false">http://andrewbrobinson.com/?p=952</guid>
		<description><![CDATA[I&#8217;m just finishing up a trip to IPSN&#8217;12 in Beijing this week where I presented a demonstration of my current research project. It&#8217;s a pretty cool piece of work. We wanted to develop a device that plugs into the headset port of a mobile phone and allows for capture of analog signals. To that end [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://andrewbrobinson.com/wp-content/uploads/2012/04/audiodaq-final-1024x647.jpg" alt="" title="audiodaq-final" width="600" class="aligncenter size-large wp-image-960" /></p>
<p>I&#8217;m just finishing up a trip to IPSN&#8217;12 in Beijing this week where I presented a demonstration of my current research project. It&#8217;s a pretty cool piece of work. We wanted to develop a device that plugs into the headset port of a mobile phone and allows for capture of analog signals. To that end we developed a hardware and software system called AudioDAQ that does just that. </p>
<p>The system consists of a small square-inch form factor piece of hardware and server side software for processing. It allows for arbitrary analog waveforms to be exported over the microphone line in the audible range, and is powered entirely from the microphone bias voltage. This allows for data to be captured by the built-in voice recording application of the phone, and makes the system compatible with virtually every handset on the market today with no software needed on the phone-side.</p>
<p>Data captured in a voice recording is sent to a remote server, where an algorithm extracts the original signal and produces a plot of the data, along with a comma-separated-value file of the data points.</p>
<p>I&#8217;ll dive into a little bit of the design of this system. I think it&#8217;s a really good example of something that&#8217;s cleverly simple. Despite the system being fairly simple, we encountered significant design challenges while fine-tuning system parameters to optimize power transfer and data recovery. </p>
<h2>Delivering Power</h2>
<p><img src="http://andrewbrobinson.com/wp-content/uploads/2012/04/power-300x134.jpg" alt="" title="power" width="300" height="134" class="alignleft size-medium wp-image-964" /><br />
The AudioDAQ hardware consists of a small number of analog and digital components that require small amounts of power to operate. Additionally we wanted to provide enough power for small active sensors to operate. Most transducers only require a small number of active op-amps to operate which, if efficiently designed, will only draw a couple hundred microwatts of power. Because of this the microphone bias voltage was a suitable candidate for powering the system.</p>
<p>The microphone bias voltage is traditionally used to power small amounts of amplification circuitry in modern microphones found in hands-free pieces. It has been found to be around two DC volts on most surveyed handsets, and sits behind a high-impedance resistor that prevents too much current from flowing (R1 in the photo above). Because of this resistor it does not provide too much power, usually on the order of hundreds of microwatts. </p>
<p>We feed the microphone bias voltage into a small ultra low dropout linear regulator to ensure it is a consistent 1.8V. This becomes important because we also use this voltage as a reference voltage for our system. </p>
<h2>Capturing Data</h2>
<p><img src="http://andrewbrobinson.com/wp-content/uploads/2012/04/multiplexer-300x169.jpg" alt="" title="multiplexer" width="300" height="169" class="alignleft size-medium wp-image-963" /><br />
Next we must capture data for recording. The microphone port cannot simply be fed a DC-valued analog signal. It has a high-pass filter, formed by C1 and R2 in the first diagram, that prevents DC values from making their way through the system. To overcome this we go with a simple, effective solution: installing an analog multiplexer to switch between system ground and the signal at a speed within the audible passband. </p>
<p>This multiplexer creates a square wave with an amplitude reflective of the DC value of the original signal. Phone analog front ends all have different characteristic values however, and while the magnitude is proportional to the voltage of the analog sensor signal, it does not have an implicit scale. To fix this we additionally export the reference voltage across the microphone port. By switching between ground, the signal, and a reference voltage we can determine where the signal sits between the reference voltage and ground, and scale accordingly. </p>
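<p>The scaling step can be written down in a few lines. This is only a sketch: <code>RatiometricScaler</code> and its parameter names are hypothetical, and the three recorded levels are assumed to have already been measured from the audio in arbitrary units.</p>

```java
class RatiometricScaler {
    // Converts a recorded signal level into volts by locating it between the
    // recorded ground level and the recorded reference level.
    static double toVolts(double groundLevel, double signalLevel,
                          double referenceLevel, double referenceVolts) {
        double span = referenceLevel - groundLevel;
        if (span == 0.0) {
            throw new IllegalArgumentException("reference level equals ground level");
        }
        return referenceVolts * (signalLevel - groundLevel) / span;
    }
}
```

<p>With the 1.8V reference, a signal level that sits exactly halfway between the recorded ground and reference levels maps to 0.9V, regardless of the gain of the phone front end.</p>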
<p>To further extend the system we can add multiple channels of inputs as shown in the diagram above. This gives us the ability to simultaneously capture multiple analog signals. </p>
<p>An interesting design challenge was managing the tradeoff between signal fidelity and energy delivery. The amplitude of a microphone signal is approximately 10mV, which is quite small. Adding a linear regulator and a small amount of capacitance will easily drown out that signal by adding additional noise and attenuating it. To negate this we installed an additional resistor between the linear regulator and the microphone line. Sizing it carefully yields a good tradeoff between power delivery, and the isolation of the microphone line from noise from the linear regulator. </p>
<p>Data is captured with the voice recording app on the phone and e-mailed to our server for decoding. Because almost every phone manufactured recently has a built-in voice memo application the AudioDAQ platform is compatible with a large base of existing devices.</p>
<h2>Processing the Data</h2>
<p><img src="http://andrewbrobinson.com/wp-content/uploads/2012/04/restruct-300x150.png" alt="" title="restruct" width="300" height="150" class="alignleft size-medium wp-image-972" /></p>
<p>Finally we must reconstruct the signal. Currently a small piece of Python code does this. The code grabs the multiplexed, encoded audio data, and extracts framing information from it. You can see the intermediate steps of the algorithm in the figure. It first detects the edges, finds a mean value to represent the steps between the edges, measures these values, and finally reconstructs the original signal from them. In practice this works wonderfully, with the original signal being easily recovered. </p>
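<p>The Python code itself is not reproduced here, so the following is only a rough Java sketch of the edge detection and averaging steps. <code>StepExtractor</code>, the fixed threshold, and the assumption of flat, noise-free plateaus are all simplifications made for the example.</p>

```java
import java.util.ArrayList;
import java.util.List;

class StepExtractor {
    // Splits the samples wherever two neighbours differ by more than
    // edgeThreshold, and returns the mean of each run between edges.
    static List<Double> plateauMeans(double[] samples, double edgeThreshold) {
        List<Double> means = new ArrayList<Double>();
        if (samples.length == 0) {
            return means;
        }
        double sum = samples[0];
        int count = 1;
        for (int i = 1; i < samples.length; i++) {
            if (Math.abs(samples[i] - samples[i - 1]) > edgeThreshold) {
                means.add(sum / count);             // edge found: close current run
                sum = 0.0;
                count = 0;
            }
            sum += samples[i];
            count++;
        }
        means.add(sum / count);                     // close the final run
        return means;
    }
}
```

<p>The resulting list of step values would then be grouped into frames of ground, signal, and reference levels before scaling; that grouping is left out of the sketch.</p>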
<h2>In Closing</h2>
<p>AudioDAQ works really well and has been a big focus of my work for the past few months. You can read the published demo paper <a href='http://andrewbrobinson.com/wp-content/uploads/2012/04/paper.pdf'>here</a>. Signal reconstruction obtains good quality results and the technology is fairly well developed. If you have any interest in using AudioDAQ in your projects, feel free to contact me and I&#8217;d be more than happy to send you some schematics and code!</p>
]]></content:encoded>
			<wfw:commentRss>http://andrewbrobinson.com/2012/04/22/audiodaq-a-low-power-phone-peripheral-interface/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Reasoning About Phone TCP Performance</title>
		<link>http://andrewbrobinson.com/2012/04/13/reasoning-about-phone-tcp-performance/</link>
		<comments>http://andrewbrobinson.com/2012/04/13/reasoning-about-phone-tcp-performance/#comments</comments>
		<pubDate>Fri, 13 Apr 2012 14:40:39 +0000</pubDate>
		<dc:creator>Andrew Robinson</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Android]]></category>

		<guid isPermaLink="false">http://andrewbrobinson.com/?p=907</guid>
		<description><![CDATA[Here&#8217;s a question that has often been asked, but seldom been answered: how does one send an image to or from a phone as efficiently and quickly as possible? More generally, how can we reason about the behavior of TCP/IP as implemented on a modern smart phone? Phones don&#8217;t seem to play by the rules [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a question that has often been asked, but seldom been answered: how does one send an image to or from a phone as efficiently and quickly as possible? More generally, how can we reason about the behavior of TCP/IP as implemented on a modern smart phone? Phones don&#8217;t seem to play by the rules which TCP was designed for: we have a terribly unreliable channel pushing packets at random, with terrible delivery rates. This is exactly the kind of scenario that would wreak havoc with congestion avoidance and control algorithms, when in reality there might not be any valid reason to artificially throttle the connection. Not much work on how TCP performs in the mobile world has made its way into commercial devices, and I think it&#8217;s a really big open question that deserves some attention. </p>
<p>To that end I&#8217;ve started looking at characterizing TCP delivery on my mobile phone, in the hopes that from this characterization I can develop some more efficient algorithms, and just get an intuitive sense for what can and cannot be offloaded to the server. A big idea that has started gaining some traction lately is heterogeneous computing, where computationally expensive tasks are offloaded from your phone to the cloud for processing. Knowing the abilities of the network data channel will play a big role in determining the feasibility of this, and many other ideas.</p>
<h3>Let&#8217;s Transfer Some Images</h3>
<p>My current side project is centered around manipulating images on the mobile phone, or in the cloud, so a good starting point in this study was to see how long it takes to upload and download images of various sizes, over HTTP. I used three image sizes:</p>
<ul>
<li><b>Small</b> &#8211; 11KB</li>
<li><b>Medium</b> &#8211; 51KB</li>
<li><b>Large</b> &#8211; 86KB</li>
</ul>
<p>I took these sizes from the three thumbnail sizes Facebook&#8217;s Haystack image store uses to present images to the end user, as those sizes seem to be working reasonably well for them. A large image is by no means a full-sized image, but it&#8217;s more than big enough to display on any smartphone screen, with some zooming supported. </p>
<p>Because this is a first pass, the implementation is pretty rough, and leaves room for optimization. My goal was not to find the optimal way to send an image across the wire, but to characterize the most obvious way a typical developer might do it. I used the pre-bundled HTTP classes, and did everything the obvious way. I wrote two functions and wrapped them up in an Android service, with some logging facilities.</p>
<p>The functions themselves looked like this:</p>
<h4>Downloading</h4>
<pre class="brush: java; title: ;">
        private int doDownload(String path) {
            long startTime = System.currentTimeMillis();

            try {
                URL url = new URL(path);
                URLConnection connection = url.openConnection();

                // getting file length, will cause the first
                // read of the HTTP headers
                int lengthOfFile = connection.getContentLength();

                // create an input stream to grab data from the connection
                InputStream input = new BufferedInputStream(connection.getInputStream(), 8192);

                // do a read to null basically and grab the entire image, forcing
                // the internals of the stream implementations to do what they must
                // do to push the image to us.
                byte data[] = new byte[1024];
                long total = 0;
                int count;
                while ((count = input.read(data)) != -1) {
                    total += count;
                }

                // close the connection
                input.close();
            } catch (Exception e) {
                return -1;
            }

            long stopTime = System.currentTimeMillis();

            return (int) (stopTime - startTime);
        }
</pre>
<h4>Uploading</h4>
<pre class="brush: java; title: ;">
        private int doUpload(String filePath, String path) {
            long startTime = System.currentTimeMillis();

            try {
                // No simple way to do a multipart post, so we
                // build it from scratch in a sense.
                HttpClient httpClient = new DefaultHttpClient();
                HttpContext localContext = new BasicHttpContext();
                HttpPost httpPost = new HttpPost(path);

                MultipartEntity entity = new MultipartEntity(HttpMultipartMode.BROWSER_COMPATIBLE);
                entity.addPart(&quot;image&quot;, new FileBody(new File (filePath)));

                httpPost.setEntity(entity);
                HttpResponse response = httpClient.execute(httpPost, localContext);
            } catch (IOException e) {
                e.printStackTrace();
                return -1;
            }

            long stopTime = System.currentTimeMillis();

            return (int) (stopTime - startTime);
        }
</pre>
<p>Quite frankly, they leave a lot of questions to be answered. To really understand what&#8217;s going on, we need to dive into the internals of the <code>BufferedInputStream</code> class, and do some TCP dump captures of the packet data. It&#8217;s of great interest to see how the request headers and body are broken up; controlling TCP packet segmentation could have big gains when transferring images. As I said before however, this is only preliminary exploration, so we&#8217;ll leave all that to be answered in the future.</p>
<h3>Results</h3>
<p>I left my sample application service running on my phone as I went for a run over the course of two hours in the city. I figured this would give me a decent spread of typical mobile conditions, in a diverse set of areas, and give decent data. I collected approximately 1500 data points and produced some histograms of the data.</p>
<p><img src="http://andrewbrobinson.com/wp-content/uploads/2012/04/analysis.png" alt="" title="analysis" width="680" height="800" class="aligncenter size-full wp-image-927" /></p>
<p>Taking a look at the data already yields some really interesting insights. Images are generally transferred within 2-3 seconds in the common case, with an extremely long tail of extended delivery. Interestingly enough I didn&#8217;t have a single timeout, but some images took over a minute to transfer, which is completely unacceptable for the end-user experience, considering the size of the images involved. </p>
<p>The difference in size as compared to transfer time was approximately linear. Uploading of the small image seemed to exceed expectations, and didn&#8217;t actually have much of a long tail like other transfers. I&#8217;m guessing the length of the tail is a function of the number of round-trips required to successfully transfer the data. Examining the packet flow of these images would yield some interesting insight into that relationship. </p>
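<p>One way to put numbers on the common case versus the long tail is to compare the median with a high percentile of the logged timings. The sketch below uses the nearest-rank method; <code>TransferStats</code> is a hypothetical helper, and it assumes failed transfers (the <code>-1</code> return values from the functions above) have already been filtered out.</p>

```java
import java.util.Arrays;

class TransferStats {
    // Nearest-rank percentile (p in the range 0 to 100) of timings in milliseconds.
    static int percentile(int[] timingsMs, double p) {
        if (timingsMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        int[] sorted = timingsMs.clone();           // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

<p>Calling <code>percentile(timings, 50)</code> gives the typical transfer time, and a large gap between that and <code>percentile(timings, 99)</code> would be the long tail showing up in a single pair of numbers.</p>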
<h3>Conclusions</h3>
<p>So this really just creates a bunch of further questions. It&#8217;s a really high-level view of delivery times for small files on the phone, but closer examination could yield some more interesting information. I&#8217;m really curious about applications of this knowledge in designing efficient image uploading algorithms. Relaxing some of the guarantees of TCP might allow for significantly faster image upload, and data transfers more suited for mobile apps. Expect to see more along this vector in the blog in the future! </p>
]]></content:encoded>
			<wfw:commentRss>http://andrewbrobinson.com/2012/04/13/reasoning-about-phone-tcp-performance/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
		<item>
		<title>Lightweight Java Development: Resist the Frameworks</title>
		<link>http://andrewbrobinson.com/2012/04/09/lightweight-java-development-resist-the-frameworks/</link>
		<comments>http://andrewbrobinson.com/2012/04/09/lightweight-java-development-resist-the-frameworks/#comments</comments>
		<pubDate>Mon, 09 Apr 2012 17:08:46 +0000</pubDate>
		<dc:creator>Andrew Robinson</dc:creator>
				<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://andrewbrobinson.com/?p=899</guid>
		<description><![CDATA[Making the transition from JavaScript&#8217;s eerily flexible dynamic typing system, where anything is possible, and everything is loosely specified, to the world of Java programming has been jarring. I&#8217;ve started working on some significant backend services, where the workflow demanded stronger reliability guarantees. Java was the natural choice for these services, being a strongly typed [...]]]></description>
			<content:encoded><![CDATA[<p>Making the transition from JavaScript&#8217;s eerily flexible dynamic typing system, where anything is possible, and everything is loosely specified, to the world of Java programming has been jarring. I&#8217;ve started working on some significant backend services, where the workflow demanded stronger reliability guarantees. Java was the natural choice for these services: being a strongly typed language, it gives much stronger compile-time guarantees of program correctness and prevents unexpected runtime bugs from popping up quite as frequently. Additionally a number of attractive Java-specific libraries that perform much of what I wish to accomplish are readily available, making Java even more attractive.</p>
<p>Java really is a nice language for all the criticism it receives. While it doesn&#8217;t quite match the &#8216;batteries-included&#8217; model of Python, the built-in standard library is excellent, striking a nice balance between generality and performance. The standard containers are well-matched and don&#8217;t try to hide the underlying data structures, which is appreciated. There&#8217;s a vast number of libraries available that can perform almost any task for you. Even concurrency support is decent, with the excellent <code>java.util.concurrent</code> library providing many thread-safe collections, and monitor/lock semantics for tightly coupled threading. </p>
<p>The garbage-collected nature of the runtime environment, while sacrificing some performance, takes a lot of focus away from low-level details and memory management, which is not easy or directly related to the goals we often wish to accomplish as programmers. The syntax is a little verbose, but it does the job and scores major points for being not Visual Basic or Cobol. </p>
<p>Java by itself is really not that bad, however, anytime I&#8217;ve gone near a certain corner of the Java world I&#8217;ve found myself irritated. In my opinion, by far the largest problem with the Java language is the attachment of this entire enterprise mindset. The entire Java enterprise ecosystem feels so bloated by frameworks and technologies that it&#8217;s almost unusable. At multiple points in my life I&#8217;ve spent a few hours trying to wrap my head around this strange world, but even creating a definition for what exactly the composition of a J2EE application is seems like a venture in self-torment. I feel as if a lot of this stuff exists for the sole purpose of making the ecosystem more complex and raising the barrier to entry to ensure that developers need additional training to work competently within it. </p>
<p>Coming from the recent work I&#8217;ve done in the Node.js ecosystem, where developers focus on keeping libraries intentionally small and compact, and making obvious, light-weight APIs, the world of heavy Java frameworks seems silly. I do not need a <code>GenericToasterDataSourceFactorAdapter</code>, I need a <code>Toaster</code>, and would prefer to instantiate one directly. </p>
<p>So far I&#8217;ve started to take a minimalistic approach to Java programming with some great success. I&#8217;m not using a framework to manage my database connections or message passing servers; I&#8217;m using the connection objects directly. I&#8217;ve written thin wrappers around them to handle basic reconnection logic, and I handle connection pooling with existing libraries such as <a href="http://sourceforge.net/projects/c3p0/">c3p0</a> when needed. When a connection-level error occurs it&#8217;s much more obvious what should be done this way. I can take context-specific actions based on the demands of my application when a service goes down and perform actions such as simply blocking until the service becomes available, or silently dropping the message in cases of logging and other areas that only need soft delivery guarantees. Threading is done with a thread pool that can be tuned to the needs of my application, and messages are passed directly from the RabbitMQ object, with some glue code managing loss of connection for me as well. This all is working really, really well, and I&#8217;ve been thrilled with the performance and reliability of the service in practice. </p>
<p>My original approach involved picking a framework and building on top of it. I found myself spending so much time learning the specifics of that framework, and looking up the syntax and design patterns, when what I wanted to accomplish was not nearly as complicated as the intricate dances I had to play with the framework itself. I really wouldn&#8217;t recommend this to anyone; it was tedious and frustrating. </p>
<p>What I would recommend is keeping your application reasonably decoupled. Services like RabbitMQ allow for really nice decoupling of the various aspects of application development, while solving a lot of the typical problems for you. Use messaging like this, and build small, purposeful services for executing specific tasks. Don&#8217;t be an enterprise architect and end up with a <code>GenericFactoryWorkerApplication</code>. Build the application itself, and fall back on frameworks and other enterprisey stuff if it becomes obvious the requirements are so great that your ability to implement them in light-weight code is not sufficient. In practice you&#8217;ll be surprised how seldom this happens. </p>
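<p>To give a feel for what such a thin wrapper might look like, here is a minimal sketch. <code>MessageSink</code> and <code>ThinPublisher</code> are hypothetical names invented for the example; they are not part of the RabbitMQ client or any other library, and a real wrapper would sit on top of the actual connection object.</p>

```java
import java.io.IOException;

// Hypothetical minimal view of a message broker connection.
interface MessageSink {
    void send(String message) throws IOException;
    void reconnect() throws IOException;
}

class ThinPublisher {
    private final MessageSink sink;

    ThinPublisher(MessageSink sink) {
        this.sink = sink;
    }

    // Soft guarantee: try once, reconnect once, then drop the message.
    boolean publishBestEffort(String message) {
        try {
            sink.send(message);
            return true;
        } catch (IOException first) {
            try {
                sink.reconnect();
                sink.send(message);
                return true;
            } catch (IOException second) {
                return false;                       // dropped; fine for log lines
            }
        }
    }

    // Hard guarantee: block until the message has been handed over.
    void publishBlocking(String message, long pauseMillis) throws InterruptedException {
        while (true) {
            try {
                sink.send(message);
                return;
            } catch (IOException e) {
                Thread.sleep(pauseMillis);
                try {
                    sink.reconnect();
                } catch (IOException ignored) {
                    // Still down; the next loop iteration tries again.
                }
            }
        }
    }
}
```

<p>The two methods make the delivery policy an explicit choice at each call site, which is the kind of context-specific decision a generic framework tends to hide.</p>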
]]></content:encoded>
			<wfw:commentRss>http://andrewbrobinson.com/2012/04/09/lightweight-java-development-resist-the-frameworks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Even PayPal doesn&#8217;t Handle Edge Cases Properly</title>
		<link>http://andrewbrobinson.com/2012/04/01/even-paypal-doesnt-handle-edge-cases-properly/</link>
		<comments>http://andrewbrobinson.com/2012/04/01/even-paypal-doesnt-handle-edge-cases-properly/#comments</comments>
		<pubDate>Sun, 01 Apr 2012 19:23:40 +0000</pubDate>
		<dc:creator>Andrew Robinson</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://andrewbrobinson.com/?p=890</guid>
		<description><![CDATA[While working on web-based systems one of the most frustrating aspects of development is catching all the edge cases. It&#8217;s not enough to simply anticipate the common case, or design around the expected use case. In the wild world of the web user input is messy and dirty and requires validation, and the code must [...]]]></description>
			<content:encoded><![CDATA[<p>While working on web-based systems one of the most frustrating aspects of development is catching all the edge cases. It&#8217;s not enough to simply anticipate the common case, or design around the expected use case. In the wild world of the web, user input is messy and dirty and requires validation, and the code must have defined return paths for all possible error conditions. On the backend of the system, if the database server goes down, your application should fail gracefully with an appropriate error message to the end-user while trying to reestablish a connection to the server, and monitoring machinery should notify the administrator of the outage, and perhaps take preliminary actions to try and restore service. On the frontend of the system, no matter what user input is received, it should be validated against what is expected for that field, escaped properly to avoid any cross-site scripting or code injection vulnerabilities, and formatted in a uniform manner for storage and display. </p>
<p>To this end, a great deal of the code written will be dealing with edge cases, and in my opinion one of the best measures of the maturity of a web-based development effort is how well it handles odd or malformed input. One of the best examples of a situation where this is especially evident is how well a service handles e-mail addresses. <a href="http://tools.ietf.org/html/rfc2822#section-3.4.1">RFC 2822</a> gives the formal definition for what is allowed in an e-mail address, and the brave have even implemented the standard in a machine-readable fashion, creating a beautiful, legible regular expression:</p>
<pre class="brush: plain; title: ;">
(?:[a-z0-9!#$%&amp;'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&amp;'*+/=?^_`{|}~-]+)*|&quot;(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*&quot;)@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
</pre>
<p>Fortunately most e-mail addresses only exist in a subset of this standard, containing mostly alphanumeric identifiers, with the occasional period thrown in, and many services limit e-mail addresses to this subset. For many e-mail services there&#8217;s also a feature that allows for sub-addressing of an e-mail address with a plus or hyphen operator. The most widely used case of this is Gmail allowing for a plus sign and any identifier to be appended to your e-mail address. For example, if my address were <code>joe.smith@gmail.com</code> I could also use <code>joe.smith+work@gmail.com</code> to direct e-mail to my inbox, and optionally do some filtering based on the To address field, using that tag. </p>
<p>The plus sign is an interesting case that&#8217;s often handled poorly. I&#8217;ve seen behaviors ranging from rejecting the perfectly valid e-mail address (which is almost justifiable, since users aliasing their address with a sub-address are more than likely trying to create multiple accounts), to allowing the account to be created, but never allowing one to log into the account. My most recent experience with this odd behavior has come from PayPal. </p>
<p>I don&#8217;t typically consider PayPal a company that is on top of things technologically speaking. Their service is slow, with requests often timing out, and their APIs are terrible (see <a href="https://stripe.com">Stripe</a> for an example of payment APIs done right), but I generally expect above all else for their services to be correct in their behavior.</p>
<p>For unknown reasons my previous PayPal account was placed on hold pending confirmation of my address. This confirmation process can only be done via home phone or by shipping some sort of post card to my residence. I don&#8217;t really care enough to complete this process, and almost every bank account and credit card attached to the previous account has expired or been closed, so I&#8217;d rather just register a new account. I decided to take a shot in the dark and try the e-mail sub-addressing trick: I swiftly added a <code>+1</code> to my e-mail address and registered again. Surprisingly PayPal allowed this sketchy behavior, and merrily sent me on my way. However, when they sent me the activation e-mail things didn&#8217;t go quite so well:</p>
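<p>To make the sub-addressing idea concrete, here is a minimal sketch that splits an address into its base mailbox and tag. It assumes Gmail-style plus tags only, and the function name <code>splitSubAddress</code> is made up for illustration:</p>
<pre class="brush: jscript; title: ;">
// Split an address into its base mailbox and optional plus tag.
// Assumes Gmail-style sub-addressing; other providers may use a
// different separator, such as a hyphen.
function splitSubAddress(address) {
    var at = address.lastIndexOf('@');
    if (at === -1) {
        return null;
    }
    var local = address.slice(0, at);
    var domain = address.slice(at + 1);
    var plus = local.indexOf('+');
    if (plus === -1) {
        // no tag present, the address is already the base mailbox
        return {base: address, tag: null};
    }
    return {
        base: local.slice(0, plus) + '@' + domain,
        tag: local.slice(plus + 1)
    };
}

console.log(splitSubAddress('joe.smith+work@gmail.com'));
</pre>
<p>A service that wants to stop one mailbox from registering many accounts could compare the <code>base</code> values, while still storing and mailing the address exactly as the user typed it.</p>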
<p><img src="http://andrewbrobinson.com/wp-content/uploads/2012/04/Screen-Shot-2012-04-01-at-2.45.56-PM.png" alt="" title="Screen Shot 2012-04-01 at 2.45.56 PM" width="650" class="aligncenter size-full wp-image-892" /></p>
<p>It turns out that the activation link contains the e-mail address I registered with as part of the query string, with the plus sign left unescaped. It&#8217;s standard practice when parsing a query string to substitute plus signs with spaces, which PayPal does appropriately on the receiving end, so the address that comes out no longer matches the one I registered with and the link breaks. The plus sign should have been percent-encoded as <code>%2B</code> when the link was generated. </p>
<p>Obviously this isn&#8217;t a huge security flaw, or really much more than an inconvenience, but it&#8217;s a little odd to see a company whose business is performing secure transactions not even escape a query string correctly.</p>
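<p>As a rough illustration of the failure (a sketch using Node&#8217;s built-in <code>querystring</code> module with a made-up address and parameter name, not PayPal&#8217;s actual code), a raw plus sign comes back out of the parser as a space unless it was percent-encoded when the link was built:</p>
<pre class="brush: jscript; title: ;">
var querystring = require('querystring');

// Hypothetical address and parameter name, for illustration only.
var address = 'joe.smith+1@example.com';

// Building the link without escaping leaves a raw plus sign in the
// query string, which the parser treats as an encoded space.
var broken = querystring.parse('email=' + address).email;

// Percent-encoding the value first turns the plus sign into %2B,
// which survives the round trip.
var fixed = querystring.parse('email=' + encodeURIComponent(address)).email;

console.log(broken); // joe.smith 1@example.com
console.log(fixed);  // joe.smith+1@example.com
</pre>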
]]></content:encoded>
			<wfw:commentRss>http://andrewbrobinson.com/2012/04/01/even-paypal-doesnt-handle-edge-cases-properly/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Basic RPCs in Node.js with a Java Backend</title>
		<link>http://andrewbrobinson.com/2012/03/28/basic-rpcs-in-node-js-with-a-java-backend/</link>
		<comments>http://andrewbrobinson.com/2012/03/28/basic-rpcs-in-node-js-with-a-java-backend/#comments</comments>
		<pubDate>Wed, 28 Mar 2012 02:30:23 +0000</pubDate>
		<dc:creator>Andrew Robinson</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Node.js]]></category>
		<category><![CDATA[RabbitMQ]]></category>

		<guid isPermaLink="false">http://andrewbrobinson.com/?p=880</guid>
		<description><![CDATA[The RabbitMQ people provide a really nice library for AMQP interactions in Java, with really well-developed remote procedure call (RPC) libraries that work wonderfully if you&#8217;re using Java on all ends of your project. Unfortunately I&#8217;m not, so they are useless to me, and too complex for me to care about enough to implement in [...]]]></description>
			<content:encoded><![CDATA[<p>The RabbitMQ people provide a really nice library for AMQP interactions in Java, with really well-developed remote procedure call (RPC) libraries that work wonderfully if you&#8217;re using Java on all ends of your project. Unfortunately I&#8217;m not, so they are useless to me, and too complex for me to care about enough to implement in another language. My current project involves a Node.js frontend that handles API requests, and ferries off the heavy lifting to Java backend services when necessary. With that in mind, and keeping with the idea of simple, clean implementations, which has been a really popular theme within the JavaScript community (and a damn good one), I set out to make an RPC library that wasn&#8217;t quite as complex. </p>
<h3>At 30,000 Feet</h3>
<p>I&#8217;d recommend taking a look at the <a href="http://www.rabbitmq.com/tutorials/tutorial-six-python.html">sixth tutorial</a> in the RabbitMQ documentation for a really nice introduction to the mechanics of an RPC request. I&#8217;ll shamelessly steal their diagram and present it below:</p>
<p><img alt="" src="http://www.rabbitmq.com/img/tutorials/python-six.png" title="RabbitMQ RPC" class="aligncenter" width="576" height="200" /></p>
<p>The sequence looks something like this:</p>
<ol>
<li>The client, upon connecting to the broker server, creates a queue with a server-generated name for receiving the eventual replies to RPCs. This queue is stored in a variable called <code>responseQueue</code>, and a receiving function is bound to it to handle a response from an RPC. We also create a variable called <code>correlationId</code> to store an incrementing counter, which allows us to later match up requests with responses.</li>
<li>When an RPC request needs to be made, a message is published to a request queue. In the message headers we set the <code>replyTo</code> and <code>correlationId</code> parameters, using the values established above. We store a callback function in a map, keyed with the current <code>correlationId</code>, and also set a timeout timer to ensure that an action is taken if the RPC request isn&#8217;t processed in time.</li>
<li>A server bound to the <code>requestQueue</code> receives the request, does any necessary processing, constructs an appropriate response to be sent to the <code>responseQueue</code>, which has been specified in the <code>replyTo</code> header, and finally acknowledges receipt of the request.</li>
<li>Assuming the response is received in time, the client looks up the <code>correlationId</code> in the map of stored callbacks, disables the timeout timer, and finally fires off the callback function.</li>
</ol>
<p>Node.js couples really nicely with this model. There is very little impedance mismatch between this ideal model and the typical callback pattern observed in Node, so implementation is really straightforward. Let&#8217;s take a look at how this is put together.</p>
<h3>Connecting in Node to a RabbitMQ Server</h3>
<p>First things first, let&#8217;s create a connection. I&#8217;ll be using the excellent <a href="https://github.com/postwait/node-amqp">node-amqp</a> library for communication. Setting up a connection to a RabbitMQ broker involves a function with a ready callback:</p>
<pre class="brush: jscript; title: ;">
var amqp = require('amqp');

var responseQueue = null;
var correlationId = 0;
var rpcRequestMap = {};
var connection = null;

function connect() {
    // Setup the connection object, with associated callbacks for connection events.
    connection = amqp.createConnection();

    connection.on('ready', function() {
        console.log('message.connect: amqp connection established');
        // by not specifying a queue name, the server will assign us one randomly
        // by specifying the exclusive option we ensure the queue will be cleaned up
        // upon application exit
        connection.queue('', {exclusive: true}, function(queue) {
            console.log('message.connect: rpc queue created: ' + queue.name);
            queue.subscribe(handleRpcResponse);
            responseQueue = queue;
        });
    });    

    connection.on('error', function() {
        console.log('message.connect: connection error');
    });

}
</pre>
<p>The node-amqp library provides default connection parameters for a standard RabbitMQ installation. You&#8217;ll notice that upon creating the queue we call the <code>subscribe</code> method with a callback function. We&#8217;ll come back to that later. First, let&#8217;s send a request off to an RPC server:</p>
<pre class="brush: jscript; title: ;">
function doRpc(requestQueue, payload, callback) {
    var thisId = correlationId;
    correlationId = correlationId + 1;

    // setup the object in the callback map for
    // storing our callback function and the unique timer id
    // node assigns
    rpcRequestMap[thisId] = {};
    rpcRequestMap[thisId].callback = callback;

    // setTimeout is part of the node core and will return
    // a unique timer handle that we can use to control
    // the created timer. we set up this timer in case the
    // rpc server isn't running or has crashed
    rpcRequestMap[thisId].timer = setTimeout(function() {
        console.log('rpc timeout');
        var fn = rpcRequestMap[thisId].callback;
        delete rpcRequestMap[thisId];
        fn('rpc timeout', null);
    }, 5000);

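    // publish through the connection opened in connect(), and point replyTo
    // at the exclusive queue stored in responseQueue
    connection.publish(requestQueue, payload,
        {replyTo: responseQueue.name, 'correlationId': String(thisId), mandatory: true});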

    console.log('message.doRpc: message published, requestMap binding created, id:' +
        String(thisId));
}
</pre>
<p>This function takes in a set of function arguments in the <code>payload</code> variable, a request queue name, which can be thought of as the RPC function name, and a callback function, which must follow the typical Node pattern of <code>function(err, data)</code>. This will send off the request to the broker, which, assuming an RPC server is running, will deliver the message. </p>
<p>Take notice of the implied and explicit queue and message parameters. Our queue is set up with the <code>exclusive</code> parameter defined, ensuring that only this instance of our client can consume messages from it, and ensuring that it will be deleted when the connection is closed. Our messages are sent without persistence flags and with the <code>mandatory</code> flag set. If the RPC server is unavailable it&#8217;s unlikely we&#8217;d want the message to be processed later, so nothing is persisted, and the <code>mandatory</code> flag asks the broker to return a message it cannot route to any queue instead of silently discarding it.</p>
<p>Using the official Java API for Rabbit, let&#8217;s build the server now:</p>
<pre class="brush: java; title: ;">
package fusao.tangifo.backend;

import java.io.IOException;
import com.rabbitmq.client.*;

public class Processor {
    public static void main(String [] args) {
        System.out.println(&quot;Processor: initializing&quot;);
        // Our main AMQP connection, we'll open
        // a channel per thread later with a threadpool.
        ConnectionFactory factory = new ConnectionFactory();
        factory.setUsername(&quot;guest&quot;);
        factory.setPassword(&quot;guest&quot;);
        factory.setHost(&quot;localhost&quot;);
        factory.setPort(5672);

        // Let's connect and setup our basic queues.
        System.out.println(&quot;connecting to AMQP server...&quot;);
        Connection conn = null;
        final Channel channel;

        try {
            conn = factory.newConnection();
            channel = conn.createChannel();
        } catch (IOException e) {
            System.out.println(&quot;failed to create channel or connect to server&quot;);
            e.printStackTrace();
            return;
        }

        System.out.println(&quot;connected.&quot;);

        try {
            // Setup the queue, if it's not already declared this will
            // create it.
            // durable - false
            // exclusive - false
            // autoDelete - true
            // arguments - none
            channel.queueDeclare(&quot;image&quot;, false, false, true, null);

            // Add a callback for when messages arrive at the queue
            // autoAck - false
            channel.basicConsume(&quot;image&quot;, false,
                 new DefaultConsumer(channel) {
                     @Override
                     public void handleDelivery(String consumerTag,
                                                Envelope envelope,
                                                AMQP.BasicProperties properties,
                                                byte[] body)
                         throws IOException
                     {
                         String routingKey = envelope.getRoutingKey();
                         String contentType = properties.getContentType();
                         String correlationId = properties.getCorrelationId();
                         String responseQueue = properties.getReplyTo();
                         long deliveryTag = envelope.getDeliveryTag();

                         String message = new String(body, &quot;UTF-8&quot;);
                         System.out.println(&quot;message received&quot;);
                         System.out.println(&quot;correlationId: &quot; +
                             correlationId +
                             &quot; responseQueue: &quot; +
                             responseQueue);
                         System.out.println(message);                         

                         AMQP.BasicProperties b = (new AMQP.BasicProperties.Builder())
                             .correlationId(correlationId)
                             .build();

                         channel.basicPublish(&quot;&quot;, responseQueue, b, &quot;{}&quot;.getBytes(&quot;UTF-8&quot;));
                         channel.basicAck(deliveryTag, false);
                     }
                 });
        } catch (IOException e) {
            e.printStackTrace();
            System.out.println(&quot;Something went horribly wrong.&quot;);
            return;
        }
    }
}
</pre>
<p>The Java implementation isn&#8217;t terribly exciting. Take a look at where the <code>correlationId</code> and <code>responseQueue</code> are extracted in the <code>DefaultConsumer</code> delivery handler. We publish the response to the default exchange, using the reply queue name as the routing key, and just serialize an empty JSON object. </p>
<p>This server will display the parameters to the console, and issue a response. The final piece of the puzzle is the response handler in Node:</p>
<pre class="brush: jscript; title: ;">
var handleRpcResponse = function (message, headers, deliveryInfo) {
    console.log(headers);
    console.log(deliveryInfo);
    if (!deliveryInfo.hasOwnProperty('correlationId') ||
        !rpcRequestMap.hasOwnProperty(deliveryInfo.correlationId) ||
        rpcRequestMap[deliveryInfo.correlationId] === null) {
        console.log('message.handleRpcResponse: stray rpc message received');
        return;
    }
    var thisId = deliveryInfo.correlationId;
    clearTimeout(rpcRequestMap[thisId].timer);
    var cb = rpcRequestMap[thisId].callback;
    delete rpcRequestMap[thisId];
    cb(null, message);
};
</pre>
<p>Once a response arrives we log some debug parameters to the console, cancel the timeout timer, delete the map item, and call the callback. </p>
<p>So there you have it, this is a really trimmed down version of an RPC pattern for Node.js and Java. I&#8217;ve been using it in testing for a few days now with good results to sling JSON requests back and forth between my frontend and backend server. I really like Node.js for building RESTful APIs, and with a solid messaging layer and backend it forms a really nice stack that has a fast response time, and can scale horizontally with a nice decoupling between heavy-lifting backend services, and the web-facing frontend. </p>
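<p>The request map bookkeeping can be tried out without a broker. The sketch below is a hypothetical in-memory tracker; the names <code>createRpcTracker</code>, <code>register</code> and <code>resolve</code> are made up for illustration and are not part of node-amqp. It mirrors the same steps: hand out a correlation id, arm a timeout, and fire the callback exactly once.</p>
<pre class="brush: jscript; title: ;">
// In-memory sketch of the correlation bookkeeping; no broker involved.
function createRpcTracker(timeoutMs) {
    var nextId = 0;
    var pending = {};

    return {
        // store the callback and return the correlation id that would be
        // sent along with the request
        register: function(callback) {
            var id = String(nextId);
            nextId = nextId + 1;
            var timer = setTimeout(function() {
                delete pending[id];
                callback('rpc timeout', null);
            }, timeoutMs);
            pending[id] = {callback: callback, timer: timer};
            return id;
        },
        // deliver a reply; returns false for stray or late replies
        resolve: function(id, message) {
            if (!Object.prototype.hasOwnProperty.call(pending, id)) {
                return false;
            }
            var entry = pending[id];
            clearTimeout(entry.timer);
            delete pending[id];
            entry.callback(null, message);
            return true;
        }
    };
}

var tracker = createRpcTracker(5000);
var id = tracker.register(function(err, data) {
    console.log('reply:', err, data);
});
tracker.resolve(id, '{}');
</pre>
<p>Keeping the map and timers behind two functions like this makes it harder to forget the <code>clearTimeout</code> or the <code>delete</code>, which are the two places where a stale entry would otherwise leak.</p>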
]]></content:encoded>
			<wfw:commentRss>http://andrewbrobinson.com/2012/03/28/basic-rpcs-in-node-js-with-a-java-backend/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Keeping your Sanity when Working on Large Projects</title>
		<link>http://andrewbrobinson.com/2012/03/21/keeping-your-sanity-when-working-on-large-projects/</link>
		<comments>http://andrewbrobinson.com/2012/03/21/keeping-your-sanity-when-working-on-large-projects/#comments</comments>
		<pubDate>Wed, 21 Mar 2012 22:24:13 +0000</pubDate>
		<dc:creator>Andrew Robinson</dc:creator>
				<category><![CDATA[Algorithms]]></category>

		<guid isPermaLink="false">http://andrewbrobinson.com/?p=871</guid>
		<description><![CDATA[The size of my projects, both personal and academic, grows larger and larger as I tackle bigger problems. My most recent personal project is starting to touch 10,000 lines of written code, with many more of third-party modules, and no end in sight. While lines of code are a poor metric for the complexity of [...]]]></description>
			<content:encoded><![CDATA[<p>The size of my projects, both personal and academic, grows larger and larger as I tackle bigger problems. My most recent personal project is starting to touch 10,000 lines of written code, with many more of third-party modules, and no end in sight. While lines of code are a poor metric for the complexity of a project, it&#8217;s obvious that there&#8217;s a direct relationship between the two. As these projects continue to get larger, it becomes impossible to keep the entire thing in my head. As I focus on various pieces of it, others fade away, and the whole thing starts to become a little hard to come to terms with. </p>
<p>Even more alarming is that as the codebase grows in complexity, my confidence in it shrinks. When the code is small, it is easy to maintain. I understand the entire system, and see it for what it is. When I don&#8217;t like something about the code, I simply refactor the entire thing, and again I am happy. Once the project grows past a couple thousand lines refactoring gets tougher, and no longer do I feel like I have a good grip on the code. It starts to feel sloppy, and thrown together. I start feeling like there should be a better way, and that my code isn&#8217;t up to standard, and won&#8217;t be maintainable. In essence, I start to think my code sucks. </p>
<p>At first this really concerned me: if I couldn&#8217;t write more than a few thousand lines of code before everything turned into unorganized spaghetti, then I surely was a poor programmer. I started looking through the old code, trying to figure out where I went wrong, and why I felt it was written so poorly. </p>
<p>Taking a look at all of this written code, it turns out that there really wasn&#8217;t anything wrong. Sure, there are minor pieces that could be written differently, with a better trick I picked up only after that module was completed, but there&#8217;s nothing terrible about it. No matter where I looked, I couldn&#8217;t find the lines of code that I was so sure sucked. This being the case, if the architectural decisions were decent and the implementations were clean, then why did I feel so terrible about the project?</p>
<p>I think the answer has to do with what happens when a program exceeds your ability to fit it in your head. There&#8217;s a point where you can no longer easily hold your entire program in your head, and it happens pretty early on in a large project. There&#8217;s just too much code to keep all of the right assertions and intuitive reasoning patterns needed to work fluently in your head at once, and as a result you have to page out the ones that aren&#8217;t immediate to your current task. </p>
<p>This feeling bothered me for quite some time because I couldn&#8217;t understand it. I think the only solution is to just come to terms with it, and accept that anytime a project grows above a certain threshold you will inevitably feel like there are some smells coming from already written code. Writing smaller modules, and ensuring things are self-contained with a consistent exception framework, has really helped to contain this feeling for me. If you can write code that doesn&#8217;t surprise you later on when invoked, and which performs small, obvious functions, then you&#8217;re doing something right. Focusing on creating abstractions that avoid leakage, and planning upfront for code extension, has also significantly helped.</p>
<p>If you feel this way too I wouldn&#8217;t worry too much. It only means that you&#8217;re pushing the limits of what you&#8217;re capable of doing, and creating complex systems that actually solve real world problems! Very few projects are small enough to fit in one mind, so simply focus on always creating tidy code when working on a specific part of your application, and on keeping a consistent style for error handling and other generic aspects of your code. </p>
]]></content:encoded>
			<wfw:commentRss>http://andrewbrobinson.com/2012/03/21/keeping-your-sanity-when-working-on-large-projects/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Obfuscating Database IDs in Node.js with AES</title>
		<link>http://andrewbrobinson.com/2012/03/21/obfuscating-database-ids-in-node-js-with-aes/</link>
		<comments>http://andrewbrobinson.com/2012/03/21/obfuscating-database-ids-in-node-js-with-aes/#comments</comments>
		<pubDate>Wed, 21 Mar 2012 21:05:14 +0000</pubDate>
		<dc:creator>Andrew Robinson</dc:creator>
				<category><![CDATA[Node.js]]></category>

		<guid isPermaLink="false">http://andrewbrobinson.com/?p=865</guid>
		<description><![CDATA[When designing an API there&#8217;s something a little unsettling about returning database column IDs to the user directly. While it&#8217;s a little bit of security via obscurity, I tend to believe that sending the client an incrementing identifier creates an implementation that leaks information. From the incrementing identifier a user can estimate current capacity, and [...]]]></description>
			<content:encoded><![CDATA[<p>When designing an API there&#8217;s something a little unsettling about returning database column IDs to the user directly. While it&#8217;s a little bit of security via obscurity, I tend to believe that sending the client an incrementing identifier creates an implementation that leaks information. From the incrementing identifier a user can estimate current capacity, and it serves as a starting point for exploiting the system, by providing a list of possible values to try to defeat the security and access other users&#8217; objects. </p>
<p>I noticed a lot of services will return a slightly encoded ID such as <code>ch_byLy9Gy1cUkZte</code>, where the prefix will give some sort of indication as to the object type, and the rest is a random looking alphanumeric key. I set out to design something like this, and be able to easily map it back to database identifiers. </p>
<p>Using the OpenSSL-based encryption libraries in Node, I came up with something pretty decent. We encrypt the identifier together with a prefix using a stored passphrase, and return the result as a base-64 encoded string to the user. </p>
<pre class="brush: jscript; title: ;">
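    var crypto = require('crypto');

    var encrypt = function(id, prefix) {
        var encrypted = null;
        try {
            var cipher =
                crypto.createCipher('aes-256-cbc', config.obfuscater.key);
            // update() returns any blocks completed so far and final() returns
            // the last padded block, so keep both pieces
            var cryptString1 =
                cipher.update(prefix + '_' + String(id), 'ascii', 'base64');
            cryptString1 = (cryptString1 + cipher.final('base64')).replace(/\=/g, '');
            encrypted = prefix + '_' + cryptString1;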
        }
        catch (err) {
            return null;
        }

        return encrypted.replace(/\//g,'.').replace(/\+/g,'-');
    };
</pre>
<p>There are a couple of strengths to this approach:</p>
<ul>
<li>The incrementing column id is not recoverable without knowing the passphrase used to encrypt the data.</li>
<li>By encrypting the prefix with the id we ensure that one can&#8217;t take a generated ID from a different object type and modify it to access another object. For example, a malicious user can&#8217;t grab <code>pref1_abc</code> and create <code>pref2_abc</code>; the system would reject it upon decoding and realizing the prefixes don&#8217;t match.</li>
<li>We base-64 encode the data, and replace URL-unsafe characters, leaving us with a string that can easily be transmitted within a GET request.</li>
</ul>
<p>When using this approach, we&#8217;ll generate IDs like this:</p>
<pre class="brush: plain; title: ;">
i: 0 eString: ord_KU.cXCev42ulIW3rc78NxQ
i: 1 eString: ord_Ba1REDoYHMDXfaPj-lEB5A
i: 2 eString: ord_zTevdc8HP52J9wIQB-.42A
i: 3 eString: ord_vIpAhysJx0iqOTSztCz0KA
i: 4 eString: ord_xzxCz0Dzkp73buVd2ASPeA
i: 5 eString: ord_GnZM3eLF6El7rLOIywXt2w
i: 6 eString: ord_ZpHin3pOEg0B3d9rmNOJUw
i: 7 eString: ord_UGCaf9zbnQpqwphHjuSCDg
i: 8 eString: ord_i.0dpeDqY5-lW6ZaylQplQ
i: 9 eString: ord_a7X8upiCXDEMwW8Sd1RwhQ
</pre>
<p>Not too shabby! The end-user has little chance of figuring out the pattern behind the resource locator, and it&#8217;s easily decrypted server side. </p>
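<p>The decoding side isn&#8217;t shown above, so here is a minimal round-trip sketch under a few stated assumptions. It swaps in <code>crypto.createCipheriv</code> with a key derived by <code>crypto.scryptSync</code> and a fixed all-zero IV, rather than the passphrase-based <code>crypto.createCipher</code> call used in the post, so the tokens it produces will differ from the list above. The passphrase, salt, and the function names <code>obfuscate</code> and <code>deobfuscate</code> are hypothetical.</p>
<pre class="brush: jscript; title: ;">
var crypto = require('crypto');

// Hypothetical key material, for illustration only. The fixed IV is a
// deliberate assumption: it keeps the mapping deterministic, so the
// same id always yields the same token.
var key = crypto.scryptSync('example passphrase', 'example salt', 32);
var iv = Buffer.alloc(16, 0);

function obfuscate(id, prefix) {
    var cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
    var body = Buffer.concat([
        cipher.update(prefix + '_' + String(id), 'utf8'),
        cipher.final()
    ]).toString('base64');
    // strip the padding and swap the URL-unsafe characters, as in the post
    return prefix + '_' +
        body.replace(/=/g, '').replace(/\//g, '.').replace(/\+/g, '-');
}

function deobfuscate(token, expectedPrefix) {
    var marker = expectedPrefix + '_';
    if (token.indexOf(marker) !== 0) {
        return null;
    }
    var body = token.slice(marker.length)
        .replace(/\./g, '/').replace(/-/g, '+');
    try {
        var decipher = crypto.createDecipheriv('aes-256-cbc', key, iv);
        var plain = Buffer.concat([
            decipher.update(Buffer.from(body, 'base64')),
            decipher.final()
        ]).toString('utf8');
        // the prefix sealed inside the ciphertext must match the visible one
        if (plain.indexOf(marker) !== 0) {
            return null;
        }
        return plain.slice(marker.length);
    } catch (err) {
        // tampered or truncated tokens fail to decrypt
        return null;
    }
}

var token = obfuscate(42, 'ord');
console.log(token, deobfuscate(token, 'ord'));
</pre>
<p>Relabeling a token, for example turning an <code>ord_</code> token into a <code>cus_</code> one, decrypts fine but fails the inner prefix check, so <code>deobfuscate</code> returns <code>null</code>.</p>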
]]></content:encoded>
			<wfw:commentRss>http://andrewbrobinson.com/2012/03/21/obfuscating-database-ids-in-node-js-with-aes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Node.js and Callbacks Makes for Good Error Handling</title>
		<link>http://andrewbrobinson.com/2012/03/13/node-js-and-callbacks-makes-for-good-error-handling/</link>
		<comments>http://andrewbrobinson.com/2012/03/13/node-js-and-callbacks-makes-for-good-error-handling/#comments</comments>
		<pubDate>Tue, 13 Mar 2012 17:21:43 +0000</pubDate>
		<dc:creator>Andrew Robinson</dc:creator>
				<category><![CDATA[Node.js]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://andrewbrobinson.com/?p=854</guid>
		<description><![CDATA[Error handling sucks. It&#8217;s awkward and unintuitive and difficult and frustrating. As a developer it&#8217;s often the last thing you want to focus on, yet it&#8217;s critical to any modern application. In most programming languages and frameworks it&#8217;s really difficult to get right. Throwing an exception, by definition, breaks flow control and requires carefully planned [...]]]></description>
			<content:encoded><![CDATA[<p>Error handling sucks. It&#8217;s awkward and unintuitive and difficult and frustrating. As a developer it&#8217;s often the last thing you want to focus on, yet it&#8217;s critical to any modern application. In most programming languages and frameworks it&#8217;s really difficult to get right. Throwing an exception, by definition, breaks flow control and requires carefully planned handlers to preserve state in your application. When you&#8217;re focusing on threaded, sequential processing it&#8217;s often hard to figure out where to handle errors.</p>
<p>I&#8217;ve been developing a RESTful API with the wonderful <a href="http://mcavage.github.com/node-restify/">node-restify</a> framework and found that error handling is actually quite graceful. There&#8217;s a really powerful pattern that emerges when using the callback paradigm that makes handling errors quite simple. </p>
<p>In Node almost everything happens with a callback, and if you value your sanity you won&#8217;t nest these callbacks into a 10-level-deep soup; you&#8217;ll instead build stacks of tasks, each of which takes in a callback and subsequently executes it upon completion. The really cool thing about these tasks is that they force you to break up your work into small, atomic functions with a clearly defined goal. For example, in my application when a user PUTs a new address there are a number of small tasks that must happen. The user&#8217;s session is validated, authentication details are retrieved from the database and verified, we validate they supplied a valid ID for an address entry, we do the update, and finally send back a generic HTTP 200 response. </p>
<p>At any step something could go wrong, and the request could no longer continue. In the real world things like database errors can happen, as can poor user input, or nonexistent database entries. To support handling all of these errors, each function has the opportunity to set an error code in the <code>req.error</code> field, and drop directly out of the callback stack. </p>
<p>By making the actions reasonably self-contained, and adding a degree of atomicity to operations by doing database updates and setting status fields last, we actually have a really nice error handling framework. The callback model lends itself to a clean abstraction, and to breaking things down in this manner, so it&#8217;s almost effortless. </p>
<h3>Implementation</h3>
<p>I&#8217;ve extended the routing engine inside Restify a little bit to support a flexible, stack-based approach to callbacks. I found the default routing engine to be too limiting. It forces us down a predefined path, and as a believer in <a href="http://en.wikipedia.org/wiki/Indeterminism">indeterminism</a> I wanted the flexibility to determine the destination of my route as it was being processed, in response to query parameters and database properties. To do this we set up a route inside the server like this:</p>
<pre class="brush: jscript; title: ;">
server.put('/addresses/:id', routes.address.put);
</pre>
<p>Instead of using the built-in routing, we defer most of the handling to my custom routing engine, where the route is set up like this:</p>
<pre class="brush: jscript; title: ;">
    // routes.address is assumed to already exist as a plain object
    routes.address.put = function(req, res, next) {
        req.stack = [];

        // Standard validation and session stack
        req.stack.push(validator.requireSession);
        req.stack.push(db.session.setup);
        req.stack.push(db.session.updateLastUsed);
        req.stack.push(validator.requireAuthentication);

        // Address specific code
        req.stack.push(validator.requireId);
        req.stack.push(db.address.update);
        req.stack.push(api_responses.session);

        return helper.pop(req, res, next);
    };
</pre>
<p>Not too shabby! It looks really clean, so where&#8217;s the error handling even located? One of the requirements of all top-level routing functions is that they can fail, and that they are atomically clean operations that don&#8217;t leave the global application state (database, file storage, etc.) inconsistent. I&#8217;ve taken care to ensure that once a function completes, no further processing is required to return the application to a consistent state. Each function has the opportunity to set an error code, and the machinery that links these functions together handles termination of the invocation chain and printing of the error message. Let&#8217;s take a look at one of the most basic versions of such a function:</p>
<pre class="brush: jscript; title: ;">
    validator.requireId = function(req, res, fn) {
        // Flag the request if the id parameter is missing.
        if(!checkPreconditions(req, ['id'])) {
            req.error = new restify.MissingParameterError('Missing parameters.');
        }
        return fn();
    };
</pre>
<p>It has access to the typical <code>req</code> and <code>res</code> objects exposed by Express or Restify, and a callback function. This one simply does error checking, and if an error is found it stores it on the <code>req</code> object. When it invokes the callback, control passes to the following function, which was first called at the end of the routing function I defined above. </p>
<pre class="brush: jscript; title: ;">
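var pop = function(req, res, next) {
    // A previous task flagged an error: stop here and hand it to Restify.
    if(req.hasOwnProperty('error')) {
        console.log(req.error);
        return next(req.error);
    }

    // Take the next task off the front of the list.
    var fn = req.stack.shift();
    if(fn) {
        // Tasks take (req, res, callback); the callback re-enters pop.
        return fn(req, res, function() {
            pop(req, res, next);
        });
    }
    return next();
};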
</pre>
<p>This is a semi-clever function that simply shifts tasks off the front of the list, runs them, and passes itself as the callback, with a localized scope. Because tasks are pushed onto the end and shifted off the front, they run in the order they were added. The error handling magic happens at every entry: we check for the existence of <code>req.error</code>, and if it is set we pass it to <code>next</code>, which invokes Restify&#8217;s built-in error processor. </p>
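<p>To see the short-circuit in action without a server, here is a small self-contained sketch. It assumes every task uses the <code>(req, res, fn)</code> signature of <code>validator.requireId</code>; the three tasks and the <code>trace</code> array are hypothetical and only record that they ran.</p>
<pre class="brush: jscript; title: ;">
// Simplified stand-in for the pop helper, assuming every task has the
// (req, res, fn) signature used by validator.requireId.
var pop = function(req, res, next) {
    if (req.hasOwnProperty('error')) {
        return next(req.error);
    }
    var task = req.stack.shift();
    if (!task) {
        return next();
    }
    return task(req, res, function() {
        pop(req, res, next);
    });
};

var trace = [];

// Hypothetical tasks that only record that they ran.
var requireSession = function(req, res, fn) {
    trace.push('requireSession');
    fn();
};
var requireId = function(req, res, fn) {
    trace.push('requireId');
    if (!req.params.id) {
        req.error = new Error('Missing parameters.');
    }
    fn();
};
var update = function(req, res, fn) {
    trace.push('update');
    fn();
};

// No id is supplied, so requireId should stop the chain before update.
var req = { params: {}, stack: [requireSession, requireId, update] };
var result;
pop(req, {}, function(err) {
    result = err;
});

console.log(trace.join(' then ')); // requireSession then requireId
console.log(result.message);       // Missing parameters.
</pre>
<p>The <code>update</code> task is still sitting in <code>req.stack</code> when the chain ends; it is simply never shifted off.</p>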
<p>Combining this with the <code>node-logging</code> library to give a primitive form of function tracing, and some nice database and validation wrappers, I&#8217;ve created a really clean error handling solution. When the database is unavailable, HTTP 500 errors are bubbled up appropriately, and user input validation as well as existence checks are handled in the same manner. It&#8217;s a really refreshing break from threaded programming to see such a clean way to handle errors. While the callback approach can be tedious at times, seeing these kinds of clean abstractions fall out of it so often really suggests that it&#8217;s a very appropriate way to write request processing servers, not to mention the performance benefits of the callback concurrency model when faced with IO-bound requests. </p>
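<p>As a rough idea of how a database wrapper can feed into this scheme, the sketch below turns a driver failure into an error on the request. Everything here is an assumption for illustration: <code>fakeDb</code> is a hypothetical stand-in for a real driver, and a plain <code>Error</code> with a <code>statusCode</code> property stands in for one of Restify&#8217;s error classes.</p>
<pre class="brush: jscript; title: ;">
// Hypothetical database wrapper; a real driver would be used in its place.
var fakeDb = {
    available: false,
    update: function(id, fields, done) {
        // Defer the callback, as a real I/O call would.
        setTimeout(function() {
            if (!fakeDb.available) {
                return done(new Error('connection refused'));
            }
            done(null);
        }, 0);
    }
};

// Task that records a driver failure on the request instead of throwing.
// The statusCode property is an assumption standing in for a Restify error.
var updateAddress = function(req, res, fn) {
    fakeDb.update(req.params.id, req.body, function(err) {
        if (err) {
            req.error = new Error('Database unavailable: ' + err.message);
            req.error.statusCode = 500;
        }
        fn();
    });
};

var req = { params: { id: '42' }, body: { street: 'New Street' } };
updateAddress(req, {}, function() {
    console.log(req.error.statusCode); // 500
});
</pre>
<p>The task never throws; it always calls back, and the next pass through <code>pop</code> decides whether the chain continues.</p>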
]]></content:encoded>
			<wfw:commentRss>http://andrewbrobinson.com/2012/03/13/node-js-and-callbacks-makes-for-good-error-handling/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>A New Term: Goldberg (adj)</title>
		<link>http://andrewbrobinson.com/2012/03/06/a-new-term-goldberg-adj/</link>
		<comments>http://andrewbrobinson.com/2012/03/06/a-new-term-goldberg-adj/#comments</comments>
		<pubDate>Tue, 06 Mar 2012 20:13:44 +0000</pubDate>
		<dc:creator>Andrew Robinson</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://andrewbrobinson.com/?p=842</guid>
		<description><![CDATA[I&#8217;ve been reading a lot of academic systems papers lately, and a lot of them are a little frustrating. I think that in this world there&#8217;s only so many ways one can implement a task scheduler before the law of diminishing returns takes over and the effort spent improving the system starts to look a [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://andrewbrobinson.com/wp-content/uploads/2012/03/rube_napkin-300x211.gif" alt="" title="rube_napkin" width="300" height="211" class="alignleft size-medium wp-image-843" />I&#8217;ve been reading a lot of academic systems papers lately, and a lot of them are a little frustrating. I think that in this world there are only so many ways one can implement a task scheduler before the law of diminishing returns takes over and the effort spent improving the system starts to look a little silly. Some of the papers seem as if they have been published simply due to pressure to publish. </p>
<p>Realizing this, and reading about the history and origin of some programming languages in an article I found, <a href="http://tagide.com/blog/2012/03/research-in-programming-languages/">Research in Programming Languages</a>, I&#8217;ve become a little frustrated with these papers. As a good example, half of the languages we use today were designed in a few days by someone outside of the academic world, yet power the majority of the modern web. I&#8217;ve read a couple of papers about scheduling algorithms that promise great advances over the current threading mechanisms we use, but despite their being published years ago we still use the same, simple threading mechanisms. I just feel like a lot of these system designs are unnecessarily and needlessly complex, for the sake of complexity itself, while not realizing significant performance increases. In this spirit I hereby christen a new adjective to describe some of these systems, in honor of their ridiculous complexity, which is surpassed only by their irrelevancy. </p>
<div class="aligncenter" style="width:600px; border:2px dashed #000000 !important; padding: 5px; background-color:#eeeeee; ">
<p>  <b>goldberg</b> <i>(adj.)</i> &#8211; </p>
<ol>
<li> describing a system that has become so<br />
         large and poorly designed that it can be<br />
          likened to a Rube Goldberg machine, where<br />
         a large number of arbitrary tasks are performed<br />
         in sequential order to achieve an otherwise<br />
         simple goal in a convoluted, unintuitive, and unnecessarily<br />
         complex manner.
    </li>
</ol>
<p>  <b>Synonyms</b> &#8211; academic, research, thesis-work</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://andrewbrobinson.com/2012/03/06/a-new-term-goldberg-adj/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

