Archive for the ‘Uncategorized’ Category
A JNI Wrapper for Speex on Android
After reading this article on installing Speex for Android I realized that it was severely lacking. The author never presents the JNI interface, and some of the ndk-build flags are incorrect for my platform.
Here’s a set of steps to set up Speex for Android and build a simple wrapper, assuming you have the Android NDK installed. If you don’t have the NDK installed, download it from the Android homepage and add the bin directory to your PATH.
- Download the latest Speex source tarball from the downloads page.
- Extract the files and copy the include and libspeex directories into your JNI directory in a current Android project.
- Create a file called native.c and add the following code listing to it. This is the missing JNI interface:
#include <jni.h>
#include "speex/speex.h"

#define FRAME_SIZE 320

int nbBytes;
/* Holds the state of the encoder */
void *state;
/* Holds bits so they can be read and written to by the Speex routines */
SpeexBits bits;
int i, tmp;

void Java_fusao_awesome_TestAppActivity_init(JNIEnv *env, jobject jobj) {
    /* Create a new encoder state in wideband mode */
    state = speex_encoder_init(&speex_wb_mode);
    /* Set the quality to 8 */
    tmp = 8;
    speex_encoder_ctl(state, SPEEX_SET_QUALITY, &tmp);
    /* Initialization of the structure that holds the bits */
    speex_bits_init(&bits);
}

jbyteArray Java_fusao_awesome_TestAppActivity_encode(JNIEnv *env, jobject jobj, jshortArray inputData) {
    jbyteArray ret;
    jshort *inputArrayElements = (*env)->GetShortArrayElements(env, inputData, 0);
    /* Flush all the bits in the struct so we can encode a new frame */
    speex_bits_reset(&bits);
    /* Encode the frame */
    speex_encode_int(state, inputArrayElements, &bits);
    /* Copy the bits to an array of char that can be written */
    nbBytes = speex_bits_nbytes(&bits);
    ret = (jbyteArray) ((*env)->NewByteArray(env, nbBytes));
    jbyte *arrayElements = (*env)->GetByteArrayElements(env, ret, 0);
    speex_bits_write(&bits, (char *) arrayElements, nbBytes);
    /* The input was only read, so release it without copying back */
    (*env)->ReleaseShortArrayElements(env, inputData, inputArrayElements, JNI_ABORT);
    /* Mode 0 copies the encoded bytes back into the Java array */
    (*env)->ReleaseByteArrayElements(env, ret, arrayElements, 0);
    return ret;
}

Note that the function names need to be modified to fit your project. In this case I was working in the fusao.awesome package, building an activity called TestAppActivity; modify your code to match what you’re doing.
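The naming rule behind those function names can be sketched in a few lines of Python. This is only an illustration of the JNI convention (the prefix `Java_`, then package, class, and method joined by underscores, with a literal underscore escaped as `_1`); the helper name `jni_function_name` is made up for this example, and it ignores non-ASCII characters and overloaded methods.

```python
def jni_function_name(package, class_name, method):
    """Build the C symbol name JNI looks up for a native method.

    Simplified sketch: handles plain ASCII names only.
    """
    def mangle(part):
        # JNI escapes a literal underscore inside a name as "_1".
        return part.replace("_", "_1")

    # Package components, class name, and method name are joined by "_".
    pieces = package.split(".") + [class_name, method]
    return "Java_" + "_".join(mangle(p) for p in pieces)


print(jni_function_name("fusao.awesome", "TestAppActivity", "init"))
print(jni_function_name("fusao.awesome", "TestAppActivity", "encode"))
```

Running this with your own package and activity names gives the symbols to use in native.c.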
JNI is not an Android-specific technology; there are plenty of Javadoc and Oracle resources available for writing JNI interfaces. It isn’t intuitive at first, but the documentation should help you write your own interfaces. I’ve listed some additional resources at the end of this post for further customization.
I’ve implemented a really simple encode function and an initializer. The encode function takes one frame of audio (20ms at 16kHz, which is 320 samples) and encodes it using the default wideband options. It should be easy to modify to accommodate additional functions, such as preprocessing to remove non-voice data.
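The frame arithmetic is worth spelling out, since the encoder expects exactly one frame per call. The following Python sketch shows where the 320 comes from and how a longer capture could be cut into whole frames; the helper names are hypothetical, and dropping a trailing partial frame is an assumption of this sketch, not something the wrapper itself does.

```python
SAMPLE_RATE_HZ = 16000   # wideband rate used in this post
FRAME_MS = 20            # duration of one frame
BYTES_PER_SAMPLE = 2     # 16-bit PCM


def samples_per_frame(rate_hz=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Number of samples in one frame of the given duration."""
    return rate_hz * frame_ms // 1000


def split_into_frames(samples, frame_size):
    """Yield complete frames only; a trailing partial frame is dropped."""
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        yield samples[start:start + frame_size]


frame_size = samples_per_frame()
print(frame_size, frame_size * BYTES_PER_SAMPLE)

# 1000 samples of silence hold three complete 320-sample frames.
frames = list(split_into_frames([0] * 1000, frame_size))
print(len(frames))
```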
- Create a file called Android.mk, to handle the build configuration. Paste in the following configuration:
LOCAL_PATH := $(call my-dir)

include $(CLEAR_VARS)

LOCAL_MODULE := libspeex
LOCAL_CFLAGS = -DFIXED_POINT -DUSE_KISS_FFT -DEXPORT="" -UHAVE_CONFIG_H
LOCAL_C_INCLUDES := $(LOCAL_PATH)/include

LOCAL_SRC_FILES := \
    ./libspeex/bits.c \
    ./libspeex/buffer.c \
    ./libspeex/cb_search.c \
    ./libspeex/exc_10_16_table.c \
    ./libspeex/exc_10_32_table.c \
    ./libspeex/exc_20_32_table.c \
    ./libspeex/exc_5_256_table.c \
    ./libspeex/exc_5_64_table.c \
    ./libspeex/exc_8_128_table.c \
    ./libspeex/fftwrap.c \
    ./libspeex/filterbank.c \
    ./libspeex/filters.c \
    ./libspeex/gain_table.c \
    ./libspeex/gain_table_lbr.c \
    ./libspeex/hexc_10_32_table.c \
    ./libspeex/hexc_table.c \
    ./libspeex/high_lsp_tables.c \
    ./libspeex/jitter.c \
    ./libspeex/kiss_fft.c \
    ./libspeex/kiss_fftr.c \
    ./libspeex/lpc.c \
    ./libspeex/lsp.c \
    ./libspeex/lsp_tables_nb.c \
    ./libspeex/ltp.c \
    ./libspeex/mdf.c \
    ./libspeex/modes.c \
    ./libspeex/modes_wb.c \
    ./libspeex/nb_celp.c \
    ./libspeex/preprocess.c \
    ./libspeex/quant_lsp.c \
    ./libspeex/resample.c \
    ./libspeex/sb_celp.c \
    ./libspeex/scal.c \
    ./libspeex/smallft.c \
    ./libspeex/speex.c \
    ./libspeex/speex_callbacks.c \
    ./libspeex/speex_header.c \
    ./libspeex/stereo.c \
    ./libspeex/vbr.c \
    ./libspeex/vq.c \
    ./libspeex/window.c \
    ./native.c

include $(BUILD_SHARED_LIBRARY)
- Open up jni/include/speex/speex_config_types.h (create it if not already present) and add the following bit:
#ifndef __SPEEX_TYPES_H__
#define __SPEEX_TYPES_H__

typedef short spx_int16_t;
typedef unsigned short spx_uint16_t;
typedef int spx_int32_t;
typedef unsigned int spx_uint32_t;

#endif
- From a command line in the main directory of your project, run ndk-build and ensure that it completes successfully.
- In the activity you’re creating, add the following declarations inside the class to tell Java what the function signatures look like for your native interface:
native byte[] encode(short[] inputData);
native void init();

static {
    System.loadLibrary("speex");
}
- Build your project again. You should now be able to invoke your JNI functions:
// Initialize the encoder once before encoding any frames.
init();
short[] inputArray = new short[320];
// Write data to inputArray.
byte[] encodedBuffer = encode(inputArray);
And that’s all there is to it!
Additional Resources
JavaDocs on Array JNI Function Specifications
A JNI Example from Oracle
A nice tutorial on building a default JNI application
Next time I’ll be presenting a complete code listing that demonstrates taking this interface, wrapping it with Google ProtoBuffers, sending it over a socket, and reconstructing the Speex packets on a server to play back on a personal computer!
Capturing Raw Audio Data in Android
I’m working on a JNI wrapper for the popular Speex codec for VoIP recording as part of a project to create a continuously available background audio stream. Setting up the Android audio interface can be a little unintuitive and I couldn’t find a decent example that demonstrated this in a concise fashion, so without further ado:
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import android.app.Activity;
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.AudioRecord.OnRecordPositionUpdateListener;
import android.media.MediaRecorder;
import android.os.Bundle;
import android.os.Environment;
import android.util.Log;
import android.content.Context;
public class TestAppActivity extends Activity {
    /** Called when the activity is first created. */
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        Log.i("Main", "Starting main audio capture loop...");

        // Capture mono data at 16kHz
        int frequency = 16000;
        int channelConfiguration = AudioFormat.CHANNEL_CONFIGURATION_MONO;
        int audioEncoding = AudioFormat.ENCODING_PCM_16BIT;

        // The minimal buffer size CANNOT be merely 20ms of data, it must be
        // at least 1024 bytes in this case, this is most likely due to a MMIO
        // hardware limit.
        final int bufferSize = AudioRecord.getMinBufferSize(frequency, channelConfiguration, audioEncoding);

        // Setup the audio recording machinery
        AudioRecord audioRecord = new AudioRecord(MediaRecorder.AudioSource.VOICE_RECOGNITION,
                frequency, channelConfiguration,
                audioEncoding, bufferSize);

        // The short and file buffers, this might not be the most
        // efficient way to do things, but since we're planning on
        // redirecting this data into an encoder in a later version
        // of this project, we're not worried about it.
        // 320 = 16kHz * 20ms - number of samples in one 20ms frame.
        short[] buffer = new short[320];
        byte[] fileBuffer = new byte[320 * 2];
        audioRecord.startRecording();

        // Note: the samples are written as raw little-endian PCM with no
        // WAV header, despite the file extension.
        FileOutputStream f = null;
        File sdCard = Environment.getExternalStorageDirectory();
        File dir = new File(sdCard.getAbsolutePath() + "/audioTest");
        dir.mkdirs();
        File file = new File(dir, "testAudio.wav");
        try {
            f = new FileOutputStream(file, true);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
            // Without an output file there is nothing to record into.
            audioRecord.stop();
            audioRecord.release();
            return;
        }

        // Blocking loop uses about 40% of the CPU to do this.
        int frameNumber = 0;
        // We'll capture 3000 frames of 20ms each,
        // giving us 60 seconds of audio data.
        while (frameNumber < 3000) {
            audioRecord.read(buffer, 0, 320);
            // Split each 16-bit sample into low byte, then high byte.
            for (int i = 0; i < buffer.length; i++) {
                fileBuffer[i * 2] = (byte) (buffer[i] & (short) 0xFF);
                fileBuffer[i * 2 + 1] = (byte) (buffer[i] >> 8);
            }
            try {
                f.write(fileBuffer);
            } catch (IOException e) {
                e.printStackTrace();
            }
            frameNumber++;
        }

        // Stop capturing and free the recorder once we have our 60 seconds.
        audioRecord.stop();
        audioRecord.release();

        try {
            f.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
When initially writing this code I attempted to use the periodic callback features in the AudioRecord class with little success. Others online have reported mixed results, with the consensus being that capturing audio data in a blocking loop in a separate thread was sufficient and incurred no additional CPU usage.
The code is set up to receive data in 20ms chunks, which will eventually be fed into the native Speex encoder for sending across the network.
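The inner loop of the capture code splits each 16-bit sample into a low byte followed by a high byte, which is little-endian order. As a rough cross-check of that logic, here is a Python sketch that mirrors the manual shift-and-mask approach and compares it with the standard `struct` module; the function names are made up for this example.

```python
import struct


def pack_manual(samples):
    """Mirror the capture loop: low byte first, then high byte."""
    out = bytearray()
    for s in samples:
        out.append(s & 0xFF)          # low 8 bits
        out.append((s >> 8) & 0xFF)   # high 8 bits (sign bits masked off)
    return bytes(out)


def pack_struct(samples):
    """Same result using struct: '<' is little-endian, 'h' is a signed 16-bit value."""
    return struct.pack("<%dh" % len(samples), *samples)


samples = [0, 1, -1, 32767, -32768, 0x1234]
assert pack_manual(samples) == pack_struct(samples)
print(pack_manual([0x1234]).hex())  # prints 3412
```

Because the output is headerless PCM, a tool reading the file back needs to be told the format explicitly: 16kHz, mono, signed 16-bit, little-endian.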
The project I’m working on is power-sensitive, so I did some benchmarking to determine how the buffer size affects CPU usage, which is directly correlated to power consumption.
Interestingly enough, CPU usage can be lowered by extending the buffer and writing the samples out less often. The summarized data is listed below:
| Buffer Size (samples) | Approximate CPU Usage |
|---|---|
| 320 (20ms) | 55% |
| 4096 (256ms) | 40% |
| 16,000 (1s) | 19% |
| 80,000 (5s) | 12% |
The takeaway is that for applications where processing happens in the background and the results do not need to be near-instantaneous, a much better power consumption profile can be achieved by using a larger buffer.
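One way to read the table is in terms of how often the capture loop has to wake up. The Python sketch below converts each buffer size into a duration and a number of blocking reads per second at 16kHz. The CPU figures are the ones from the table; the idea that fewer wake-ups explains the lower CPU usage is my interpretation, not something I measured directly.

```python
SAMPLE_RATE_HZ = 16000

# (buffer size in samples, approximate CPU usage in percent), from the table above
measurements = [(320, 55), (4096, 40), (16000, 19), (80000, 12)]


def buffer_duration_ms(samples, rate_hz=SAMPLE_RATE_HZ):
    """How much audio one buffer holds, in milliseconds."""
    return 1000.0 * samples / rate_hz


def reads_per_second(samples, rate_hz=SAMPLE_RATE_HZ):
    """How many blocking reads per second are needed to keep up."""
    return rate_hz / float(samples)


for samples, cpu in measurements:
    print("%6d samples  %7.1f ms  %6.2f reads/s  ~%d%% CPU"
          % (samples, buffer_duration_ms(samples), reads_per_second(samples), cpu))
```

Going from a 20ms buffer to a 5s buffer cuts the loop from 50 reads per second to one read every five seconds.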
I still have some questions about the hardware interface to the audio driver. It would be really interesting to know if it’s a memory mapped system, that triggers an interrupt when a certain buffer size is achieved, or if the CPU busy-waits on the audio device and fills a buffer. Without knowing the answer to this question I think it’s tough to create really effective audio listeners. As always, with API abstraction it’s hard to nail down what the most efficient implementation would look like without knowing the details underneath the hood.
Rudimentary DigiKey Part Scraper
DigiKey has done a wonderful thing for those of us who tinker with electronics. It allows us to purchase small quantities of literally millions of different parts. They have an extremely comprehensive database full of almost every electronic part in production today. What they are lacking, however, is a proper API.
To this end I have created a simple wrapper around the DigiKey HTML interface suitable for scraping pricing and stock levels for items, based on DigiKey part number. I’ve posted this code in a GitHub repo here.
For an example on how to use this API see below:
using DigikeyApi;

Product p = Scraper.scrapePart("PART NUMBER HERE-ND");
if (p != null) {
    Console.WriteLine("{0}- {1} in stock @ ${2} for 100.",
        p.partNumber, p.quantityAvailiable, p.calculatePrice(100));
}
else {
    errorParts.Add(part);
    Console.WriteLine("Part lookup failed.");
}
Using Tweepy to access the Twitter Stream
I’ve dived head-first into Python lately as part of a new natural language processing project. Part of this project involves collecting tweets and inserting them into a database for later analysis. To accomplish this goal I set out to find a good API wrapper that was already written. I found Tweepy on GitHub and it seemed to do the trick. The problem is that most of the code was written before OAuth became a requirement, and while it supported accessing the Twitter stream, there was no solid example. To that end I’ve examined the source and written an example that does just that.
Ensure you clone or download the code directly from the Git repo. The current stable release does not include support for OAuth in the stream module.
OAuth Authentication
The first step in using OAuth is to obtain your API keys. They can be obtained on the Twitter App Developers site. You’ll need both a consumer key and an access token, with their respective secrets, to communicate successfully. Once you have downloaded Tweepy and obtained the keys you can start a new Python script and create an instance of the API module.
import tweepy
auth1 = tweepy.auth.OAuthHandler('CONSUMER KEY','CONSUMER SECRET')
auth1.set_access_token('ACCESS TOKEN','ACCESS TOKEN SECRET')
api = tweepy.API(auth1)
At this point you have a fully authenticated API module. To test things out you can post a tweet to your account.
api.update_status('This is a test!')
If all is good, you’ll see this tweet appear when you visit your homepage on Twitter.
Creating a Stream Listener
The Twitter stream operates by holding open an HTTP connection and continuously sending JSON packets across it containing a structure that represents the most recent tweets. This stream is consumed by the Tweepy module asynchronously and is acted upon by a callback class implementation called StreamListener. In order to process tweets we’ll have to implement this class. The following example implements the on_status method, and simply inserts the tweet into a database. For simple data collection purposes this should be adequate. Additionally we’ll display the tweet on the screen using the TextWrapper class for debugging and observational reasons.
from textwrap import TextWrapper
import MySQLdb as mdb  # assumed: "mdb" is the MySQLdb module

class StreamListener(tweepy.StreamListener):
    status_wrapper = TextWrapper(width=60, initial_indent=' ', subsequent_indent=' ')
    conn = mdb.connect('localhost', 'dbUser', 'dbPass', 'dbBase')

    def on_status(self, status):
        try:
            cursor = self.conn.cursor()
            # Pass the query parameters as a one-element tuple.
            cursor.execute('INSERT INTO tweets (text, date) VALUES (%s, NOW())', (status.text,))
            # Commit so the insert persists on transactional tables.
            self.conn.commit()
            print(self.status_wrapper.fill(status.text))
            print('\n %s %s via %s\n' % (status.author.screen_name, status.created_at, status.source))
        except Exception as e:
            # Ignore any errors (for example unicode errors while printing
            # to the console) to avoid breaking the application.
            pass
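One detail worth checking in a listener like this is how the query parameters are passed. Python database drivers that follow the DB-API generally expect a sequence of parameters, and `(text)` without a trailing comma is just a string, not a tuple. The sketch below uses the standard library's `sqlite3` as a stand-in for the MySQL database (an assumption made so the example runs anywhere); note that `sqlite3` uses `?` placeholders where MySQLdb uses `%s`.

```python
import sqlite3

# In-memory database as a stand-in for the MySQL server used in this post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (text TEXT, date TEXT)")


def store_tweet(conn, text):
    """Insert one tweet using a parameterized query."""
    # The trailing comma makes (text,) a one-element tuple.
    conn.execute(
        "INSERT INTO tweets (text, date) VALUES (?, datetime('now'))",
        (text,),
    )
    conn.commit()


store_tweet(conn, "hello world")
rows = conn.execute("SELECT text FROM tweets").fetchall()
print(rows)
```

Letting the driver substitute the parameter, rather than building the SQL string by hand, also keeps quotes inside tweets from breaking the query.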
Tying it all together
Finally we use the stream API and start capturing tweets. With all the framework put in place it’s a simple matter of setting a list of keywords to filter with and calling the filter method of stream.
l = StreamListener()
streamer = tweepy.Stream(auth=auth1, listener=l, timeout=3000000000)
setTerms = ['hello', 'goodbye', 'goodnight', 'good morning']
streamer.filter(None, setTerms)
Alternatively, to get a sample of all incoming tweets (something like 1% of the total) you can use the streamer.sample() method. The filter method accepts, with default API permission levels, up to 400 keywords to filter on. Its first parameter accepts a list of interesting people to follow, and the second the list of keywords. Using both parameters together results in an OR of the terms: a tweet is delivered if it matches either list.
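To make the OR behaviour concrete, here is a small local illustration in Python. It is not how Twitter implements matching (the real service has its own tokenization and phrase rules); the dictionary shape with `user_id` and `text` keys and the function name are hypothetical, chosen only for this sketch.

```python
def matches_filter(tweet, follow=None, track=None):
    """Return True if the tweet matches the follow list OR the track list."""
    follow = follow or []
    track = track or []
    # Match on author...
    by_author = tweet["user_id"] in follow
    # ...or on any keyword appearing in the text (naive substring match).
    text = tweet["text"].lower()
    by_keyword = any(term.lower() in text for term in track)
    return by_author or by_keyword


tweets = [
    {"user_id": 1, "text": "Good morning everyone"},
    {"user_id": 2, "text": "Nothing to see here"},
    {"user_id": 3, "text": "Still nothing"},
]
selected = [t["user_id"] for t in tweets
            if matches_filter(t, follow=[2], track=["good morning"])]
print(selected)
```

The first tweet is selected by keyword, the second by author, and the third by neither.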
After running your stream listener for a few days you’ll have more than enough data to do some natural language processing with NLTK!
.Net 4.0 Supports Tuples
A quick read on how the .Net team implemented the Tuple data type, my new favorite .Net structure. It’s very interesting to read how these guys go about implementing something like this:
Building Tuple
The Reverse Geocache Build Log – Part 2
Just to give an update on this project, the other day I finally got my hands on a milling machine and milled the box to the appropriate dimensions.
The cuts turned out really nice. The next step is to varnish or paint the box, because the wood is starting to dry out and this will lead to cracking. I also moved to a smaller button. Its profile is small enough to fit entirely within the lid, giving me the chance to perhaps cover all the electronics in the top.
I’ve also headed down the path of writing some geolocation code for the Arduino. Once it’s done, or at least in a rough form, I’ll post it on GitHub for consumption.
Announcing Kablamo – High Performance Data Collection
With the proliferation of high-bandwidth wireless sensors in modern life, the proper design of a mechanism to store and render large sets of data at a high rate of throughput becomes critical. Without it, these high-resolution sensor data overlays cannot be effectively used to further environmental and scientific goals.

To this end I have started building a Java-based server for high speed data collection from large numbers of sensors. The server is called Kablamo, currently hosted on Google Code here. The goals of this project are to design a system that scales elegantly, handles large numbers of concurrent connections, and provides a rich API for developers, while also utilizing something very basic for communications from embedded devices.
Currently this project is in pre-alpha stages. The code isn’t completely functional yet, but so far Grizzly has been successfully implemented to leverage the power of NIO, and I’ve also checked in a test application written in C# to test the basic implementation of the server. Next comes setting up Cassandra and writing code to interact with the database driver.
It’s a very cool project. The idea is to create large, scalable networks of context-aware sensors, and provide that data through a rich API and web-based interface for consumption. This is a big problem; handling large numbers of devices is far from trivial.
Manually Controlling Arduino I2C Lines
Recently I found myself working with a device that requires a pulse on the I2C SDA line to wake it up, before sending the first command. I was interfacing with this device using the Arduino platform to get up and running and found myself at a loss when I dug through the Wire library for a way to manually toggle the line.
It turns out that as long as the TWI (two-wire interface) hardware has control of the I2C bus, there is no way to manually toggle the lines. The solution was to temporarily disable the TWI, toggle the line, and re-enable it. The code to perform this action is as follows:
void wakeSensor() {
    // This command serves as a wakeup to the sensor.
    // You'll have to look up the registers for your specific device, but the idea here is simple:
    // 1. Disable the I2C engine on the AVR
    // 2. Set the Data Direction register to output on the SDA line
    // 3. Pull the line low for ~1ms to wake the micro up, then release it
    // 4. Re-enable the I2C engine and wait a millisecond
    TWCR &= ~(1<<2);   // Disable I2C Engine
    DDRC |= (1<<4);    // Set PC4 pin to output mode
    PORTC &= ~(1<<4);  // Pull pin low
    delay(1);
    PORTC |= (1<<4);   // Pull pin high again
    TWCR |= (1<<2);    // I2C is now enabled
    delay(1);
}
Hopefully someone finds this function as useful as I have! It will have to be modified for your specific microcontroller. I’m using an older Duemilanove board, so your mileage may vary.
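If the register manipulation above looks cryptic, the two idioms it relies on are "clear one bit" (`&= ~(1<<n)`) and "set one bit" (`|= (1<<n)`). The Python sketch below applies them to a plain 8-bit value so the effect is easy to see. The starting value is arbitrary and chosen only for illustration; it is not read from real hardware.

```python
TWEN_BIT = 2  # bit toggled in TWCR by the function above
SDA_BIT = 4   # PC4, the bit toggled in DDRC and PORTC above


def set_bit(reg, bit):
    """Equivalent of `reg |= (1 << bit)` on an 8-bit register."""
    return (reg | (1 << bit)) & 0xFF


def clear_bit(reg, bit):
    """Equivalent of `reg &= ~(1 << bit)` on an 8-bit register."""
    return reg & ~(1 << bit) & 0xFF


twcr = 0b01000101                 # arbitrary example value with bit 2 set
twcr = clear_bit(twcr, TWEN_BIT)  # "disable": only bit 2 changes
print(format(twcr, "08b"))        # prints 01000001
twcr = set_bit(twcr, TWEN_BIT)    # "re-enable": back to the original value
print(format(twcr, "08b"))        # prints 01000101
```

All other bits in the register are left untouched, which is why the function can disable and re-enable the bus without disturbing the rest of the configuration.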
Quick thoughts on Strong AI
With IBM’s Watson in the mainstream news, ushering in a new age of AI, it has started to really push up against the questions of what exactly AI is, what it means to us, and when it will come our way. Traditionally I’ve always imagined AI as an all or nothing discovery. One day I imagine we don’t have AI, and then someone makes a big breakthrough, and the next day machines are writing poetry, building better machines, and generally working autonomously to enslave the human race.
Installing Glassfish and mySQL on a Ubuntu 10.1 VM
I use an Ubuntu virtual machine with Sun’s excellent VirtualBox utility. I want to develop Java servlets, and to do this I need an application server that supports Java. After considering my options I decided on Glassfish with a mySQL database backend, but how do I install it? Easy…
Installing mySQL
This was dead simple:
sudo apt-get install mysql-server
sudo apt-get install mysql-admin
The installer will prompt you for a root mySQL password; set one and write it down somewhere safe. After you’re done you can run mysql-admin to configure your server.
Install Java / Glassfish
Again, easier than expected! Navigate over to Oracle’s website and grab their Java EE development kit here to obtain Java SE 6, the JDK, and Glassfish at the same time.
