Fixing android-wifi-tether
When I read about the wonderful work developers do, my number one complaint is the lack of documentation of the process that led to the end result. I love the fruits of their labor, but I'm just as interested in how they got there and what they learned along the way.
With that in mind, after recently helping in a small way to fix the wonderful android-wifi-tether application, I'd like to share what I found along the way and the steps I took to get there.
Assessing the situation
The first thing I did was to troubleshoot what exactly the issue was. From the existing bug reports I learned that there were some serious issues with communication, and that they were solved by running a ping from the device to the connecting laptop. This immediately set me off: pings by themselves do not have magical characteristics, so I fired up Wireshark to troubleshoot. Wireshark is a network protocol analyzer that lets you monitor network traffic going to and from your machine (and in some cases other machines).
In addition to this information I identified the chipset of the wireless driver, the bcm4329, and realized that many other phones had the same chipset. The HTC Desire was having similar problems, so this obviously wasn't an isolated incident; the Nexus One, however, has the same chip but no problems. Because all these phones share the same driver, much information was available.
Finally I took a look at the Sprint hotspot utility included with the phone. It wasn't having an ounce of trouble, so this was obviously possible and already implemented on this phone. I captured some kernel debug logs and started hacking away at it.
The Art of Problem Solving
Next I had to bring all this information together and figure out what the heck was going wrong. I will let you in on the developer's secret weapon at this point: their notes. It is absolutely necessary to keep good notes while debugging a large, complex problem like this one, or else you will end up chasing your tail. To do this I use a combination of pen-and-paper and Microsoft OneNote.
Collect absolutely everything you figure out and refer to it often. This will save you a lot of time and make life easier when you go to post your findings on a newsgroup.
The Actual Problem
So what was the actual problem? After much chasing it boiled down to poor WiFi drivers. When the kernel module is loaded, an optional parameter, firmware_path, is passed to it. The driver grabs the firmware file at this path and loads it into the Broadcom chip for operation. This was done so that firmware bugs can be fixed in future releases without changing the driver, and to support the exact kind of dynamic loading behavior that we were looking for.
After installing tcpdump on the Android device and doing a capture, I realized that the packets were not in fact making it to the device. As I watched the traffic, my laptop would repeatedly send ARP requests to the broadcast address, but the phone would never show them, not even in the tcpdump. It wasn't ignoring the ARP requests; they were being filtered at a very low level. Why this was happening is a mystery to me at the moment, although I have a feeling it has to do with power-management technology.
Here's the summary:
- Laptop sends broadcast packet to device (ARP or DHCP request)
- Device completely ignores laptop.
- Change laptop to static IP address, start constant ping from device to laptop
- Device ignores laptop 99% of the time
Obviously something was blocking broadcast packets from being received on the device (ARP and DHCP requests are not TCP; they are sent to the broadcast address before any connection exists). This to me seems like power-saving technology hard at work.
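To make it concrete what kind of packet the phone was dropping, here is a minimal Python sketch that builds the ARP request a laptop sends when it looks for the phone. The MAC and IP addresses are made-up placeholders, not values from my captures. The point is only that the frame is addressed to ff:ff:ff:ff:ff:ff at the Ethernet layer, which is why anything that drops broadcasts low in the stack keeps it out of tcpdump as well.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6   # Ethernet broadcast address
ETHERTYPE_ARP = 0x0806        # EtherType value for ARP


def build_arp_request(src_mac, src_ip, target_ip):
    """Build a raw Ethernet frame carrying an ARP 'who-has' request."""
    # Ethernet header: destination, source, EtherType
    eth_header = BROADCAST_MAC + src_mac + struct.pack("!H", ETHERTYPE_ARP)
    # ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # hardware length 6, protocol length 4, opcode 1 (request)
    arp_header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp_body = (
        src_mac + socket.inet_aton(src_ip)           # sender MAC and IP
        + b"\x00" * 6 + socket.inet_aton(target_ip)  # target MAC unknown, target IP
    )
    return eth_header + arp_header + arp_body


def is_link_layer_broadcast(frame):
    """True when the frame's destination MAC is the broadcast address."""
    return frame[:6] == BROADCAST_MAC


# Placeholder addresses, for illustration only
frame = build_arp_request(b"\x02\x00\x00\x00\x00\x01", "192.168.2.10", "192.168.2.254")
print(len(frame), is_link_layer_broadcast(frame))
```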
That sucks, what a dead-end. The key came when I looked at the different firmware files sitting in the /etc/firmware directory. For whatever reason, HTC has included one labeled "fm_bcm4329_ap.bin". Now, I don't know where you come from, but in my world that stands for access point.
There was much finger-crossing as we loaded the driver with the _ap.bin firmware and tested android-wifi-tether. Lo and behold, it worked! In my mind we honestly got really lucky: the patch to the application was written in 30 seconds and the issue was grossly superficial. If HTC was trying to keep us from unlocking this feature, they certainly didn't try very hard.
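The swap amounts to unloading the WiFi module and loading it again with firmware_path pointing at the other firmware file. Here is a rough Python sketch of the commands involved. The module name and the module path are assumptions on my part (they are not spelled out above and will differ between ROMs), and the firmware path in the usage lines is a placeholder too; only the firmware_path parameter name comes from the driver.

```python
def build_reload_commands(module_name, module_path, firmware_path):
    """Return the two argv lists that reload the module with a chosen firmware."""
    return [
        # Unload the module that is currently running
        ["rmmod", module_name],
        # Load it again, passing the firmware file as a module parameter
        ["insmod", module_path, "firmware_path=" + firmware_path],
    ]


# Hypothetical values: adjust all three for the actual device
commands = build_reload_commands(
    "bcm4329",
    "/system/lib/modules/bcm4329.ko",
    "/etc/firmware/access_point_firmware.bin",
)
for argv in commands:
    print(" ".join(argv))
```

On a rooted device each list could be handed to subprocess.run from a root shell. The sketch only prints the commands, so it is safe to run anywhere.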
Next Stop: Infrastructure Mode?
The buzz has been that 2.2-Froyo is going to include built-in tethering support, and the world will rejoice with happiness as carriers eliminate their lucrative data plans. I don't see much changing: the carriers won't have to write their own WiFi apps, but I'm certain they will want to monetize the feature. If what AT&T just pulled shows us anything, it is that the carriers are not out to help us; they are out to nickel and dime us to death, and they will continue to do so until someone comes along with a better business model.
I think there will be a continuing role for hotspot applications in the future, so I believe that taking the time to create the code necessary for infrastructure mode is a worthy investment. While examining the bcm4329.ko driver source code in the bravo kernel I've found some pretty interesting calls surrounding SOFTAP mode, as it is called. They are easy to trace by logging the kernel messages. The Sprint hotspot application takes a completely different approach to initializing:
From here we can work out that the best way to get access point functionality out of this guy is to use the ioctl found in the bravo source code. The system makes a very poorly documented ioctl call with SIOCSIWPRIV, an Android-specific enhancement to the standard wireless drivers in Linux, and specifically calls an AP_PROFILE_SET from the pointer blob. This executes wl_iw_setap and from there the whole show starts.
The goal from here is to write a small C application that executes an ioctl to call this handler and initialize the AP hotspot mode, just as the Sprint app does. Experimenting with iwconfig shows the _ap.bin firmware supports most of the functionality needed, but I think that in order to really knock it into AP mode the undocumented Android ioctl must be made.
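Before committing to C, the shape of that call can be prototyped in Python. Treat the sketch below as a guess at the plumbing, not a working hotspot switch: SIOCSIWPRIV (0x8B0C) and the iwreq layout come from the standard linux/wireless.h header, but the interface name and the payload are placeholders. I have not yet worked out the exact buffer that wl_iw_setap expects, so the bare AP_PROFILE_SET string in the usage lines only marks where the real data would go.

```python
import array
import fcntl
import socket
import struct

SIOCSIWPRIV = 0x8B0C   # from linux/wireless.h
IFNAMSIZ = 16          # size of the interface name field
IWREQ_SIZE = 32        # 16-byte name + 16-byte union


def build_iwreq(ifname, payload):
    """Pack a struct iwreq whose data member is an iw_point
    (pointer, length, flags) that refers to the payload buffer."""
    name = ifname.encode("ascii")
    if len(name) >= IFNAMSIZ:
        raise ValueError("interface name too long")
    buf = array.array("B", payload)   # must stay alive while the ioctl runs
    address, length = buf.buffer_info()
    # "P" is a native pointer, "H" an unsigned 16-bit integer
    point = struct.pack("PHH", address, length, 0)
    request = (name.ljust(IFNAMSIZ, b"\0") + point).ljust(IWREQ_SIZE, b"\0")
    return request, buf


def send_private_command(ifname, payload):
    """Issue the ioctl; needs root and a driver that handles SIOCSIWPRIV."""
    request, buf = build_iwreq(ifname, payload)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        fcntl.ioctl(sock.fileno(), SIOCSIWPRIV, request)
    return buf.tobytes()


# Placeholder interface name and payload; nothing is sent here
request, _buf = build_iwreq("eth0", b"AP_PROFILE_SET")
print(len(request))
```

Only build_iwreq runs in the usage lines. send_private_command is there to show where the ioctl would happen once the payload format is known.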
Look promising? I'll keep you updated!


