Category: Programming


I’m tired of all the Twitter API applications that force users to enter their Twitter usernames and passwords to log in. Why? Well, because it’s just so insecure!

Many people realized this for web applications some time ago, so now, most Twitter web services use OAuth to connect with the Twitter API. But for some reason, this hasn’t been done with desktop apps: most of them haven’t embraced OAuth.

Of course, there may be less of a security risk of providing your username and password with desktop applications, as one might think that there’s less of an opportunity for the provider of the Twitter app to capture and keep the username and password.

Yet, there’s still a huge risk of packet sniffing!

As an example, I opened Seesmic for Windows and started Fiddler2 to capture my traffic. Here’s what I found:

GET https://api.twitter.com/1/account/verify_credentials.json HTTP/1.1
X-Twitter-Client: Seesmic
X-Twitter-URL: http://www.seesmic.com
Authorization: Basic [REDACTED]
User-Agent: Seesmic
Host: api.twitter.com
Connection: Keep-Alive
That’s the request itself. Guess what happens if you decode that authorization token?
Authorization Header is present: Basic [REDACTED]
Decoded Username:Password= maximz2005:[REDACTED]

Oh look, it’s my username and password! Well, I hid the password here so that no one takes it… :)

Yet, there’s nothing to laugh or smile about here. Traffic from desktop Twitter apps that request a username and password is out in the open.

This all shows one major point: we must embrace OAuth. The OAuth system uses application tokens: in this case, Twitter creates a token for the application, the application connects using the token, and, if the user, at any point, wants to remove the application from his account, the user can just revoke the token, effectively revoking access for the application.

Twitter realized that this system would greatly increase security. Now, they’re phasing out the HTTP Basic Authentication (username & password) connection to their API, with a cut-off date of June 2010.

If you’re a desktop Twitter app developer: do you want your users to be out in the open? Do you want them to feel and be insecure? It’s not a great idea, though many apps are still doing this. Please, use OAuth, for your own benefit, and for that of your users, too!

Update 05/30/2010: The Twitter API Team has now instated a cut-off date for HTTP Basic Authentication connections. By June 30, 2010, a month away, all applications that utilize the Twitter API must switch to OAuth. Thankfully, OAuth isn’t a hard thing to implement, so we can expect a lot of our favorite apps to switch before time runs out. See http://countdowntooauth.com/ for more information.

Every month, StackOverflow, the brilliant and completely open question and answer site for programmers, releases a Creative-Commons licensed dump of their database. You can download this huge dump yourself, but it’s typically a very large file, and then comes the hassle of getting the XML files into a database of your choice. Now, there is a way to get around those inconveniences, by simply bypassing the whole download procedure and querying the data dump in the cloud! That’s what StatOverflow is for. Just go to the sandbox, enter an SQL query, and voila! Also, there’s a  quite helpful meta.stackoverflow post which explains the anatomy of the dump, so that querying it is much easier. Enjoy!

In my last post, I explained what truly went on in the Twitpocalypse, which dealt with tweet IDs passing the limits of the 32-bit signed integer (from -2,147,483,647 to +2,147,483,647), which is the most common datatype in use in computer applications today. Yet, whilst computer science has limits embedded into its structure, Twitter does not – the number of users and tweets is growing, and FAST! In September, the Twitter API team estimates, we will pass the limits of the 32-bit unsigned integer which is up to 4,294,967,295. Thus, applications and libraries that utilise that datatype in conjunction with tweet identification numbers in the Twitter API will fail, once again. In the post where I explained the first Twitpocalypse, I recommended that all applications switch to the 64-bit signed (or unsigned even) datatype, and then maybe even consider storing the IDs as strings, with variable length (no arithmetic between IDs is really needed, so this should be extremely easy). The Twitter API team is doing the same thing – it has now informed developers that it is best to switch to a 64-bit implementation for storing those integers. Hopefully, this time, we will be prepared for the upcoming Twitpocalypse, and even less confusion and problems will arise when we pass that huge number of tweets. I myself am switching to a string implementation for the IDs in the Twitter client that I am developing (more info coming soon!), so that no matter how many tweets have been posted, the application will not fail. Good luck, everyone, in surviving this next round of this problem. [@twitterapi 's tweet; Twitter API Announcements (via TechCrunch)]

Numerous friends have asked me about this whole Twitpocalypse thing, as they just don’t get it. “What’s this whole deal with signed integers?”, they ask. Well I’m here to clear that up, as I do programming, and in programming, if you don’t know such basics of computer science, you’re doomed.

What the predicters of the Twitpocalypse were referring to when they mentioned the limit of 2,147,483,647, is the limits of the signed integer data type. The signed integer is a 32-bit data type, so it can hold only so large (or so small) a number – in this case, the highest number the data type can preserve in 32-bits is 2,147,483,647, and the lowest number is -2,147,483,647, respectively. This is what causes the Twitpocalypse problem, as well as the foreseen Unix Year2038 problem (where Unix epoch time will cross the limit, and distros using the 32-bit data type will fail.)

Each tweet in the Twitter infrastructure has its own numeric ID. Thus, programs that invoke the Twitter API’s ID interface into a 32-bit variable fail, as the number of tweets has passed the limit. When this happened, some applications failed, others continued working, depending on their implementation of this feature. Of course, updates were quickly rushed to the failing products, so that users wouldn’t be discouraged or affected for too long. Most applications now use the 64-bit integer datatype (commonly referred to as long or int64) for the ID variable, which supports numbers from −9,223,372,036,854,775,808 to +9,223,372,036,854,775,807 – in other words, pretty large. Until then, we’re safe, but if we manage to cross that 9 quintillion mark (well, it’s certainly possible! :) ), we may be forced to convert to the 128-bit integer time, or even storing numbers as strings (the datatype for any types of characters, mostly letters), but I think computing will have changed a bunch by then, so that we might not even think of such a problem. That is, if Twitter lasts till then, or maybe the world will have been sucked into a black hole. We’ll see.

(On a side note, last year, I created a VeryLargeInt class for storing huge numbers (100 digits long can be stored with this, or even more!) by using the string data-type, and I also built algorithms for computation with strings, including most common mathematical operators. I may publish this project here in the future.)

I’ve always wanted to experiment with creating “sessions” on websites to mimic real users browsing and using a site, and I’ve finally gotten around to a project that involves this: my goal is to “wrap” an online database that is queried through a series of HTML forms, returns a map, and then allows you to open a popup with an HTML table outlining all the points displayed on the map. A formidable challenge, indeed, but in the awesome language of C#, no big deal.

To successfully establish a user session with a website, you must preserve the correct authentication and session cookies. The web database was built using ASP.NET, so I needed to make sure that I had an ASP.NET Session ID. One of the best ways to find out where cookies are sent to you is using an HTTP proxy to analyze requests and responses. The program of choice is called Fiddler, developed by Microsoft. (You can get the latest version at http://fiddler2.com/.) Unfortunately, I was forced to use Firefox to track packets because the old (straight out of 1997) interface  recognizes the new and revolutionary browser Google Chrome as being Safari 1.3 (how sad), and doesn’t allow you to use their database. and you must set special proxy settings in Firefox for Fiddler to function correctly. After visiting the site and filling out the series of HTML forms, I could see how the report page was functioning – it was using the ASP.NET Session ID along with referrer HTTP header attributes to  find the parameters of my original query and retrieve information from the database.

That might sound like an easy interface to hack, but the cookies are deceptive as you don’t exactly know when they are set. Fortunately, after scouring the internet for an hour or so, I happened to find out about the CookieContainer datatype (HttpWebRequest.CookieContainer). All you have to do to create a cookie “jar” that carries over from one HTTP session to another is writing the following code:

CookieContainer cookieJar = new CookieContainer();
var request=(HttpWebRequest) WebRequest.Create("http://google.com");
request.CookieContainer = cookieJar;

Now you can access the cookies by looping through the CookieContainer in a simple foreach statement. To apply this cookie container to your next HttpWebRequest, all you have to do is:

 request.CookieContainer = cookieJar; 

Yay!

If you’re a .NET developer, admit it: you’ve at least once felt curious about how the underlying APIs behind the .NET Framework are coded, how certain functions are implemented, or whether one core function is more efficient that another. Well, this is where the trusty program named .NET Reflector comes in. As the title suggests, Reflector uses .NET reflection and ILDASM processing. It is the decompiler of the .NET world, and allows you to view source code of DLLs or executables in multiple languages (including C#, VB, C++, Delphi, IL, and more) with ease. It is very interesting to examine how the brilliant programmers at Microsoft sculpted the inner workings of the programming framework that drives many applications today. With this tool, you can accomplish just that and much much more. The Reflector was originally created by Lutz Roeder, but has been acquired by Red Gate Software. The program is free for all to download and use, and it can even Reflect upon itself! Now that is true irony. You can snag a copy over at the Red Gate site.

ThinkGeek has updated the warfare category of their catalog with a new USB-powered missile launcher. Its perks: it has a built-in laser to guide the missiles, it has a very powerful software frontend, and most importantly, it offers an API!!! Finally, developers can terrorize others by coding applications that make use of the firing capibilties of the launcher. The API is built in .NET, so using it is a blast – a sample application is included, complete with raw C# code. And just because of said API, I am willing to slop down 40 bucks for this widget. You can find out more about the product and download the API for free on the product page.  It’ll be interesting to run the DLL through .NET Reflector, and if I find anything intriguing about the way it’s built, I’ll post it here. Have fun terrorizing people! :)

Copyright © 2010 Maxim Zaslavsky. All Rights Reserved.

Videos, Slideshows and Podcasts by Cincopa Wordpress Plugin