In my last post, I explained what truly went on in the Twitpocalypse, which dealt with tweet IDs passing the limits of the 32-bit signed integer (from -2,147,483,647 to +2,147,483,647), which is the most common datatype in use in computer applications today. Yet, whilst computer science has limits embedded into its structure, Twitter does not – the number of users and tweets is growing, and FAST! In September, the Twitter API team estimates, we will pass the limits of the 32-bit unsigned integer which is up to 4,294,967,295. Thus, applications and libraries that utilise that datatype in conjunction with tweet identification numbers in the Twitter API will fail, once again. In the post where I explained the first Twitpocalypse, I recommended that all applications switch to the 64-bit signed (or unsigned even) datatype, and then maybe even consider storing the IDs as strings, with variable length (no arithmetic between IDs is really needed, so this should be extremely easy). The Twitter API team is doing the same thing – it has now informed developers that it is best to switch to a 64-bit implementation for storing those integers. Hopefully, this time, we will be prepared for the upcoming Twitpocalypse, and even less confusion and problems will arise when we pass that huge number of tweets. I myself am switching to a string implementation for the IDs in the Twitter client that I am developing (more info coming soon!), so that no matter how many tweets have been posted, the application will not fail. Good luck, everyone, in surviving this next round of this problem. [@twitterapi 's tweet; Twitter API Announcements (via TechCrunch)]
Tag Archive: signed integer
The Twitpocalypse Returns
An Explanation of the Averted Twitpocalypse
Numerous friends have asked me about this whole Twitpocalypse thing, as they just don’t get it. “What’s this whole deal with signed integers?”, they ask. Well I’m here to clear that up, as I do programming, and in programming, if you don’t know such basics of computer science, you’re doomed.
What the predicters of the Twitpocalypse were referring to when they mentioned the limit of 2,147,483,647, is the limits of the signed integer data type. The signed integer is a 32-bit data type, so it can hold only so large (or so small) a number – in this case, the highest number the data type can preserve in 32-bits is 2,147,483,647, and the lowest number is -2,147,483,647, respectively. This is what causes the Twitpocalypse problem, as well as the foreseen Unix Year2038 problem (where Unix epoch time will cross the limit, and distros using the 32-bit data type will fail.)
Each tweet in the Twitter infrastructure has its own numeric ID. Thus, programs that invoke the Twitter API’s ID interface into a 32-bit variable fail, as the number of tweets has passed the limit. When this happened, some applications failed, others continued working, depending on their implementation of this feature. Of course, updates were quickly rushed to the failing products, so that users wouldn’t be discouraged or affected for too long. Most applications now use the 64-bit integer datatype (commonly referred to as long or int64) for the ID variable, which supports numbers from −9,223,372,036,854,775,808 to +9,223,372,036,854,775,807 – in other words, pretty large. Until then, we’re safe, but if we manage to cross that 9 quintillion mark (well, it’s certainly possible!
), we may be forced to convert to the 128-bit integer time, or even storing numbers as strings (the datatype for any types of characters, mostly letters), but I think computing will have changed a bunch by then, so that we might not even think of such a problem. That is, if Twitter lasts till then, or maybe the world will have been sucked into a black hole. We’ll see.
(On a side note, last year, I created a VeryLargeInt class for storing huge numbers (100 digits long can be stored with this, or even more!) by using the string data-type, and I also built algorithms for computation with strings, including most common mathematical operators. I may publish this project here in the future.)
