07-01-2008

Gnip is Ping Spelled Backwards

by Brad Feld

One of our recent investments, Gnip, has launched its first service. Gnip is all about making data portability suck less. Remember our friend the Ping server? Gnip is the next generation of it designed specifically for the many-to-many webservice-to-webservice world.

Data sharing between two services is no big deal. It happens every day using RSS, Atom, XMPP, and open APIs. Two years ago this was still a novel idea, and the power of an API could propel a company like Twitter into the geek mainstream overnight since it was easy to write a web app that could read from it and write to it. Or you could grab someone’s RSS feed and go to town with it. A ping server was really helpful here – rather than having to constantly check everyone’s RSS feeds, you could check the ping server and, when something you cared about changed, go to the RSS feed.
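To make the ping server idea concrete, here is a minimal in-memory sketch. The class name, method names, and example.com feed URLs are all made up for illustration; a real ping server speaks a network protocol and this toy does not.

```python
class PingServer:
    """Toy ping server: publishers report changes, consumers ask what changed."""

    def __init__(self):
        self._clock = 0          # logical clock, bumped on every ping
        self._last_changed = {}  # feed URL -> clock value of its latest ping

    def now(self):
        return self._clock

    def ping(self, feed_url):
        # A publisher calls this when its feed has new content.
        self._clock += 1
        self._last_changed[feed_url] = self._clock

    def changed_since(self, watched, checkpoint):
        # A consumer asks once, instead of fetching every feed it watches.
        return sorted(
            url for url in watched
            if self._last_changed.get(url, 0) > checkpoint
        )


server = PingServer()
watched = {
    "http://example.com/a.rss",
    "http://example.com/b.rss",
    "http://example.com/c.rss",
}
checkpoint = server.now()
server.ping("http://example.com/b.rss")          # only feed b changes
to_fetch = server.changed_since(watched, checkpoint)
```

The consumer makes one call to the ping server and then fetches only the feeds in `to_fetch`, rather than fetching all three feeds on every pass.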

One-to-one wasn’t a problem. Many-to-one shouldn’t be a problem, but it started to be uncomfortable at scale. Web services that became popular overnight had performance issues, especially when their APIs were getting hammered. The solution for some was to simply turn off specific services when the load got high, or throttle (limit) the number of API calls in a certain time period from each individual IP address.
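Throttling by IP address can be sketched roughly like this. It is a simple fixed-window counter with invented names and limits, not any particular service's implementation, and it never prunes old counters, which a real one would have to do.

```python
from collections import defaultdict


class FixedWindowThrottle:
    """Allow at most max_calls per IP address in each window of window_seconds."""

    def __init__(self, max_calls, window_seconds):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._counts = defaultdict(int)  # (ip, window index) -> calls seen

    def allow(self, ip, now):
        # 'now' is a timestamp in seconds; passing it in keeps the sketch testable.
        key = (ip, int(now // self.window_seconds))
        if self._counts[key] >= self.max_calls:
            return False  # over the limit: reject the call
        self._counts[key] += 1
        return True


throttle = FixedWindowThrottle(max_calls=3, window_seconds=60)
same_window = [throttle.allow("203.0.113.7", t) for t in (0, 1, 2, 3)]
next_window = throttle.allow("203.0.113.7", 61)
other_ip = throttle.allow("198.51.100.9", 3)
```

The fourth call inside the first minute is rejected, while a call in the next minute or from a different address goes through. The caller who got rejected still needs the data, so the load is postponed rather than removed.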

Hmmm. Many-to-one started to break at scale, but then it got worse. Everyone realized how important it was to have an API and allow for data portability. The many-to-one problem morphed into a many-to-many problem. With this, things started to get hairy. In addition to having to deal with performance and scale, every web service has its own special API format. If you were a new web service and wanted to talk to a bunch of other web services, you had to write a bunch of code – after understanding exactly what you wanted to do with each service.
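Here is a rough picture of that "bunch of code". The two payload shapes below are entirely hypothetical and do not match any real service's API; the point is only that each service needs its own adapter before its data looks like anyone else's.

```python
def from_photo_service(payload):
    # Hypothetical payload shape: {"owner": ..., "photo_title": ...}
    return {
        "user": payload["owner"],
        "action": "posted photo",
        "title": payload["photo_title"],
    }


def from_news_service(payload):
    # Hypothetical payload shape: {"submitter": ..., "headline": ...}
    return {
        "user": payload["submitter"],
        "action": "submitted story",
        "title": payload["headline"],
    }


# One adapter per service you want to talk to; this table grows with every new one.
ADAPTERS = {"photos": from_photo_service, "news": from_news_service}


def normalize(service, payload):
    return ADAPTERS[service](payload)


photo = normalize("photos", {"owner": "alice", "photo_title": "Sunset"})
story = normalize("news", {"submitter": "bob", "headline": "Gnip launches"})
```

Every consumer that integrates directly ends up writing and maintaining its own copy of a table like `ADAPTERS`.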

Scale the number of web services linearly. Scale the number of users linearly. Scale the number of daily interactions linearly. Multiply the three together and you get a nice steep upward-sloping geometric curve (the kind that VCs love). When the number of web services, users on a web service, or daily interactions on a web service grows geometrically, it gets really messy, really fast. Imagine Flickr, Digg, and Plaxo all trying to “talk” with each other’s services on behalf of each of their millions of users with no intermediary and you begin to understand the magnitude of the problem. Throttling back the number of API calls or turning off APIs isn’t the solution because it doesn’t do anything to fix the root cause of the problem.
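Some back-of-the-envelope arithmetic shows the size of the problem. The figures here are illustrative assumptions (one million users per service, one check per user per hour), not measurements from any of the services named above.

```python
def direct_integrations(n_services):
    # Every service talks to every other service, in each direction.
    return n_services * (n_services - 1)


def hub_integrations(n_services):
    # With an intermediary, every service talks to one hub.
    return n_services


def daily_calls_direct(n_services, users_per_service, checks_per_user_per_day):
    # Each directed pairing is polled on behalf of every user.
    return (
        direct_integrations(n_services)
        * users_per_service
        * checks_per_user_per_day
    )


three_way = direct_integrations(3)                   # 3 * 2 = 6
ten_way = direct_integrations(10)                    # 10 * 9 = 90
ten_via_hub = hub_integrations(10)                   # 10
calls_per_day = daily_calls_direct(3, 1_000_000, 24)  # 6 * 1,000,000 * 24
```

Under these assumptions, three services polling each other hourly for a million users each generate 144 million API calls a day. Going from three services to ten takes the directed pairings from 6 to 90, while the hub case only goes from 3 connections to 10.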

Gnip plans to sit in the middle of this and transform all of these interactions back to many-to-one where there are many web services talking to one centralized service – Gnip. Kind of like a ping server. Kind of like a CDN (Content Distribution Network). But different. Yet equally important. Maybe more important.

Gnip’s first service is a free centralized callback server that notifies data consumers (such as Plaxo) in real time when there is new data about their users on various data-producing sites (such as Flickr and Digg). Gnip’s callback service accepts a variety of existing standards including XMPP, Atom, RSS, and REST and provides pre-written convenience libraries for the most common development languages (including PHP, Perl, Python, Ruby, and Java). If you are a data provider, it’ll take you less than an hour to integrate; if you are a data consumer, you can start receiving notifications within a day.
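The callback pattern itself can be sketched in a few lines. This is a generic, in-process illustration with invented names; it is not Gnip's API or its convenience libraries, and a real hub would deliver notifications over the network rather than through function calls.

```python
from collections import defaultdict


class CallbackHub:
    """Toy hub: producers publish activity, subscribed consumers get called back."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # producer name -> list of callbacks

    def subscribe(self, producer, callback):
        # A consumer registers interest in one producer's activity.
        self._subscribers[producer].append(callback)

    def publish(self, producer, user, event):
        # A producer reports new activity once; the hub fans it out.
        notification = {"producer": producer, "user": user, "event": event}
        callbacks = self._subscribers.get(producer, [])
        for callback in callbacks:
            callback(notification)
        return len(callbacks)  # number of consumers notified


hub = CallbackHub()
inbox = []  # stands in for a consumer's notification handler
hub.subscribe("flickr", inbox.append)
hub.subscribe("digg", inbox.append)

notified = hub.publish("flickr", "alice", "uploaded a photo")
unwatched = hub.publish("twitter", "bob", "posted an update")
```

The producer makes one call to the hub no matter how many consumers are listening, and a consumer hears about new data without polling anyone.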

Gnip has a lot more coming that builds on this first service in its quest to make data portability suck less. If you are interested in playing along, head over to the Gnip Community.