Monetizing Entities – a Flanking Maneuver by Google?

Now and then, I see comments from people expressing their doubt that Google Plus will survive long-term. Some of that’s driven, I imagine, by Google’s early failures at building anything even remotely social. Maybe a little of it’s just sour grapes, too. But it got me thinking, which is generally a good thing. Some of you may think the result is a crackpot theory… and you may be right. But it makes enough sense that I think it’s worth some discussion. So crackpot or no, here’s what I came up with.

First, think about this: Google is all about connections – and not just on G+, either. Search is about connections, too. From a top level, that boils down to how entities and attributes may relate to each other.

An entity can be a company, product, location, phone number, URL or an individual, among a great many other things. Attributes are simply that, qualifiers or modifiers of those entities that provide a more descriptive picture of an entity, often in relation to other entities. Using this information, entities can be mapped, according to their relationships to other entities.
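To make the entity/attribute idea a little more concrete, here is a minimal sketch in Python. The sample data and function names are invented purely for illustration, and this says nothing about how Google actually stores or maps its data. Each record is just an entity, an attribute, and whatever that attribute points to.

```python
from collections import defaultdict

# Hypothetical sample data: (entity, attribute, related entity or value).
TRIPLES = [
    ("Acme Corp", "makes", "Acme Anvil"),
    ("Jane Doe", "works_for", "Acme Corp"),
    ("Acme Corp", "located_in", "Springfield"),
    ("Jane Doe", "knows", "John Roe"),
]


def build_graph(triples):
    """Map each entity to the list of (attribute, target) pairs attached to it."""
    graph = defaultdict(list)
    for entity, attribute, target in triples:
        graph[entity].append((attribute, target))
    return dict(graph)


def related(graph, entity):
    """Return the entities or values directly connected to `entity`."""
    return [target for _, target in graph.get(entity, [])]


graph = build_graph(TRIPLES)
print(related(graph, "Jane Doe"))  # ['Acme Corp', 'John Roe']
```

An entity with no attributes recorded simply has nothing to map, which is the gap the rest of this post is about.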

One such mapping of entities and their attributes is the Knowledge Graph (KG). This graph provides Google with a wealth of information, which can be used in many, many different ways. Ad targeting is certainly a huge application of this information, but by no means is it the only use for it.

Gathering the Data

So how is this data gathered? There are certainly plenty of obvious methods of compiling data on entities. The Internet is a sea of accounts, usernames, company names, products… all one needs to do is harvest them. And if there’s anything that Google kicks ass at, it’s harvesting data.

They’re not bad at acquisitions, either – Frommer’s, Picasa, Freebase, FeedBurner, Metaweb, PostRank, Zagat, etc. – Data-mining For Dollars, anyone?

Unfortunately, although there are trillions of entities already to be found on the web, only a very small portion of them have had any attributes assigned to them, establishing their relationship to other entities.

A product may be identified with a company, a user with an employer, a person with another person… but for every entity there can be many attributes, and in order for the KG to be extensively usable, a lot more information must be plugged into it – both entities and attributes.

Which is where I would say Google has been directing much of its effort for quite some time, even before we ever heard the term Knowledge Graph, at least outside of the Googleplex.

The Tools

First of all, we know that Google has several algorithms. However, I suspect that we’re not talking about a handful, or even a dozen or so. I suspect there may be nearly that many for Matt’s Spam Team, alone. The Search Team (now known as the Knowledge Team) probably has a lot more, and I think we’re still only scratching the surface there. I imagine that their sorting and culling of data, in furtherance of the KG, is handled by one of their largest collections of different algos. Because there are so many different uses for the data, in different fashions, I think their processing would be faster and more scalable (not to mention less cumbersome) if handled by many different algos rather than a single massive one.

That’s not really the point though – it’s still a massive undertaking, particularly if pursued strictly in a discovery mode. Having the data provided to them via semantic markup would be much more efficient and would speed up the growth of their graph tremendously.

Unfortunately, the utilization of semantic markup hasn’t really caught on en masse. It’s increasing, but not at the rate that’s needed. What Google needs to do is prompt wider adoption or find a better way to catalog the data themselves.

Laying the Groundwork

Promoting the use of rich snippets didn’t achieve the flood of adoption they may have hoped for, but they didn’t give up. Microformats, microdata and RDFa saw only marginal utilization. They expanded and refined the capability with their acceptance of Good Relations and, but those too, failed to create a stampede to implementation.

Their launch of Google Plus, although much more successful than their earlier forays into the social playground, was still limited in its adoption, initially populated mostly by SEO/SEM folks – the same communities that comprised the majority of the semantic markup promoters.

But was that a surprise to the folks in Mountain View? I think that’s debatable.

Authorship convinced a lot of non-technical people to sign up and tie their Google account to their blog, in hope of getting better visibility in the SERPs. But frankly, it was a cumbersome process, and Google is a lot better at search than they are at writing comprehensible instructions for non-techy types. No stampede, but still probably a respectable turn-out. They even added rel=”publisher” to the mix, for multi-author sites. Again, that seems to have helped, but no stampede on the horizon.

They later added a more simplified process, in which a person could sign up for authorship very easily as long as they had an email address on their domain. I have no idea how much adoption they saw via that tool, but I suspect it was respectable, too.

So a reasonable person might take off his tinfoil cap and think “What are they really trying to do?” Here are some possibilities I came up with:

  1. Get G+ up to speed and start monetizing it by pushing ads at the users;
  2. Get a foothold in social just to keep Facebook from sinking its claws any deeper;
  3. Just use it as a means of incentivizing authorship and further semantic markup;
  4. Use it as a data gathering tool;
  5. Some other purpose, as yet unknown

I’ve heard several SEOs say that they think #1 or #2 is the most probable intent. Many more, though, seem to think that #4 is the real purpose behind G+, even if it is eventually monetized.

I originally leaned toward #4 myself, until very recently. But now I think it’s a two-fold plan. I believe it serves the purposes of both #4 and #5.

So Here’s my Theory on the Real Purpose of Google Plus

Let’s think about what Google might be really doing (while also seeking the Holy Data).

We’ve all seen numerous instances of Google trying to connect entities in the absence of any owner action. Sometimes they get it right, sometimes they don’t. I’m seeing fewer fails recently, in comparison to the period shortly after authorship launched, but then, I’m not paying quite as much attention to it as I was then, either. Maybe they’re getting better at it, maybe not. I suspect they’ve gotten better.

The point is, it was obvious early-on that Google wasn’t content to wait for linked data from users – it was also playing connect-the-dots on its own. But since the data being provided by users was already easily verified or discarded, why would they play around and risk search quality?

And that’s where my imagination got sparked. So here are some What ifs:

  • What if the intention all along was really just to push for amassing a comprehensive entity database as rapidly as possible via algorithms, and the user-generated input was secondary to that effort?
  • What if the algorithms created to find entity connections in the absence of user-generated data are having their results tested against verified UG data in order to refine the algos?
  • In fact, what if they needed a large volume of verified connections against which to test their algorithms – more than they had available?
  • What if there was never any intention to monetize G+ as a stand-alone entity?
  • Finally, what if Google+ was intended all along to be nothing more than the hub of their entity database, with any collateral benefits just considered to be a bonus, and there was another purpose for that database beyond simply being able to better target users with ads?
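The second and third what-ifs describe a familiar evaluation loop: score the connections an algorithm inferred on its own against a set of connections that users have verified. Here is a minimal sketch of that idea in Python, assuming a connection is just an (author, site) pair. The function name and the sample data are hypothetical, and nothing here reflects Google's actual process.

```python
def score_inferred(inferred, verified):
    """Compare inferred connections with verified ones.

    precision: share of inferred connections that appear in the verified set.
    recall: share of verified connections that the algorithm found.
    """
    inferred, verified = set(inferred), set(verified)
    hits = inferred & verified
    precision = len(hits) / len(inferred) if inferred else 0.0
    recall = len(hits) / len(verified) if verified else 0.0
    return precision, recall


# Hypothetical data: connections confirmed by users vs. guessed by an algo.
verified = {
    ("jane", "janesblog.example"),
    ("john", "johnsnews.example"),
    ("ann", "annsshop.example"),
}
inferred = {
    ("jane", "janesblog.example"),
    ("john", "wrongsite.example"),
}

precision, recall = score_inferred(inferred, verified)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

One caveat: an inferred connection that's missing from the verified set isn't necessarily wrong, it just hasn't been confirmed. The bigger the verified set, the more those scores mean, which is exactly why a large pool of user-verified connections would be so valuable.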

Since G+ launched, we’ve all probably seen conjecture about how and when they’ll monetize it. My theory is that they won’t. Ever. In fact, I’ll go a step further and say I think that if they ever shut it down, it won’t be because it failed, but rather, because it served its purpose. And I don’t see that happening any time soon.

Why would I say that? Because I don’t think that G+’s purpose was ever more than to serve as a vehicle. A harvester’s a vehicle, right?

Given the demonstrated slow adoption of semantic markup and the immense value that a significantly large Knowledge Graph would provide to Google’s advertising business, it’s hard to imagine them not having considered it years ago.

And if they decided it was worth pursuing a build-out by their own efforts, what better vehicle could they ask for? On top of that, the incentives they created resulted in a dramatic increase of harvestable data against which they could test their algos and build out their graph.

Then, of course, if another justification for such a build-out suddenly appeared, one that provided opportunities that nobody had ever imagined, the ROI could go through the roof.

And if it really took off and gained enough popularity to become a major social platform (which may still happen), that’s a Plus too. (see what I did there?)

Hey… if it really takes off, maybe they will monetize it! Even a really tasty cupcake can be made tastier with sprinkles on top.

I don’t think any of that gets to the heart of their reasons for developing G+. I think the real motivation for that inter-relational database has an entirely different purpose. I think their primary motivation was born before 2010, having to do with being able to verify relationships between entities. I think it was needed in connection with NSTIC (National Strategy For Trusted Identities In Cyberspace). Google+ was probably primarily developed as a vehicle to enable the company to become a credentialed Identity Provider.

That’s a goal Google achieved late last year, as did PayPal and Equifax.

So, that’s my crackpot theory. Only I don’t think it’s all that crackpot. A lot of money and effort has been put into Google+. Under Google’s normal monetization scheme, it’s a non-player – a drain on resources and funds. From what I’ve seen, aside from their charitable contributions, the Mountain View crowd isn’t in the habit of spending money with no ROI in mind.

Viewed alone, Google+ is a cost center, rather than a profit center. But you can’t really view much that Google does separately. They’re in the business of making money for their shareholders, which means that every expenditure faces a strenuous ROI analysis before the first penny is spent.

From that standpoint, Google Plus must either generate profits on its own, or it must contribute to the generation of profit elsewhere. While the data harvested via G+ certainly might enable them to improve their ad targeting, I doubt that alone would pass the ROI litmus test. They’re already pretty damned good at that. There has to be something else.

Which raises the question: what? If being an ID Provider under NSTIC was a major motivation for building G+, how could that be monetized? Charge the sites that receive ID verification? Charge users for being verified? Charge the government for providing the service?

I’m betting we’ll eventually see a price tag somewhere – the question is, where?

So what do you think? Am I way out in left field, or does my theory make some sense? Do you see Google’s coffers swelling as a result of their new role? If so, how?

If you’re curious about NSTIC (and you should be), here’s a little more info from Kristine Schachinger’s December piece on Search Engine Watch. I’ll have a post on it up there tomorrow, as well.