Therin, a long-standing member of the community, asked me a question earlier that I thought was worth promoting to the main page. Her question was, "Why Google?"
You may or may not know that this entire project started with CanGoogleHearMe.com (http://cangooglehearme.com/chapter1.php), a fairly enjoyable trip to California to sit on Google's doorstep.
As things have progressed with the project, that's a question I've been asked in private, but never answered in public.
I don't know if my reasoning for picking the company I did is a credit to me or a mark against me, but it does illustrate part of the way I think. My response to Therin is below:
There were three obvious companies that could have been considered for CanGoogleHearMe.com: Google, Amazon, and at the time, Microsoft. All three were actively involved in full-text book scanning projects, which was the prerequisite for being considered. All three were working with content from multiple publishers, which was also a tremendous advantage.
Of the three, Microsoft was considered, but was certainly not my first choice. The flavor and history of the company just didn't fit the approach, and in the end, I didn't think Microsoft focused enough on their digitization efforts to really do the project justice. This was confirmed when Microsoft stopped their scanning project a few months later. And finally, I wasn't enthusiastic about working with or for Microsoft.
That left Amazon and Google. This is the choice that seems to confuse most people. Why Google over Amazon? This is where things become more detail-oriented. A large part of it was that Google was facing copyright battles. At the time, the courts had just received a request from various publishers to order Google to halt all scanning projects for the time being. The publishers' argument was that because there was no justifiable use for the full-text content that did not fall inside the realm of copyright infringement, Google should not be allowed to scan books from libraries until after the courts determined what the content could be used for. In response, Google stopped scanning in-copyright books for a few months, partly in order to avoid a full-scale injunction.
The text analysis and recommendation system I was trying to put forward WAS something that could be done without exposing content to a user, and that exposure was one of the key objections publishers were making in the copyright battles. Because it produces transformative graphs and graphical representations, and is fundamentally a form of review, it raised fewer copyright concerns than the existing Google approach of exposing actual content. Additionally, it lent itself well to the argument that it was providing value to the publishers without in any way degrading their ability to profit from their own copyright, which is another point in Fair Use law.
It was my opinion that, with Google facing a halt order from the courts for lack of any applicable use of the content that did not expose it to users, this approach gave Google a legitimate ability to claim a real-world application that didn't carry nearly as many copyright concerns. Considering the resources that Google was then putting into its scanning project, a full-scale halt would have been a significant setback for them.
Additionally, I felt this technology offered a fairly decent "middle-ground" approach for publishers - it had the potential to directly drive sales of books, one of the claims Google has routinely used to get partner publishers on board, and did so in a non-scary way for the publisher. The value add - the carrot - of participating in a full-text scanning project that never gave away copyrighted material was stronger and less scary with this technology. I was also worried that public opinion would eventually turn against Google in the Google vs. Publishers argument, as I'd argue it has with the latest copyright battle and settlement challenges, if they didn't offer a progressive technology application that made it very, very obvious that what they were doing would help publishers directly.
Another consideration was Google's existing product lines. If you think about it, Google has a few primary product lines, and numerous smaller lines. Search, Video Search (at the time), Mail, Maps, and the like were their primary product lines. Documents, Reader, Finance, Scholar, and some of the others are less mainstream. And search still represented the biggest of the big, generating the vast majority of their revenue.
Of all their products, the one closest to moving up to primary status was Google Book Search. It's a darling of the media, close to the hearts of people who write news for a living. Yet it was missing some key functionality. For one, its usefulness for finding fiction titles was very lacking. Keyword search works wonders for non-fiction titles, but works far less effectively for fiction. When searching for Stephen King's It, what keywords do you search for? Clown? Monster?
These keywords would either return the wrong thing (Clown), or be covered by existing genre filters (monster/horror). Therefore, full-text search offered limited value to the fiction market. Yet if you look at the makeup of publishing, more than 50% of titles that reach the market each year are fiction. First, this left what I felt was a significant hole in their current offerings. Second, it also meant that Book Search didn't do what most people expected it to do in the first place, which was find their next Harry Potter. In general, Google Book Search leaves you lacking when looking for fiction.
There were many, many other considerations. Google is a technology company, and Amazon is more a product company that uses technology. I thought that Google's history and company culture were well suited for a project like CanGoogleHearMe, and Book Search there was considered important, but not their primary focus. It's very difficult for a large company like Amazon to change what is the lifeblood of their company. In their books sections, recommendations are a large part of what they do. In those circumstances, it's hard to mess with what is working, even if there is an argument that things could be improved. There's a great deal of risk involved. With Google, Book Search was not their lifeblood, and they'd demonstrated a willingness to roll out progressive technology products that later get promoted to significant status. In other words, the culture at Google was better suited for a development destined to impact a significant product.
And another big part was the culture itself. From what I'd read about Google, the internal hierarchy of the company allowed information to travel up and down very well. Therefore, the blanket approach of trying to find one champion inside of Google who would connect to the goals of the project and carry a message upward was much more likely to be successful at Google than at Amazon. Google's structure allowed a single advocate inside Google to find me interesting, and pass the message on until it connected with the right person to be taken seriously.
Not to mention that Google, being Google, was held in high public opinion, and was a company the public would respond to. I viewed what I was doing very much as... well, I used to paint it this way. Many people compared the quest to a David vs. Goliath thing, which it never was. Not to me, at least. I always saw Google as my ally, to whom I was trying to deliver a vital message. I saw it, perhaps grandiosely, as arriving late at a battlefield with a powerful weapon that could shift the tide of battle. Except you've arrived on the wrong side of the battle, behind enemy lines, and your ally - where you're trying to deliver the message or weapon - is on the far side of an ocean of obstacles. There's an ocean of forces aligned against your ally, who has built walls and defenses to protect itself. Only these walls now keep you out as well, and you have to figure out a way in. The goal is not to beat the ally, but to join them.
In some ways, it sounds very Lord of the Rings, doesn't it?
I'm aware of the potential cheese that statement contains, but it does describe the perspective I tried to adopt. Not as a conqueror, but as an ally that just NEEDED to make it through, to be heard.
It was important to me that my trip was received as an "I come in peace." And Google was the one I thought would receive it that way the best.
I'll cut it short there, because there's a lot more, much of it equally significant. The technology's application to many fields, not just book search, made development paths seem obvious. The size of the targeted corpus. Many things. Before writing up any documents for my initial meeting, I'd broken down every positive I could think of for the company - including positioning and such - and written it all up into a multi-page document for my own reference. A large part of my job, once I made it in the door, was to illustrate that these things were a good fit.
It was the only thing that kept me from just being some nut off the street.
Covering that entire document would probably be more detail than you're looking for.
Hope that was interesting.