Topic: So, Really, How Hard Can Book Recommendations Be?
Aaron Stanton
Project Manager
Core Team
« on: July 01, 2010, 08:44:11 pm »

It's easy to forget just how complex making a recommendation to someone can be.  Not just book recommendations, but recommendations of any kind.  What sort of food will they like?  What sorts of movies?  How about music?  There are so many variables that can influence a person's preference that it's extremely difficult to decide which to pay attention to.  And to make matters worse, it's different from person to person, so solutions don't always carry over to the crowd.

I often make the comparison between what we do and what Pandora.com does, largely in the sense that we use a "book genome" style approach in measuring the contents of a book and matching it to a user.  But to be absolutely honest, there's a world of difference between analyzing a song that lasts for, likely, about four minutes, and a book that will take a human hours to read.

For one, a bad music recommendation will likely be caught by the user in the first 30 seconds of the song, and at worst, you've cost them roughly four minutes of their lives if they're very persistent and don't turn it off halfway through.  With a book, the user won't figure out they've been misled until after they've gotten back from the library, bookstore, or favorite electronic distribution point, and have wasted at least an hour of their time.  People tend to be less forgiving in that circumstance.

Books are time-consuming.  With a song, at least, you can get a 100% sample in a short time, and judge the accuracy fairly straightforwardly.  Books are not this way.  Books have room for many, many more elements, and those elements can vary widely across a book in a way that vocals in a song probably do not.  For example, you may have two or three soloists singing in a song, one after another, but you're unlikely to have 12 of them, and when they show up, it's unlikely that one will want to sing at one tempo with a violin, and another only a cappella.  In contrast, it wouldn't be strange at all for a book to have twelve characters, each with their own voice, style, vocabulary, perspective, and setting.  Sometimes one character will be in a fantasy world, and another may be in the future, all in the same book.  :)

It makes book recommendation both a more challenging and a far more interesting problem to work with.  To be honest, I love the issues that come up when trying to figure out the right balance.  For example, not only do we measure a couple hundred average variables for each book, but we also have them for each scene.  So I ask you this: what's the most important aspect of the book to focus on when you're suggesting a book to a reader?  The first chapter?  The first third?  All of the book?  The ending?
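To make the question concrete, here is a minimal sketch of how the same per-scene measurements could produce different book profiles depending on which part of the book gets the weight.  Everything in it is an assumption for illustration (the function names, the two made-up variables, the one-third cutoffs); it is not how the BookLamp engine actually works.

```python
from typing import Callable, List, Sequence

# One scene = one value per measured variable (hypothetical layout).
Scene = Sequence[float]


def weighted_profile(scenes: List[Scene],
                     weight: Callable[[float], float]) -> List[float]:
    """Average per-scene variables, weighting each scene by its position.

    `weight` receives the scene's relative position in [0, 1) and
    returns a non-negative weight for that scene.
    """
    if not scenes:
        raise ValueError("a book needs at least one scene")
    totals = [0.0] * len(scenes[0])
    weight_sum = 0.0
    for i, scene in enumerate(scenes):
        w = weight(i / len(scenes))
        weight_sum += w
        for j, value in enumerate(scene):
            totals[j] += w * value
    if weight_sum == 0:
        raise ValueError("weights must not all be zero")
    return [t / weight_sum for t in totals]


def whole_book(pos: float) -> float:
    """Every scene counts equally."""
    return 1.0


def first_third(pos: float) -> float:
    """Only scenes in the first third of the book count."""
    return 1.0 if pos < 1 / 3 else 0.0


def last_third(pos: float) -> float:
    """Only scenes in the last third of the book count."""
    return 1.0 if pos >= 2 / 3 else 0.0


# Four scenes, two made-up variables: [action, romance].
scenes = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
opening_profile = weighted_profile(scenes, first_third)
full_profile = weighted_profile(scenes, whole_book)
ending_profile = weighted_profile(scenes, last_third)
```

In this toy book the opening looks like pure action, the ending like pure romance, and the whole-book average like an even mix, so the choice of weighting alone changes which readers the book would be matched to.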

There's a concept that we spend a lot of time working with here at BookLamp that we refer to as "Perception of Accuracy vs. Actual Accuracy".  You want both of these to be high, really.  Perception of Accuracy is the response you get when you hand a book to a reader, before they've read it, and then ask them if they find it interesting.  Now, at that point, no matter how good the inside of the book may be for that person, they'll never read it if they don't perceive your recommendation as being accurate.  If it's not the right theme, genre, book cover, whatever it may be; if they're not interested in picking it up, well, you've failed.

Actual Accuracy refers to the user's opinion of the book AFTER they've read it; this takes into account an entirely different set of metrics, including all the things that are found only after you start reading: characters, writing style, subtle plot points, etc.

As I said, you want both to be high.  You want to hand a book to a person, and have them expect to like it (perceived accuracy) enough to give it a try, and then actually like it enough to return once they're done.
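As a rough illustration of why both numbers matter, the two measures could be tallied from reader feedback along these lines.  The record layout and function names are hypothetical, chosen only for this sketch, and are not BookLamp's actual metrics.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Feedback:
    picked_up: bool        # reader expected to like it and gave it a try
    liked: Optional[bool]  # opinion after reading; None if never read


def perceived_accuracy(log: List[Feedback]) -> float:
    """Share of recommendations the reader was willing to pick up."""
    if not log:
        return 0.0
    return sum(f.picked_up for f in log) / len(log)


def actual_accuracy(log: List[Feedback]) -> float:
    """Share of books liked, among those that were actually read."""
    read = [f for f in log if f.liked is not None]
    if not read:
        return 0.0
    return sum(1 for f in read if f.liked) / len(read)


def success_rate(log: List[Feedback]) -> float:
    """Share of recommendations that were both picked up and liked."""
    if not log:
        return 0.0
    return sum(1 for f in log if f.picked_up and f.liked) / len(log)


log = [
    Feedback(picked_up=True, liked=True),
    Feedback(picked_up=True, liked=False),
    Feedback(picked_up=False, liked=None),  # never read: perception failed
    Feedback(picked_up=True, liked=True),
]
```

With this made-up log, three of four recommendations were picked up and two of the three books read were liked, but only half of all recommendations succeeded end to end.  A weak score on either measure drags down the overall result.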

So in the above context, do you:
A) focus on the first chapters of a book, assuming this is the first thing a reader will see when they start reading, and therefore the most important.
B) focus on the entire book, assuming that shifts in style, rise and fall of action, and everything else is important, and hope that the reader survives the first chapter and makes it to the middle.
C) none of the above.

The answer, actually, is that it varies by user and circumstance; you build systems to identify when one condition leads to a better result than the other.  All this to say that I'm amazed at the complexity of the BookLamp engine nowadays, only a tiny part of which makes it through to the public arena.  There was a time, at the start of this project, when the execution of our engine was entirely based on how my mind thinks.  Then we started getting people like Paul and Dan, whose backgrounds in mathematics, data analysis, and statistical systems are far greater than my own.  Add on the work of Dr. Jockers, our director of research, a linguistics professor at Stanford University who uses machine learning to perform author attribution, and what you have is a machine that performs at a higher level.

A level that I believe exceeds any individual's ability in the group, because the system is more powerful for the diversity of backgrounds that went into it.

The other day, we cleaned one of our main offices (something we do rarely), and in doing so had to clean off a number of whiteboards.  I couldn't help but feel a little awe looking at all the graphs, charts, histograms of genre breakouts, predictive models, thematic analysis, heat maps, character tracking, and other random doodles we had spaced around the room.  I felt like I was watering down secrets of the universe every time I sprayed the Windex.

I can't help but be a little impressed at the amount of complex, yet elegant, thinking that this team is throwing at the problem of how to make it easier for one person to find a book they'll like to read.

And more so, to do it all without that reader ever needing to understand any of the wizardry in the background that makes it possible.

Makes me proud that I helped start this.  :)

Aaron
Sarah Yon
Apprentice
« Reply #1 on: July 28, 2010, 07:28:43 pm »

Neat