tatsuya’s posterous

Social Relevancy Rank: What's Missing?

The future of search almost certainly involves social networks, social graphs, or social filtering in some capacity. Companies will live or die by whether they get the "social" part right: creating the right level of intimacy, trust, reliability, social connectedness, and accuracy in their results listings. Of course, this specifically means that their user experience must at least match or, preferably, exceed Google's.

To achieve this, we must first stop arguing over the different flavors of search.

Real-time search. Social search. Semantic search. These distinctions are essentially meaningless, especially when we can't even agree on definitions and when the boundaries of each remain undefined. Instead, we should recognize that they're all part and parcel of personalizing and contextualizing search for individual users. Let's stop playing the "name game" and start thinking holistically about how each (and all!) affects and improves what we think of today as "search."

Because the promise of social network integration with search is a current favorite topic, we'll focus in this post on that: a class of social search. This is also a response to the ideas brought up by Alex Iskold in his post on the future of search.

Alex proposes that we rank search results by a kind of Social Relevancy Rank, first displaying results from friends and people whom we follow and later displaying results from "taste neighbors" and influencers, etc. FriendFeed already filters results by your friends' content first. Twitter's Trending Topics, by contrast, shows the crowd's perspective. While one's personal social circle could improve the relevance of some search results (and I noted some months back that this is a promising model), this type of filtering is more challenging than it sounds.

First, as Alex points out, "trusted opinions are scarce." Our friends couldn't possibly know everything we're interested in, and the smaller our social circle, the worse the problem becomes. Even with large social graphs, sooner or later we will undoubtedly search for a topic that hasn't been indexed in our friends' activity streams, and then we'll get few to no results and suffer an inferior user experience. We'd be better off turning to good ol' Google... the very thing we're trying to best!

Secondly, getting Social Relevancy Rank right involves a lot of insight into what users care about. Alex comments that, "This is not difficult for FriendFeed to do because... it knows who you care about." But does it? On FriendFeed, I follow only a limited number of the people I actually care about. Do those people alone account for the things I care about? And when I perform a search, does the engine know what I'm caring about at that moment? True, we have to start somewhere -- as PageRank did -- and tweak the algorithm over time. But suggesting that even a smart Social Relevancy Ranking is clued in to what we care about at any given moment is presumptuous at best given the state of the art.

Yet, having different levels of social relevance is a good theory, and Alex's demarcations are sound, in essence. But each level more likely indicates a degree of social proximity than relevance per se, although in some cases closer proximity may very well indicate greater relevance. The problem is that relevance is highly contextual. It depends on many factors, such as your profession, your search query, your friends, your friends' knowledge about those topics, and the information that is publicly recorded in their activity streams.

For example, a financial analyst (i.e. an expert) wouldn't care if her closest circle of friends was Twittering about how complicated a new tax code is. As an expert, she'd rather know exactly how the new policies affect an edge-case client of hers. Filtering search results by "friends and following" at one end and "the crowd in aggregate" at the other may fail equally in uncovering the right piece of information for her.

For general users, the "it depends" factor may be the urgency with which information is needed. When the need is urgent, people will actively search for the information (in any number of ways); other times, information may be welcome but only encountered serendipitously or consumed passively. Browsing feeds, Twitter posts, and Facebook streams are all passive ways of discovering information. Putting these activities on a continuum in which information search is active but information discovery is passive could look like this:

But to actually achieve a "Social Relevancy Rank," we have to consider how layers of social proximity map onto this search-discovery continuum.

When people actively look for a piece of information (e.g. the best Barbary Coast Trail guide for tomorrow's hike), they likely require trustworthy, high-quality information that could at least inform their decision. "Friends and following" could serve as a reliable social filter at this stage. But as the urgency subsides (e.g. just poking around for a mint julep recipe a week before a get-together), we relax our requirements and even welcome a wider set of results. At this stage, filtering results by friends of friends, influencers, experts, and even crowds in aggregate is appropriate.

Of course, serendipitously discovering information from "friends and following" would be welcome in other instances. So, to actually improve social relevancy in search engines and discovery services, there would have to be a distribution of acceptable social filters whose levels depend on how active the user is and what the user is searching for:

What this still fails to address, though, is how to assess the urgency of a user's needs or how to derive that level of urgency from the user's known behavior. This is a problem that engineers, designers, and HCI researchers have been struggling to solve for a long time (and a million dollars will get you only so far).
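As a rough illustration of that mapping, here is a minimal Python sketch. It assumes the hard part is already solved, namely that we somehow have an urgency score between 0 and 1 for the current query. The layer names come from the discussion above; the function name `acceptable_filters` and the linear rule for widening the filter are hypothetical choices made for the example, not anything proposed by Alex or implemented by an existing service.

```python
from typing import List

# Layers of social proximity, nearest first (names taken from the post).
LAYERS = [
    "friends and following",
    "friends of friends",
    "influencers and experts",
    "crowd in aggregate",
]


def acceptable_filters(urgency: float) -> List[str]:
    """Return the social filters to apply for an urgency score in [0, 1].

    Hypothetical rule: the more urgent the need, the fewer (and closer)
    layers we accept; as urgency drops, wider layers are let in.
    """
    if not 0.0 <= urgency <= 1.0:
        raise ValueError("urgency must be between 0 and 1")
    # urgency 1.0 keeps only the closest layer; urgency 0.0 keeps them all.
    keep = 1 + round((1.0 - urgency) * (len(LAYERS) - 1))
    return LAYERS[:keep]


urgent = acceptable_filters(1.0)   # e.g. a trail guide for tomorrow's hike
relaxed = acceptable_filters(0.0)  # e.g. a mint julep recipe for next week
```

The result is always a prefix of `LAYERS`, so "friends and following" stays in the mix even when the user is only browsing, which matches the point that serendipitous results from close contacts remain welcome.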

The problem of effective search runs deep. You can have all the flavors you want -- social, real-time, semantic -- and tomorrow's flavor will be merely another riff on the same tune. Yes, social networks and the social graph have the potential to meaningfully filter millions of otherwise undifferentiated pages of results. But words like "meaningful" and "relevance" are so contextualized -- varying as they do from user to user and usage case to usage case -- that they can't be expected to mean anything unless they are anchored by context. Mapping social proximity to users' active and passive information consumption could help us create more contextualized user experiences on the social Web, resulting in less time spent naming the latest flavor of search and more time spent actually improving search.

Guest author: Brynn Evans is a PhD student in Cognitive Science at UC San Diego who uses digital anthropology to study and better understand social search.


Shareflow: Focused conversations with people that matter | Zenbe


Multi-Platform Media Sync Software DoubleTwist Gains “Hundreds Of Thousands Downloads”, Is Now Available in Japan

DoubleTwist, a universal media management desktop application for Macs and PCs, not only has a clever marketing team behind it but also seems to be something a lot of people have been waiting for. The free software, which works like a multi-platform version of iTunes with a social networking component, has been downloaded hundreds of thousands of times since it launched in February (exact numbers aren't disclosed for the time being).


Gnip - Delivering the Web's Data

Gnip is radically simplifying the way companies access and integrate the web’s data for use in social and business applications.


Hot, Hot, Hot! A Twitter Augmented Reality App for iPhone

Apple Doesn't Permit AR Apps

Unfortunately, TwittARound and all other augmented reality apps in development won't ever make it to the iTunes App Store because they're built using non-public APIs. Officially, Apple's iPhone SDK does not offer access to any APIs for manipulating live video, forcing developers to use the available but unsupported ones instead. That's a shame because, as you can see, there are a lot of unique concepts out there for implementing augmented reality on the iPhone.

In fact, this is one important area of development where Google's Android OS has the edge. Already, we've seen new AR Android apps like Layar come about - an app which could very well represent the future of augmented reality.

Even on the Nokia platform, AR is surging ahead. Earlier this year, for example, the creator of the "Heroes" TV show announced his upcoming AR app called "TEVA" which will use Nokia's video recording features in a new ARG mobile game.

Sadly, unless something changes at Apple, the AR developer community will simply move on to other platforms, leaving iPhone users behind. However, there's still hope. According to some hearsay out there, Apple is interested in enabling these types of apps. In other words, it could be a question of "when" and not "if." We only hope that they do so sooner rather than later.

Safari Visual Effects Guide: Transforms

CSS transform properties provide powerful formatting possibilities without having to resort to using images or Flash. Using the transform properties, elements can be translated, rotated, and scaled in 2D and 3D space. Perspective can also be applied to elements giving a sense of depth to the way they are rendered.

The CSS visual formatting model defines a coordinate system for each element. This coordinate space can be thought of as being expressed in pixels, starting in the upper-left corner of the parent with positive values proceeding to the right and down. In the basic formatting model, it is possible to set only the position and size of elements. Using CSS transforms, you can position elements in 2D and 3D space.

For example, Figure 1 shows a simple HTML document rendered using CSS transform properties. The "Now Version 2.0!" element is rotated around the z-axis. The "Lorem Ipsum" element specifies a 3D perspective for its child elements. The containing text element is rotated around the x-axis.
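To make the coordinate model concrete, the following Python sketch works through the arithmetic behind a 2D rotation about the z-axis followed by a translation, roughly what a declaration such as `-webkit-transform: translate(100px, 0) rotate(90deg)` does to a point. It is only an illustration of the math: the helper names `rotate_z` and `translate` are made up for this example, and it rotates about the origin of the coordinate space rather than modelling transform origins or perspective.

```python
import math


def rotate_z(point, degrees):
    """Rotate (x, y) about the origin by the given angle.

    With y growing downward, as in the CSS coordinate space,
    a positive angle appears clockwise on screen.
    """
    x, y = point
    t = math.radians(degrees)
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))


def translate(point, tx, ty):
    """Shift (x, y) right by tx and down by ty."""
    x, y = point
    return (x + tx, y + ty)


# A point 10px to the right of the origin is rotated a quarter turn,
# which moves it 10px below the origin, and is then shifted 100px right.
corner = translate(rotate_z((10, 0), 90), 100, 0)
print(corner)  # approximately (100, 10), up to floating-point rounding
```

The rotation is applied to the point first and the translation second, which is how a transform list reads when applied to coordinates from right to left.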


All the Web's a Database: Yahoo Extends YQL With Insert, Update, Delete


Last October, Yahoo announced the Yahoo Query Language, a language similar to the popular database language SQL. Then, this February, Yahoo also announced its first major product that made use of YQL, the Open Data Tables, which allowed developers to create their own table definitions besides the ones already provided by Yahoo. As we reported in March, Yahoo then went ahead and extended YQL with YQL Execute, which gives developers even more flexibility and basically turns the web into a giant database that can be processed and mashed up with YQL. Today, Yahoo announced that it has completed its set of YQL verbs with three more functions (INSERT, UPDATE, DELETE) that now also allow developers to not just read and manipulate data, but also write data back to other services.

Yesterday, we talked to Yahoo! Chief Technologist Sam Pullara (@spullara on Twitter) and Jonathon Trevor, the product lead for YQL. They specifically stressed that Yahoo was trying to stay as close to the SQL language as possible, as this would allow the largest number of developers to make use of YQL without having to learn yet another new language.

The Read/Write Web

While the earlier incarnations of YQL were mainly meant to read data, with the addition of these three new SQL verbs, the focus has now shifted towards writing data back to the net as well. Developers can now use YQL to write and modify data on web services and applications.

To explain how useful this can be, the Yahoo team used a few different examples. A developer can now easily use YQL to update a Twitter account (even authentication with OAuth is possible), for example, or add a new comment to a blog post, or insert any data into a remote database. Basically, developers can now use YQL to write data back to any web site that uses forms for data entry and to any API, including authenticated APIs.

To try this, here is an example from Yahoo (you will have to log in to the YQL console):

Try creating a new tweet from the YQL console by following this link, which runs the statements shown below: https://developer.yahoo.com/yql/console?q=use%20%27http%3A%2F%2Fwww.yqlblog.net%2Fsamples%2Ftwitter.status.xml%27%3B%20insert%20into%20twitter.status%20(status%2Cusername%2Cpassword)%20values%20(%22Playing%20with%20INSERT%20UPDATE%20and%20DELETE%20in%20YQL%22%2C%20%22twitterusername%22%2C%22twitterpassword%22)

use 'http://www.yqlblog.net/samples/twitter.status.xml';

insert into twitter.status (status,username,password) values ("Playing with INSERT, UPDATE and DELETE in YQL", "twitterusername","twitterpassword")
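The console link above is simply these two statements joined with a semicolon and URL-encoded into the `q` parameter. As a small sketch of how such a link could be assembled, the Python below rebuilds it with the standard library. The helper name `console_url` is hypothetical, the credentials are the same placeholders used in Yahoo's example, and the code only constructs the URL string; it does not contact the service.

```python
from urllib.parse import quote

# Console address as it appears in Yahoo's example link.
CONSOLE = "https://developer.yahoo.com/yql/console"


def console_url(*statements: str) -> str:
    """Join YQL statements with '; ' and percent-encode them into ?q=..."""
    query = "; ".join(statements)
    # Parentheses are left unencoded to match the example link;
    # spaces, quotes, commas, colons and slashes are percent-encoded.
    return CONSOLE + "?q=" + quote(query, safe="()")


url = console_url(
    "use 'http://www.yqlblog.net/samples/twitter.status.xml'",
    'insert into twitter.status (status,username,password) values '
    '("Playing with INSERT UPDATE and DELETE in YQL", '
    '"twitterusername","twitterpassword")',
)
print(url)
```

Swapping in a different `insert`, `update`, or `delete` statement would produce a corresponding console link in the same way.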

Pullara and Trevor also stressed that because Yahoo runs YQL on five datacenters spread over three continents (three in the US, one in Europe, and another one in Asia), executing commands through YQL is generally very fast. Yahoo also set some relatively generous rate limits for the service. Developers who use the service and who identify themselves with an access key can make up to 100,000 calls per day, while anonymous users are restricted to 1,000 calls per hour (at most 24,000 per day), which is still a pretty good number.


ReadWriteWeb Interview With Tim Berners-Lee, Part 2: Search Engines, User Interfaces for Data, Wolfram Alpha, And More...

In part 2 of my one-on-one interview with Tim Berners-Lee, we explore a variety of topics relating to Linked Data and the Semantic Web. If you missed it, in Part 1 of the interview we covered the emergence of Linked Data and how it is being used now even by governments.

In Part 2 we discuss: how previously reticent search engines like Google and Yahoo have begun to participate in the Semantic Web in 2009, user interfaces for browsing and using data, what Tim Berners-Lee thinks of new computational engine Wolfram Alpha, how e-commerce vendors are moving into the Linked Data world, and finally how the Internet of Things intersects with the Semantic Web.

Semantic Web and Search Engines Like Google, Yahoo

RWW: You've been talking about the Semantic Web for many years now. Generally the view is that Semantic Web is great in theory, but we're still not seeing a large number of commercial web apps that use RDF (we've seen a number of scientific or academic ones). However we have begun to see some traction with RDFa (embedding RDF metadata into XHTML Web content), for example Google's Rich Snippets and Yahoo's SearchMonkey. Has the takeup of RDFa taken you by surprise?

TBL: Not really, but the takeup by the search engines is interesting. In a way I was happy to see that, it was a milestone for those things to come out of the search engines. The search engines had typically not been keen on the Semantic Web - maybe you could argue that their business is making order out of chaos, and they're actually happy with the chaos. And if you provide them with the order, they don't immediately see the use of it.

"The search engines have not been keen on the Semantic Web [...] their business is making order out of chaos, and they're actually happy with the chaos."

Also I think there was misunderstanding in the search engine industry that the Semantic Web meant metadata, and metadata meant keywords, and keywords don't work because people lie. Because traditionally in information retrieval systems, keywords haven't proven up to the task of finding stuff on the Web. One of the reasons is that people lie, the other is that they can't be bothered to enter keywords. So keywords have gotten a bad reputation, then metadata in general was tarred with this 'keywords don't work' brush. Because a lot of Semantic Web data included metadata, then people thought that with Semantic Web data -- again, that people will lie and won't have the time to produce it.


Google rich snippets example; image credit: Matt Cutts

Now I think there's a realization that when you're putting data online, that people are motivated NOT to lie. For example when your band is going to produce its next album, or when your band is going to play next downtown, you're motivated to put that information up there on the Semantic Web. There's an awful lot of cases when actually data is really important to people; and it's on the web anyway. So I think it's great that some of the search engine companies are starting to read RDFa.

Does this mean that they [search engines] will start to absorb the whole RDF data model? If they do, then they will be able to start pulling all of the linked data cloud in.

"The web of linked data and the web of documents actually connect in both directions, with links."

Will they know what to do with it? Because when it's data in a very organized form, I think some people have been misunderstanding the Semantic Web as being something that tries to make a better search engine - i.e. when you type something into a little box. But of course the great thing about the Semantic Web is that you can query it, you can ask a complicated query of the Semantic Web, like a SQL query (we call it a SPARQL query), and that's such a different thing to be able to do. It really doesn't compare to a search engine.

You've got search for text phrases on one side (which is a useful tool) and querying of the data on the other. I think that those things will connect together a lot.

So I think people will search using a search text engine, and find a webpage. On the front of the webpage they'll find a link to some data, then they'll browse with a data browser, then they'll find a pattern which is really interesting, then they'll make their data system go and find all the things which are like that pattern (which is actually doing a query, but they'll not realize it), then they'll be in data mode with tables and doing statistical analysis, and in that statistical analysis they'll find an interesting object which has a home page, and they'll click on that, and go to a homepage and be back on the Web again.

So the web of linked data and the web of documents actually connect in both directions, with links.
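To illustrate the difference between typing a phrase into a box and querying data, here is a toy Python sketch of pattern matching over subject-predicate-object triples, which is the basic idea behind a SPARQL query. It is not a SPARQL engine and uses no RDF library: the data, the identifiers such as `band:A`, and the helper names `match` and `query` are all invented for the example, loosely following the "band playing downtown" case mentioned above.

```python
# Toy data: (subject, predicate, object) triples. All values are invented.
TRIPLES = [
    ("band:A", "type", "Band"),
    ("band:A", "playsAt", "venue:Downtown"),
    ("band:B", "type", "Band"),
    ("band:B", "playsAt", "venue:Uptown"),
    ("venue:Downtown", "city", "Springfield"),
]


def match(pattern, triple, bindings):
    """Unify one pattern with one triple; terms starting with '?' are variables.

    Returns the extended bindings, or None if the triple does not fit.
    """
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Bind the variable, or check it agrees with an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def query(patterns, triples=TRIPLES):
    """Return every set of variable bindings that satisfies all patterns."""
    results = [{}]
    for pattern in patterns:
        results = [
            extended
            for bindings in results
            for triple in triples
            if (extended := match(pattern, triple, bindings)) is not None
        ]
    return results


# "Which bands play at a venue in Springfield?" Roughly what a SPARQL query
# with three triple patterns sharing ?band and ?venue would ask.
answers = query([
    ("?band", "type", "Band"),
    ("?band", "playsAt", "?venue"),
    ("?venue", "city", "Springfield"),
])
print(answers)
```

A keyword search for "Springfield" over the same data would only find the venue record; the structured query follows the links from venue to band, which is the kind of question a text box cannot express directly.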

User Interfaces for Semantic Content

RWW: At the recent SemTech conference, Tom Tague of Thomson Reuters' Calais project suggested that user interfaces for semantic content are key in getting more take-up. With that in mind, I wonder if you've seen some great interfaces or designs for semantic applications in recent months - if so which ones and why did they impress you?

TBL: I think that whole area is very exciting at the moment. The only piece of hacking I've done over the past few years has been on a thing called the Tabulator [a data browser and editor], which is addressing exactly that. Partly because I wanted to be able to look at this data. And now there are lots of different ways that people need to be able to look at data. You need to be able to browse through it piece by piece, exploring the world of data. You need to be able to look for patterns of particular things that have happened. Because this is data, we need to be able to use all of the power that traditionally we've used for data. When I've pulled in my chosen data set, using a query, I want to be able to do [things like] maps, graphs, analysis, and statistical stuff.


W3C Tabulator, a data browser/editor; Image credit: wiwiss.fu-berlin.de

So when you talk about user interfaces for this, it's really very very broad. Yes I think it's important. There's also the distinction we can make between the generic interfaces and the specific interfaces.

There will always be specific interfaces; for example if you're looking at calendar data, there's nothing else like a calendar that understands weeks, months and years. If you're looking at a genome, it's good to have a genetics-specific user interface.

"I want to be able to do maps, graphs, analysis, and statistical stuff."

However you also need to be able to connect that data, through generic interfaces. So if my genome data was taken during an experiment which happened over a particular period, I need to be able to look at that in the calendar - so I can connect the genetics to the calendar.

So one of the things I hope to see is domain-specific things for various different domains, and the generic user interfaces. And hopefully the generic interfaces will be able to tie together all of the domains.


Who Uses Social Networks and What Are They Like? (Part 1)

A new study by Anderson Analytics looks into the demographics and psychographics of social networking users on Facebook, MySpace, Twitter, and LinkedIn with a goal of providing marketers with information about users' interests and buying habits as they relate to their network of choice. The end result is a detailed look at the profiles and habits of social networking users on the web today.

Some of the study's findings echo things we've already heard. For example, Facebook users tend to be old, white, and rich. MySpace users are young...and fleeing. Other info is new: Twitterers are more likely to have a part-time job, LinkedIn users like to exercise and own more gadgets.

The Anderson study sampled over 11,000 GreenfieldOnline panelists (an online survey community) over an 11-month period to understand the reach and overlap of social networking services (SNS) among the U.S. online population. In May, the company surveyed an additional 5,000 panelists, of whom over 1,250 participated in an in-depth attitude and usage survey. They then grouped the participants into two categories: those who use social networks and those who don't. To be considered a social network user, the participant had to have used one of the sites in question in the past 30 days.


Security Guru Calls Chrome OS's Security Claims "Idiotic"


Noted security guru Bruce Schneier, chief technologist at BT, has scoffed at Google's claims about its new OS, just announced yesterday. According to the Google blog post, Chrome OS represents a complete redesign of the underlying security architecture of the OS "so that users don't have to deal with viruses, malware, and security updates." A bold statement to say the least...and apparently one Schneier doesn't think too much of. "It's an idiotic claim," he says.

In a Yahoo News story, it's reported that Schneier isn't completely buying Google's promises. "It was mathematically proved decades ago that it is impossible -- not an engineering impossibility, not technologically impossible, but the 2+2=3 kind of impossible -- to create an operating system that is immune to viruses."

That seems to us like he's picking on the semantics of Google's statement just a bit. Google says that users "won't have to deal with viruses," and Schneier is noting that it's simply not possible to create an OS that can't be taken down by malware. While that may be the case, it's likely that Chrome OS is going to be arguably more secure than the other consumer operating systems currently in use today. In fact, we didn't take Google's statement to mean that Chrome OS couldn't get a virus EVER, we just figured they meant it was a lot harder to get one on their new OS - didn't you?

Even Schneier himself admits that an OS redesign which takes security into account "all the way up and down" could make for a more secure OS than the ones available today. However, that's different from saying that users won't have to deal with malware, he added.

Carl Leonard, security research manager of Websense EMEA, also shares Schneier's beliefs. "All software is susceptible to issues - it just depends on how much effort the malware author wants to go to and how much profit can be made," he said. "Already we have seen vulnerabilities and issues with the Chrome browser, and Google even ran a contest in which two well-known security researchers found 12 exploitable security flaws in the company's Native Client system."

OK, we get it: Chrome OS can get malware...technically speaking. But won't it get less of it?

Forrester Research analyst Andrew Jaquith, on the other hand, has more positive things to say about Google's new OS. He notes that the company has made strong security strides through its Native Client code technology and Chrome web browser, which includes features such as "sandboxing" which could help contain malware. "If [Google] brings that kind of thinking to the operating system and looks at it from a clean sheet of paper, they should be able to introduce some significant improvements," he said.

Do you think the security community is making a mountain out of a molehill when it comes to Google's security claims? Or do you think they were right to point out that no OS is invulnerable to attack?
