
October 25, 2008

eMetrics: Search From Now On

By Li Evans

The last session of the eMetrics Marketing Optimization Summit in Washington DC that I attended before hitting the road and heading back north to Philly was on the Acquisition Track, entitled "Search From Now On". None other than my great friend Mike Grehan of Acronym Media was presenting. If you didn't know, Mike is writing his third book on search marketing, as well as a white paper about search engines and their new listening signals.

A Bit of Search History

Mike starts off by showing a slide with a quote from Vannevar Bush, then summarizing what's on the screen: "information can become lost and scattered all over the place, and it would be great to put it all together."

"As We May Think", is a piece that Vannevar Bush wrote that questions, instead of making the weapons of mass destruction, couldn't we instead create something great for mankind? 1945 Bush invented the fax machine, computer and the internet, MEMEX - is really the world wide web.

Bush argues that as humans we should turn our scientific efforts from increasing physical ability to making all previously collected human knowledge more accessible. Now take a look at Google today. Google's mission is to organize the world's information and make it universally accessible and useful.


In 1989, Sir Tim Berners-Lee created the World Wide Web (I made a mistake here originally; Mike said World Wide Web, not internet). "I just had to take the hypertext idea and connect it to the Transmission Control Protocol and Domain Name System ideas and ... ta-da ... the World Wide Web." In the space of 10 minutes he invented the World Wide Web.

Information Retrieval on the Web - Phase One

Data collection is carried out by a web crawler assigned to download web pages and parse the text into an index. With this method, though, it's difficult to tell which pages are the most authoritative documents in a corpus of millions purely by analyzing the similarity of text. Too many documents are relevant to the query, which creates what's known as the "abundance problem". Then there's the issue of it being way too easy to manipulate the results.

Take the following example: if a music student writes a paper on Beethoven's 5th symphony, and a conductor does too, who's more relevant? The conductor, obviously, but by just looking at the text, there's no way for a search engine to really tell.
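To make the text-only problem concrete, here's a minimal sketch of phase-one scoring: TF-IDF term weighting plus a simple match score. The documents, query and weights are my own illustrative assumptions, not anything from Mike's slides.

```python
# Toy illustration of "phase one" retrieval: documents ranked purely by
# text similarity to the query.  Everything here is made-up example data.
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Build simple TF-IDF vectors for a list of tokenised documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({term: (count / len(doc)) * math.log(n / df[term])
                        for term, count in tf.items()})
    return vectors

def score(query_terms, vec):
    """Sum the weights of matching query terms, normalised by document weight."""
    matched = sum(vec.get(t, 0.0) for t in query_terms)
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return matched / norm

corpus = {
    "conductor_essay": "beethoven fifth symphony interpretation tempo orchestration".split(),
    "student_paper":   "beethoven beethoven fifth fifth symphony symphony notes".split(),
    "weather_page":    "weather forecast rain".split(),
}
vectors = dict(zip(corpus, tf_idf_vectors(list(corpus.values()))))

query = ["beethoven", "fifth", "symphony"]
for name, vec in sorted(vectors.items(), key=lambda kv: -score(query, kv[1])):
    print(f"{name}: {score(query, vec):.3f}")
# The page that simply repeats the query words outscores the conductor's essay:
# text similarity alone can't tell the engine which author is the real
# authority, and it is trivially easy to manipulate.
```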

Information Retrieval on the Web - Phase Two

In 1998, Jon Kleinberg did a search on "search engines" on AltaVista, and AltaVista didn't show up in the results. Then he looked at "Japanese automotive maker", and neither Toyota nor Honda appeared in those results. He wondered why, and found that in neither case did those words actually appear on the pages themselves. This is when he realized that the important words weren't necessarily on the page - they were in the links pointing to it.

Network theory applied to link analysis provides a major new signal based on hubs and authorities. Google develops PageRank based primarily on citation analysis (a subset of network theory). Link anchor text provides context for latent semantic analysis. However, the ranking mechanism is biased towards web content creators, not end users. It's also possible, just as with crawls analyzing content, to artificially inflate link data - it's easy to manipulate.
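Since PageRank does the heavy lifting in that paragraph, a tiny power-iteration sketch of the idea may help. The toy link graph, damping factor and iteration count below are illustrative assumptions; this is the textbook algorithm, not Google's production system.

```python
# Minimal PageRank power iteration over a toy link graph.
# The graph, damping factor and iteration count are illustrative assumptions.
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:                      # dangling page: spread rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

toy_web = {
    "home":   ["about", "blog"],
    "about":  ["home"],
    "blog":   ["home", "about"],
    "orphan": ["home"],            # links out, but nothing links to it
}
for page, r in sorted(pagerank(toy_web).items(), key=lambda kv: -kv[1]):
    print(f"{page}: {r:.3f}")
# "home" accumulates the most rank because the most (and best-ranked) pages cite it,
# which is exactly the citation-analysis intuition behind the link signal.
```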

About 1993-1995, everything was based on links. Mike wasn't even looking at the words on the page - just the words in the link. It stopped being about quantity and became about quality. Link building is about getting great links from within your community. The strongest signals search engines looked at, up until recently, were the text on a page and the links pointing backwards and forwards.

The Taxonomy of Search


Knowing the taxonomy of search is very important for doing keyword research and for understanding the intent behind the keywords being used.

  • Informational
    This applies to the surfer who is really looking for factual information on the web.

  • Navigational
    Navigational is when a surfer really wants to reach a particular web site.

  • Transactional
    Transactional means that ultimately the surfer wants to do something on the web, through the web. Shopping is a good example: you really want to buy stuff.

Understanding the user's intent is what matters most when it comes down to it. For example, a bank looking at keywords might think "Lend Money" is its most important keyword, since lending is what it's in business to do. However, that's not how the end user sees it; the user wants to "Borrow Money".
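As a rough illustration of how you might bucket a keyword list against this taxonomy, here's a simple heuristic sketch. The cue words, the hypothetical brand list and the sample queries are all my own assumptions, not part of Mike's talk.

```python
# Rough heuristic bucketing of queries into the informational / navigational /
# transactional taxonomy.  Cue words, brands and queries are illustrative only.
TRANSACTIONAL_CUES = {"buy", "order", "download", "apply", "borrow", "quote", "cheap"}
INFORMATIONAL_CUES = {"what", "how", "why", "history", "guide", "tips"}
KNOWN_BRANDS = {"facebook", "wikipedia", "acmebank"}   # hypothetical brand list

def classify_intent(query: str) -> str:
    terms = set(query.lower().split())
    if terms & KNOWN_BRANDS:
        return "navigational"       # the surfer wants a specific site
    if terms & TRANSACTIONAL_CUES:
        return "transactional"      # the surfer wants to do something
    if terms & INFORMATIONAL_CUES:
        return "informational"      # the surfer wants facts
    return "informational"          # default bucket when no cue matches

for q in ["borrow money", "acmebank login", "history of cookies", "how do mortgages work"]:
    print(f"{q!r} -> {classify_intent(q)}")
# A real keyword research workflow would refine these buckets with search volume,
# SERP features and conversion data, but the intent question is the same.
```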

The New Search Signals

Early information retrieval techniques were limited to two major signals: text and links. That makes them very susceptible to the dubious intentions of content creators. As the web grows exponentially, more content is created than can be collected by a web crawler; therefore too many relevant pages fall outside the scope of the search engine crawlers.

Most searches are non-commercial; for example, a search on "History of Cookies" gives back no ads. But take a look at a search since Google implemented Universal Search over a year and a half ago. There's "vertical creep", which drops the natural search results below the fold and makes being "in the top 10" not matter any more. "Ranking reports" don't really matter any more, especially if you are now below the fold!
Universal search has changed everything; the "golden triangle of search" is changing. The minute images are put in the results, the eyes dart all over the place.


Social media is a new signal for search engines to learn relevance from. Textbook SEO is going to eventually disappear, and crawling the web will become a backfill for the search engines. Text on an HTML page, linkage data and link anchor text, along with social media signals - tagging, bookmarking, rating, etc. - are now becoming the signals. However, another huge new signal is the use of the Google Toolbar, and now the recently introduced browser, Google Chrome.
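Nobody outside the engines knows how these signals are actually weighted, but as a purely hypothetical sketch of what blending text, link, social and usage signals into a single relevance score could look like, consider the following. Every signal name, value and weight here is invented for illustration.

```python
# Purely hypothetical blend of ranking signals.  The signal names, values and
# weights are invented for illustration; no search engine publishes this.
from dataclasses import dataclass

@dataclass
class PageSignals:
    text_score: float      # classic on-page text relevance (0-1)
    link_score: float      # link / anchor-text authority (0-1)
    social_score: float    # tags, bookmarks, ratings (0-1)
    usage_score: float     # toolbar / browser visit data (0-1)

WEIGHTS = {"text_score": 0.35, "link_score": 0.35, "social_score": 0.15, "usage_score": 0.15}

def blended_relevance(signals: PageSignals) -> float:
    """Weighted sum of the individual signals."""
    return sum(getattr(signals, name) * w for name, w in WEIGHTS.items())

old_school_page = PageSignals(text_score=0.9, link_score=0.8, social_score=0.1, usage_score=0.2)
well_shared_page = PageSignals(text_score=0.6, link_score=0.6, social_score=0.9, usage_score=0.9)

print(f"old-school SEO page: {blended_relevance(old_school_page):.2f}")
print(f"widely shared page:  {blended_relevance(well_shared_page):.2f}")
# Once social and usage signals carry any weight, a classically optimized page
# is no longer guaranteed to win over one that people share and actually visit.
```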

Since the rise of social media, content creators such as copywriters for corporate web sites and online publishers are outweighed by at least a factor of five by user-generated content such as blog posts, forum posts, ratings, reviews, etc. End users want a much richer experience at the search engine interface: more color, more images, more choice.

Connecting end users with the content they're looking for may not be achievable the way Google and other search engines have attempted it, by building the current repository of signals. Too much information is now beyond their reach. New relationships between content creators and search engines need to emerge to cater to the demand for the many different types of information that end users crave.

Is HTTP the right platform?

Mike wraps up the session by questioning whether the current way we are viewing all of this information is the "right way". He's said to me a few times in other conversations, "What we are doing now is like trying to shove a giant elephant through a tiny hole." Changes to the protocol, HTTP, are going to have to come, since it was built nearly 20 years ago!

"As content becomes more diverse, more complex, bigger and more fragmented, getting it through HTTP and HTML may not be the right model anymore."  - Andrew Tomkins, Vice President of Search Research at Yahoo!

Mike's new white paper is coming out soon: "New Signals to Search Engines, Future Proofing Your Search Marketing Strategy".

Comments

Interesting information. This starts to get into the whole Web 3.0 thing, I think: the idea of the Semantic Web. According to many authorities, the web is taking a turn for the better, in that people want real, usable information that is easily found and searched.

Google seems to be at the forefront of this, along with others working on social networking and other social aspects of the internet.

It's no longer just about getting information, though; it's about being able to interact with it in some way.

It's wrong to say that Tim Berners-Lee invented the internet, because DARPA actually did that in 1969 with ARPANET.

We can say he invented the www if you like - just not the internet. A lot of people contributed to that like Larry Roberts, Vint Cerf and Bob Kahn, Radia Perlman...

@CJ thanks for commenting. I made a slight mistake; Mike explained that Tim Berners-Lee created the World Wide Web, not the internet - as you said, that was created by DARPA. I've fixed the post above to reflect that!

Thank you :)

Interesting information about the new search engines. Google is the best search engine.
jennifer


Copyright 2006, 2007, 2008 SearchMarketingGurus.com