Google Search Problems ＆ Alternative Search Engines
There seems to be a lot of heat about the quality of Google search going down. For example:
- Ask HN: do you feel Google search result quality has gone down? Source news.ycombinator.com
- Giving up on Google @ http://robsheldon.com/giving-up-on-google
- Giving up on Google @ http://news.ycombinator.com/item?id=1688588
- Google is turning into Cuil. Since when did quotes mean “Change everything in the quotes or ignore it freely”. Makes searching for any programming related stuff hell. @ http://www.reddit.com/r/programming/comments/9s6p2/google_is_turning_into_cuil_since_when_did_quotes/
I feel the same, though i am not sure that other search engines are better. This year and last, i've occasionally tried Microsoft's revamped search engine “bing.com”. I don't think it's better. Still, it is true that i am not able to find the info i want on Google as easily as in, say, the early 2000s. There were quite a few times in recent years that i spent 30 minutes or even hours and was unable to find info that i think should be there. Often, Google will show results from a bunch of large-scale commercial sites that swipe info from other online forums and put it on their own site, with bad formatting and many disruptive, annoying ads.
I think a large part of the problem is SEO, and Google actually encourages it, even to the degree of officially supporting the term and the practice. (Google has a huge amount of documentation on how to do SEO. See: Why Does Google Give SEO Advice?)
Since about 2006, when i need some info, my first stop is to find it in Wikipedia, then possibly follow links from there.
Alternative Search Engines
I've also noticed quite a few upcoming search engines this year. Here's a list of alternative search engines i've used.
- http://blekko.com/ (new)
- http://www.bing.com/ (Microsoft)
- http://a9.com/ (Amazon)
- http://duckduckgo.com/ (new)
- Yahoo! News at http://news.yahoo.com/
- Bing News at http://www.bing.com/news
- Google News at http://news.google.com/
Blog search: http://blogsearch.google.com/ (i never found Technorati useful.)
For some juicy info about search engines, see: Web search engine. Also, check out this Wikipedia List of search engines.
Google Search Sucks
- Search Still Sucks By Michael Arrington. @ http://techcrunch.com/2011/02/12/search-still-sucks/
- The Dirty Little Secrets of Search By David Segal. @ http://www.nytimes.com/2011/02/13/business/13search.html?_r=1&pagewanted=all
The NY Times article is a well-researched piece, written for the general public (as opposed to those of us familiar with the subject). Here are some juicy bits for those of us in the SEO community:
- J C Penney was ranked as the number 1 result in Google search for lots of keywords for the past several months.
- There are 2k+ paid links from random low-ranking sites to J C Penney.
- http://www.opensiteexplorer.org/ is an online tool specifically designed to find the backlinks of any website.
- Google took action against J C Penney only AFTER this article was published.
- J C Penney was using searchdex.com for SEO. They are now fired.
- J C Penney denies knowingly using paid links.
- Daniel Ruby of Chitika (an online ad broker) reports that being #1 in Google search results gets 34.35% of the clicks, while being #2 gets half as many. See: 〔The Value of Google Result Positioning By Daniel Ruby. @ http://insights.chitika.com/2010/the-value-of-google-result-positioning/〕
- One of the paid links to Penney was from the site cocaman.ch. Its owner discloses details about how and why. (He was using the TNX.net link broker.) See: 〔“The Dirty Little Secrets of Search” – Additional Information By Corsin Camichel. @ http://cocaman.ch/wp/2011/02/the-dirty-little-secrets-of-search-additional-information/〕
- AT＆T, eBay, J C Penney are among the largest advertising clients of Google. JCP spent $2.46 million a month on paid Google search ads (not related to the backlinks.)
- The European Union is investigating possible antitrust abuses by Google.
- Google, as usual, categorically denies that a business relationship with Google brings any benefit in search results. (which i do believe. See: Google Ice Cream; Can Google Be Trusted?)
Example of Link Gaming
Here's an example of a site with irrelevant, random, embedded links.
The site is 〔http://www.dcphpconference.com/〕. The site is a copy of an article from Wikipedia about the PHP programming language, but with tens of irrelevant embedded links added in random places that don't make sense. Essentially, any human reading that page will immediately see it's nonsense.
Note that the site is Google ranked at 5, which is fairly high. (You can see the ranking in the upper right corner of the screenshot above.) However, its traffic rank by Alexa.com is “1,945,674”. This means nobody's reading it. Alexa reports 50 links to the site, which is high.
To give some context, good articles on my website xahlee.org are Google ranked at 5 at most, and my site has about 7k visitors per day and is ranked by Alexa at about 70k worldwide.
Google's Matt Cutts said the site will be removed from its index. (Also, the site has changed as of )
Demand Media Content Farm
Wondering why Google Search is getting worse and why there's so much poorly written crap on the web? Thanks to Demand Media. They are a so-called “content farm”: hire cheap writers, pay them something like $15 for a few hundred words, then spam the web, churning out 5k articles per day. What are some of their sites? eHow.com, Answerbag.com, Livestrong.com, AllSands.com, WebGuru.com, happynews.com, writeforcash.com, ExpertVillage.com, essortment.com ….
〔Demand Media's Planet of the Algorithms: Fresh off its IPO, Demand Media is blanketing the Web with answers to millions of questions you didn't know you had. Is that a business? By Felix Gillette. http://www.businessweek.com/printer/articles/54584-demand-medias-planet-of-the-algorithms〕
Distributed Open Source Search Engines
Nathan Staudt wrote to inform me about http://www.peer-search.net/, which he created. It's actually interesting, and it's the first time i've heard of the concept of peer-to-peer search engines. It's based on YaCy, which is a free distributed search engine.
Decentralized systems are in general more robust and evolve fast. This could be the future of search engines.
I tried peer-search.net on my name “Xah Lee”, and it's quite interesting. It shows many pages that contain my name, reminiscent of Google search results in the early 2000s. In the past, Google search was quite useful for finding pages that contain some exact string. For example, if i wanted to know all pages where i left a comment, or sites that collect online forum posts, or any page that mentions my name, Google search in the early days would show them. The exact string can be a computer error message, a product model name, or a phrase in a chat room that you are trying to locate. Quite effective.
But today, Google search has “smartened up”. It is no longer able to show the pages where an exact phrase shows up, even if you double-quote your phrase "like this". Instead, it tries to interpret your search as if a human were asking a question about that phrase. For example, if i search “Xah Lee” now, i'll get pages that contain info about me: my website, my blogs, my Google profile, my Facebook, my Twitter, my Yahoo pages, etc. The hundreds of other pages that merely contain the string “Xah Lee” are not returned in the result, even if you go to page 20 or so.
Is this smartening up good or bad? It is probably more useful to the majority of people most of the time. But it also makes things hard in the many situations where you really just need to find pages that contain a particular string exactly, as if searching a file system or database.
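To make the difference concrete, here is a minimal Python sketch. It is only an illustration, not how Google, YaCy, or any real search engine works. The page URLs and texts in PAGES are made up, and the two function names are hypothetical. The first function searches for an exact string, the way you would search a file system; the second matches the query words anywhere on the page, which is one simple way a search can drift away from the literal phrase.

```python
import re

# Hypothetical miniature corpus: page URL -> page text (made up for illustration).
PAGES = {
    "example.org/profile": "Xah Lee's home page, with essays on programming.",
    "example.org/forum": "Comment posted by Xah Lee: the error was 'segfault at 0x0'.",
    "example.org/other": "Lee wrote about Emacs; Xah replied later.",
}


def exact_phrase_search(phrase, pages):
    """Return URLs of pages whose text contains the phrase verbatim (case-sensitive)."""
    return sorted(url for url, text in pages.items() if phrase in text)


def keyword_search(query, pages):
    """Return URLs of pages containing every query word, in any order or position."""
    words = [w.lower() for w in re.findall(r"\w+", query)]
    hits = []
    for url, text in pages.items():
        tokens = {w.lower() for w in re.findall(r"\w+", text)}
        if all(w in tokens for w in words):
            hits.append(url)
    return sorted(hits)


if __name__ == "__main__":
    print(exact_phrase_search("Xah Lee", PAGES))
    print(keyword_search("Xah Lee", PAGES))
    print(exact_phrase_search("segfault at 0x0", PAGES))
```

With this toy data, the exact search for “Xah Lee” skips the page where the two words appear separately, while the keyword search returns it. The exact search is also the one you want for something like an error message, where every character matters.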