Tuesday, April 29, 2008

Doogle (Google + Digg)

The Internet is an unbelievably powerful tool. Never before in human history has information been so universally accessible. What’s more, we’ve figured out ingenious ways to categorize and sort this flood of knowledge. The Internet probably gains millions upon millions of web pages each day. To give a small idea of how fast the Internet is growing, YouTube.com, a video-sharing site in which videos are uploaded by users, is estimated to have 825,000 videos uploaded daily.

This is just a single site in the huge expanse that is the Internet. Yet, rather than sputtering to a slow, painful death, the Internet collects and organizes this information promptly and logically. The primary method developed to deal with categorizing the Internet has been the search engine. After doing some research, I learned that search engines have been around for quite a long time.

I found out that one of the first search engines was called “Archie,” and it was launched in 1990 by a student at McGill University (Source). It worked by downloading the directory listings of files found on public anonymous FTP sites (more or less the Internet of the early ’90s) and indexing them. This created a searchable index of file names; however, it didn’t organize what was in those files. In later years, web search engines like “Aliweb” were developed, in which people would submit their sites to the database in order to make them searchable for Internet users (in some ways, this is similar to Web 2.0 sites like Digg and Reddit, but those sites now have a rating system to eliminate the garbage from your search).

Later still came the search engines that some may consider “regular” search engines: AltaVista, WebCrawler, Ask Jeeves, and Excite. These systems worked very well as a portal to the hundreds of thousands of websites that you may have been searching for. In the early days of Internet search engines, there was a very simple formula for cataloging results: search engines would order web pages by how frequently the keyword you were searching for appeared on each page.

Therefore, if you entered the keyword “dog,” the search engine would return results so that the page with the highest frequency of the word “dog” on it would show up first, followed by the page with the next highest, and so on. This worked pretty well at the beginning and was one of the things that Yahoo! did incredibly well. This helped them gain a large share of the search market. However, as the Internet steadily grew, people needed to develop better ways to organize their search results so that information could be found more quickly and easily. That is when Google developed its unbelievably brilliant, yet simple, formula. Larry Page and Sergey Brin, the founders of Google, developed an algorithm known as PageRank, aptly named after Larry Page.
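To make the old keyword-frequency approach concrete, here is a minimal sketch in Python. It is only an illustration of the idea described above, not how any real search engine was implemented; the page URLs, the page text, and the function names are all made up for the example.

```python
import re
from collections import Counter


def keyword_frequency(text, keyword):
    """Count how many times `keyword` appears as a whole word in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)[keyword.lower()]


def rank_by_frequency(pages, keyword):
    """Return the URLs that mention the keyword, most mentions first."""
    scored = [(keyword_frequency(text, keyword), url) for url, text in pages.items()]
    # Sort by descending count, breaking ties alphabetically by URL.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for count, url in scored if count > 0]


# Hypothetical pages, invented for this example.
pages = {
    "a.example": "Dog food and dog toys for every dog.",
    "b.example": "My dog likes the park.",
    "c.example": "Cats are great.",
}

print(rank_by_frequency(pages, "dog"))  # ['a.example', 'b.example']
```

The weakness is easy to see: a page can climb to the top simply by repeating the word “dog” more often than anyone else.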

I am going to attempt to explain PageRank as if I were talking to a little child, because there almost isn’t enough Internet to fully comprehend it. Quite simply, it uses the entire Internet to rank the validity of websites while also using the old keyword frequency cataloging technique. Google’s algorithm is essentially grounded in the theory of referencing, which is quite an old concept if you think about it. Google reasoned that if other websites reference your site (by means of a hyperlink), this is essentially a way of saying, “this site knows what they are talking about. I believe in them and you should too.”

So, the more websites that reference your website, the more people seem to agree that your information is reliable and warranted, and thus the higher your rating should be when you are searched for in a search engine. The reason that this is an old concept is that it is very similar to citations used in books or scholarly documents. If a lot of other sources cite a particular source, it is generally accepted as something that is pertinent to that particular topic.
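The “links as citations” idea can be sketched in a few lines of Python. This is a simplified, textbook-style version of PageRank, not Google’s actual implementation; the damping factor of 0.85 is the commonly quoted default, and the tiny four-page link graph is invented for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: `links` maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it references.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: a, b, and d all "cite" c, and c cites a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
best = max(ranks, key=ranks.get)
print(best)  # c
```

Page c comes out on top because three other pages reference it, and page a ranks second because it is referenced by the highly ranked c. A reference from a well-cited source counts for more, just as with scholarly citations.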

Think about physics. Tons of studies cite evidence relating to Albert Einstein or Stephen Hawking, because these guys have discovered information that is at the core of the subject. Google works in a very similar way. It uses the entire Internet to rank its catalog of billions of web pages. But just as Google improved upon a system that people thought was good enough as it was, I think that there is potential for improvement in the future.

My idea stems from Google and Digg. Google is a brilliant search engine that works almost flawlessly, and Digg is a fantastic Web 2.0 site that uses the voice of Internet users to promote amazing things and sweep away stupid things. Ultimately, I propose a search engine that keeps the passive system Google uses now, ranking websites through hyperlinks/referencing, and adds a voting system in which people can vote on how well the website they found when searching actually met their needs. This is a far more active and “Web 2.0” approach. I think that this could be unbelievably effective, because it can move the third result on the page to the top of the page when that is the one people actually wanted, which ends up saving an enormous amount of time in the long run.

Testing for Realism:

  • Is it well accepted by a particular target demographic?
It can be. When I asked Yahoo! Answers what some of the “barriers to entry when creating your own search engine to compete with Yahoo and Google” are, the only response that I was given was, “well…you can’t even enter the market…” I found that quite amusing. It’s true that the search engine market seems quite saturated at the moment, especially when you consider that goliaths like Microsoft can’t even get into the race between Yahoo! and Google.

However, in the late 1990s when AOL was running the Internet, people would have laughed at the thought that in 10 years they would be using something completely different, especially a thing called Google. But the reality is, Google came out with a better product that provided people with what they needed faster. Slowly but surely, people started to drift away from their older technology and jump into something that truly got the job done. I feel the same can be true of this sort of technology.

It simply makes for a more accurate search engine when you combine the passive and active roles of Internet users. That is one demographic that I can see switching over to this sort of search engine: the masses who want to use something better. That is a tremendous number of people. Yet, consider further that there are on average 353,987 new Internet users per day. This may include people in developing countries who are using the Internet for the first time, young people who have never used the Internet before, or older people who haven’t touched the darn thing their whole lives.

These numbers are courtesy of Google Answers (Source). These people, although they probably know about Google and Yahoo!, are not entrenched in their Internet habits, and if there is a better and more efficient technology, it’s likely that they will opt for it. Slowly but surely, a superior product will gain market share. I don’t expect this to happen overnight, but rather over a couple of years.
  • Does it fill a need?
Like all things on the Internet, it doesn’t really fill a need. The world existed well and just fine prior to the Internet, yet we have come enormously far since we started using it. Therefore, the direct answer to this question is, “no, it does not fill a need,” but the Internet is all about innovation and adaptation.

I would say the Internet is almost the embodiment of human ingenuity, creativity, and improvement. Therefore, I feel as though we are doing the Internet and ourselves a disservice if we don’t continue to adapt and expand our possibilities.

We need to remember that we should always be striving to improve, and just because something works well now doesn’t mean that it is going to work well indefinitely. We should always be conscious of moving forward and trying new things, because who knows what we can learn from them.
  • Can it be set up by an individual or at most a small group of individuals?
I believe it can. It will take a small group of individuals with an unbelievable proficiency in computer science. I have to imagine that an algorithm that works like Google’s is quite sophisticated to create, and that is merely one component of this 21st century search engine. Additionally, whoever designs this super search engine will have to incorporate a voting structure to accommodate people using the site. I envision this in the following way: the product would work something like StumbleUpon or any sort of toolbar that you may have installed into your Internet browser.

In order to get active “votes” on websites from users, in addition to the passive voting that the computer algorithm achieves, users will have to install some sort of toolbar into their Internet browser (better yet, the search engine could be its own Internet browser). The voting control can be set up in a number of different ways; I see three possible designs. The first would be a slider on which users indicate how effective the site was in giving them what they needed. They can drag the slider back and forth between 0 and 1000. If you got exactly what you wanted, give it 1000, but if you didn’t like the site at all, give it a 0.

These tabulations would then be averaged out and combined with the passive ranking algorithm to give a middle ground of sorts. While it does this, the search engine will also have to remember the keywords that you searched for and make the ranking you gave unique to that keyword or string of keywords. For instance, you may search for “dog” and vote 750 on the first site that comes up. But if you search for “dog collars,” that exact same site might be a 100. The search engine will have to remember the phrase you searched for and connect your respective ranking to it.
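As a rough sketch of how per-keyword votes might be stored and blended with the passive ranking, consider the following Python. Everything here is an assumption made for illustration: the class and function names are hypothetical, the passive score is assumed to be already scaled to between 0 and 1, and the even 50/50 weighting is just one way to read “a middle ground.”

```python
from collections import defaultdict


class VoteStore:
    """Remembers votes separately for each (search phrase, website) pair."""

    def __init__(self):
        self._votes = defaultdict(list)

    @staticmethod
    def _key(query, url):
        # Treat "Dog", "dog " and "dog" as the same search phrase.
        return (" ".join(query.lower().split()), url)

    def add_vote(self, query, url, score):
        if not 0 <= score <= 1000:
            raise ValueError("vote must be between 0 and 1000")
        self._votes[self._key(query, url)].append(score)

    def average(self, query, url):
        """Average vote for this phrase and site, or None if nobody has voted."""
        votes = self._votes.get(self._key(query, url))
        return sum(votes) / len(votes) if votes else None


def blended_score(passive_score, average_vote, vote_weight=0.5):
    """Mix a passive score (0 to 1) with an average user vote (0 to 1000)."""
    if average_vote is None:
        return passive_score  # no votes yet, so fall back to the passive ranking
    return (1 - vote_weight) * passive_score + vote_weight * (average_vote / 1000)


store = VoteStore()
store.add_vote("dog", "site.example", 750)
store.add_vote("dog collars", "site.example", 100)

print(store.average("dog", "site.example"))          # 750.0
print(store.average("dog collars", "site.example"))  # 100.0
```

Keying the votes on the search phrase as well as the site is what lets the same page score well for “dog” and poorly for “dog collars.”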

The second method is a simple 1 through 10 scale on which you select a number and submit it. This doesn’t allow for as much variation, however.
The final method would be to ask the person who was searching, “Did this website have what you were looking for?” and people can reply “Yes” or “No.” This allows for the least amount of variation; however, it still provides people with the ability to actively vote on how accurately a site meets their needs.
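Whichever of the three designs is used, the votes could be put on a common scale before being averaged. The sketch below, with hypothetical function names, converts each kind of vote to a number between 0 and 1 and shows how many distinct answers each design allows, which is what “variation” means here.

```python
def normalize_slider(value):
    """Slider vote from 0 to 1000, scaled to the range 0 to 1."""
    if not 0 <= value <= 1000:
        raise ValueError("slider vote must be between 0 and 1000")
    return value / 1000


def normalize_scale(value):
    """Vote on the 1 through 10 scale, scaled so that 1 maps to 0 and 10 maps to 1."""
    if not 1 <= value <= 10:
        raise ValueError("scale vote must be between 1 and 10")
    return (value - 1) / 9


def normalize_yes_no(found_it):
    """Yes/No vote: True (yes) becomes 1.0, False (no) becomes 0.0."""
    return 1.0 if found_it else 0.0


# Number of distinct answers each design allows, assuming whole-number votes.
distinct_answers = {"slider": 1001, "scale": 10, "yes_no": 2}

print(normalize_slider(750))   # 0.75
print(normalize_scale(10))     # 1.0
print(normalize_yes_no(True))  # 1.0
```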
  • Can it generate income?
The search engine industry, and the Internet industry in general, is huge and in my estimation shows no signs of slowing down. The primary stream of revenue for search engines is advertising, and I believe the advertising industry to be as healthy as the Internet industry. HitWise.com, a company that monitors the usage of websites, provides a list featuring “the top 4 leading search engines based on US Internet usage, ranked by volume of searches for the 4 weeks ending March 29, 2008” (Source).

Essentially, these numbers provide a snapshot of the current search engine market share in the United States. According to them, Google maintains 67.25%, Yahoo! 20.29%, MSN/Live Search 4.88%, and finally Ask with 4.09%. This makes up 96.5% of the Internet Search Engine market in the United States, and I assume that the other 3.5% is made up by “bottom dwellers” of the search engine market. If we look at the revenues that these companies produce we see the following:

Google – $16.5 billion, or $245M per 1% of market share
Yahoo! – $7 billion, or $345M per 1% of market share
MSN/Live Search – $1.848 billion, or $379M per 1% of market share
Ask.com – $227 million, or $56M per 1% of market share

It doesn’t take a statistician or mathematician to see that the numbers here are enormous. There is a great deal of revenue to be made by entering the search engine industry, especially with an innovative and fantastic product such as this one. The reason I provide the rate at which these companies produce revenue per 1% of market share is to give a conservative estimate of the potential revenue that can come from entering the search engine market, even in the case that you don’t topple the giants of Yahoo! and Google.

Across the top four companies in the Internet search engine industry, the mean is about $256M for every 1% of market share that they capture. However, I don’t accept this as a conservative enough estimate, due to the fact that as a search engine gains popularity, advertising space becomes more attractive and thus more expensive. This is well illustrated by the gap between Ask.com and both Yahoo! and Google. It is somewhat contradicted by MSN/Live Search, but I assume that Yahoo! and Google are going through diseconomies of scale (Source) resulting in increased per-unit costs.

After constructing a graph that plots each company’s revenue in billions of dollars against its percentage of market share, a linear trend line can be added. The trend line is forced through the intercept (0,0) because we assume that if you don’t enter the market you don’t make any money, but if you enter the market your revenue increases linearly with your share. The trend line has an equation of y = 0.2536x, which essentially translates to $253.6 million for every 1% of market share that is captured. Therefore, I would make a conservative estimate that this search engine, if able to capture 1% of the Internet search engine market, can produce revenue of $253.6 million per year.
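The trend line can be checked with a few lines of Python. This sketch assumes the trend line is an ordinary least-squares fit forced through the origin (which is how spreadsheet trend lines with a zero intercept are typically computed); the variable and function names are my own, and the market share and revenue figures are the ones quoted above.

```python
# Market share in percent and revenue in billions of dollars, as quoted above.
market_share = {"Google": 67.25, "Yahoo!": 20.29, "MSN/Live Search": 4.88, "Ask.com": 4.09}
revenue_billions = {"Google": 16.5, "Yahoo!": 7.0, "MSN/Live Search": 1.848, "Ask.com": 0.227}


def slope_through_origin(xs, ys):
    """Least-squares slope of the line y = m * x (no intercept term)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


companies = list(market_share)
slope = slope_through_origin(
    [market_share[c] for c in companies],
    [revenue_billions[c] for c in companies],
)

print(round(slope, 4))         # 0.2536 (billions of dollars per 1% of market share)
print(round(slope * 1000, 1))  # 253.6 (millions of dollars per 1% of market share)
```

Because Google’s point sits so far to the right, it dominates the fit, which is why the slope lands close to Google’s own revenue per 1% of market share.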
  • Is it marketable?

This search engine is most definitely marketable. However, I don’t think that you go about marketing this website in the traditional sense. Sure, websites like GoDaddy.com have had tremendous success based on their racy Super Bowl commercials, but I think that it would be a lot more beneficial for this sort of search engine, especially if it is trying to take on mammoths like Yahoo! and Google, to spread virally through word of mouth marketing (WOMM).

This marketing is especially effective because it is usually friends or close loved ones who refer you to a specific thing. This is how Google initially got its start and how Web 2.0 websites like Facebook, Digg, and a slew of others got their start as well. This is a very effective way to get users to start using your product and tell others about it. Word of mouth creates a buzz and people typically respond very well to it.

This is especially true of a new website: people who currently use Yahoo! and Google will only move if someone close to them has explicitly told them to. As for the new users, they will be inspired by the buzz and follow suit.

If the Internet has taught us anything, it is that the world is a constantly changing place that is continually looking for new ways to perform efficiently and effectively. Sure, there are Internet search engine giants now that control over 60% (in some cases) of the market, but remember the old English adage: “the bigger they are, the harder they fall.”

No one can really predict what will come of the Internet in the next 5, 10, or 20 years, but I guarantee you that if we find a better way to do something, there’s no reason why we would stop ourselves from doing it. I believe Doogle can be the effective combination of passive and active cataloging that will make for the best Internet search experience possible.

1 comment:

Anonymous said...

I like where your head is at with this one. You make it sound very viable to take on such giants as Google and Yahoo! I would be interested as to how accurate the 1% market share revenue actually is, but interesting thoughts nevertheless.