Friday, May 8, 2009

Basics of Google's Algorithm

PageRank is a link analysis algorithm used by the Google Internet search engine that assigns a numerical weighting to each element of a hyperlinked set of documents, such as the World Wide Web, with the purpose of "measuring" its relative importance within the set. The algorithm may be applied to any collection of entities with reciprocal quotations and references. The numerical weight that it assigns to any given element E is also called the PageRank of E and denoted by PR(E).

Google describes PageRank as follows:
“ PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important". ”

In other words, a PageRank results from a "ballot" among all the other pages on the World Wide Web about how important a page is. A hyperlink to a page counts as a vote of support. The PageRank of a page is defined recursively and depends on the number and PageRank metric of all pages that link to it ("incoming links"). A page that is linked to by many pages with high PageRank receives a high rank itself. If there are no links to a web page, there is no support for that page.
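
To make the recursive definition concrete, here is a minimal power-iteration sketch in Python. The toy link graph, the damping factor of 0.85, and the fixed iteration count are illustrative assumptions based on one common formulation of the published PageRank idea; this is not Google's production implementation.

    # A minimal PageRank sketch. The graph, damping factor, and iteration
    # count are illustrative assumptions, not Google's actual parameters.
    def pagerank(links, damping=0.85, iterations=50):
        """links maps each page to the list of pages it links to."""
        pages = list(links)
        n = len(pages)
        rank = {page: 1.0 / n for page in pages}
        for _ in range(iterations):
            new_rank = {page: (1.0 - damping) / n for page in pages}
            for page, outgoing in links.items():
                if not outgoing:  # a dangling page spreads its rank evenly
                    for other in pages:
                        new_rank[other] += damping * rank[page] / n
                else:             # each "vote" is split among the out-links
                    for target in outgoing:
                        new_rank[target] += damping * rank[page] / len(outgoing)
            rank = new_rank
        return rank

    # Toy web: A and C both link to B, so B earns the highest rank.
    toy_web = {"A": ["B"], "B": ["C"], "C": ["A", "B"]}
    print(pagerank(toy_web))

Note how B benefits both from having two incoming links and from the rank of the pages casting them, which is exactly the weighted-vote behaviour described in the quote above.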

Google assigns a numeric weighting from 0 to 10 to each webpage on the Internet; this PageRank denotes a site's importance in the eyes of Google. The PageRank is derived from a theoretical probability value on a logarithmic scale, like the Richter scale. The PageRank of a particular page is roughly based upon the quantity of inbound links as well as the PageRank of the pages providing those links. Other factors, such as the relevance of search words on the page and actual visits to the page reported by the Google toolbar, are also believed to influence the PageRank. In order to prevent manipulation, spoofing, and spamdexing, Google provides no specific details about how such other factors influence PageRank.

Numerous academic papers concerning PageRank have been published since Page and Brin's original paper. In practice, the PageRank concept has proven to be vulnerable to manipulation, and extensive research has been devoted to identifying falsely inflated PageRank and ways to ignore links from documents with falsely inflated PageRank.

Other link-based ranking algorithms for Web pages include the HITS algorithm invented by Jon Kleinberg (used by Teoma and now Ask.com), the IBM CLEVER project, and the TrustRank algorithm.

For an in-depth explanation, visit: http://www.ianrogers.net/google-page-rank/



Types of search engines

In the early 2000s, more than 1,000 different search engines were in existence, although most webmasters focused their efforts on getting good placement in the leading 10. This, however, was easier said than done. InfoWorld explained that the process was more art than science, requiring continuous adjustment and tweaking, along with regularly submitting pages to different engines, to get good or excellent results. The reason for this is that every search engine works differently. Not only are there different types of search engines (spider-based engines, directory-based engines, and link-based engines), but the engines within each category are unique. Each has different rules and procedures companies need to follow in order to register their site with the engine.

SPIDER-BASED SEARCH ENGINES
Many leading search engines use a form of software program called spiders or crawlers to find information on the Internet and store it for search results in giant databases or indexes. Some spiders record every word on a Web site for their respective indexes, while others only report certain keywords listed in title tags or meta tags.
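
As a rough illustration of the two indexing styles just described, the following standard-library Python sketch parses a page and pulls out both every word of its text and the keywords declared in a meta tag. The sample page is hardcoded so the sketch runs as-is; a real spider would fetch pages over HTTP and follow their links.

    from html.parser import HTMLParser

    class PageIndexer(HTMLParser):
        """Records the title, meta keywords, and every word of a page."""
        def __init__(self):
            super().__init__()
            self.title = ""
            self.meta_keywords = ""
            self.words = []
            self._in_title = False

        def handle_starttag(self, tag, attrs):
            a = dict(attrs)
            if tag == "title":
                self._in_title = True
            elif tag == "meta" and a.get("name") == "keywords":
                self.meta_keywords = a.get("content", "")

        def handle_endtag(self, tag):
            if tag == "title":
                self._in_title = False

        def handle_data(self, data):
            if self._in_title:
                self.title += data
            self.words.extend(data.split())  # "every word" style indexing

    page = """<html><head><title>Toilet Seats</title>
    <meta name="keywords" content="toilet seats, bathroom"></head>
    <body><p>High quality toilet seats.</p></body></html>"""

    indexer = PageIndexer()
    indexer.feed(page)
    print(indexer.title, "|", indexer.meta_keywords, "|", indexer.words)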

DIRECTORY-BASED SEARCH ENGINES
While some sites use spiders to provide results to searchers, others—like Yahoo!—use human editors. This means that a company cannot rely on technology and keywords to obtain excellent placement, but must provide content the editors will find appealing and valuable to searchers. Some directory-based engines charge a fee for a site to be reviewed for potential listing. In the early 2000s, more leading search engines were relying on human editors in combination with findings obtained with spiders. LookSmart, Lycos, AltaVista, MSN, Excite and AOL Search relied on providers of directory data to make their search results more meaningful.

LINK-BASED SEARCH ENGINES
One other kind of search engine provides results based on hypertext links between sites. Rather than basing results on keywords or the preferences of human editors, sites are ranked based on the quality and quantity of other Web sites linked to them. In this case, links serve as referrals. The emergence of this kind of search engine called for companies to develop link-building strategies. By finding out which sites are listed in results for a certain product category in a link-based engine, a company could then contact the sites' owners—assuming they aren't competitors—and ask them for a link. This often involves reciprocal linking, where each company agrees to include links to the other's site.
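
The simplest possible version of this idea is to count inbound links, treating every referral equally. The toy graph below is a made-up example; PageRank, described earlier, refines this approach by also weighting each link by the importance of the page casting it.

    from collections import Counter

    # page -> pages it links to (an invented example graph)
    links = {"A": ["B"], "B": ["C"], "C": ["B"]}
    inbound = Counter(target for outgoing in links.values() for target in outgoing)
    print(inbound.most_common())  # B has two referrals, C has one, A has none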



History of search engines

Before there were web search engines, there was a complete list of all webservers. The list was edited by Tim Berners-Lee and hosted on the CERN webserver. One historical snapshot from 1992 remains. As more and more webservers went online, the central list could not keep up. On the NCSA site, new servers were announced under the title "What's New!", but no complete listing existed any more.

The very first tool used for searching on the (pre-web) Internet was Archie. The name stands for "archive" without the "v." It was created in 1990 by Alan Emtage, a student at McGill University in Montreal. The program downloaded the directory listings of all the files located on public anonymous FTP (File Transfer Protocol) sites, creating a searchable database of file names; however, Archie did not index the contents of these sites.

The rise of Gopher (created in 1991 by Mark McCahill at the University of Minnesota) led to two new search programs, Veronica and Jughead. Like Archie, they searched the file names and titles stored in Gopher index systems. Veronica (Very Easy Rodent-Oriented Net-wide Index to Computerized Archives) provided a keyword search of most Gopher menu titles in the entire Gopher listings. Jughead (Jonzy's Universal Gopher Hierarchy Excavation And Display) was a tool for obtaining menu information from specific Gopher servers. While the name of the search engine "Archie" was not a reference to the Archie comic book series, "Veronica" and "Jughead" are characters in the series, thus referencing their predecessor.

In June 1993, Matthew Gray, then at MIT, produced what was probably the first web robot, the Perl-based World Wide Web Wanderer, and used it to generate an index called 'Wandex'. The purpose of the Wanderer was to measure the size of the World Wide Web, which it did until late 1995. The search engine Aliweb appeared in November 1993. Aliweb did not use a web robot, but instead depended on being notified by website administrators of the existence at each site of an index file in a particular format.

JumpStation (released in December 1993) used a web robot to find web pages and to build its index, and used a web form as the interface to its query program. It was thus the first WWW resource-discovery tool to combine the three essential features of a web search engine (crawling, indexing, and searching) as described below. Because of the limited resources available on the platform on which it ran, its indexing and hence searching were limited to the titles and headings found in the web pages the crawler encountered.

One of the first "full text" crawler-based search engines was WebCrawler, which came out in 1994. Unlike its predecessors, it let users search for any word in any webpage, which has become the standard for all major search engines since. It was also the first one to be widely known by the public. Also in 1994 Lycos (which started at Carnegie Mellon University) was launched, and became a major commercial endeavor.
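
The data structure behind "full text" search is the inverted index, which maps every word to the set of pages containing it, so that any word is searchable. Here is a minimal sketch; the page contents are made up for illustration, and WebCrawler's actual internals are not documented here.

    from collections import defaultdict

    # Invented page contents, purely for illustration.
    pages = {
        "page1": "search engines index the web",
        "page2": "web crawlers visit every page",
    }

    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)

    print(index["web"])  # {'page1', 'page2'}: both pages contain "web"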

Soon after, many search engines appeared and vied for popularity. These included Magellan, Excite, Infoseek, Inktomi, Northern Light, and AltaVista. Yahoo! was among the most popular ways for people to find web pages of interest, but its search function operated on its web directory, rather than full-text copies of web pages. Information seekers could also browse the directory instead of doing a keyword-based search.

In 1996, Netscape was looking to give a single search engine an exclusive deal to be its featured search engine. There was so much interest that instead five of the major search engines struck a deal with Netscape: for $5 million per year, each would appear in rotation on the Netscape search engine page. The five engines were Yahoo!, Magellan, Lycos, Infoseek, and Excite.

Search engines were also among the brightest stars in the Internet investing frenzy of the late 1990s. Several companies entered the market spectacularly, receiving record gains during their initial public offerings. Some, such as Northern Light, have since taken down their public search engines to market enterprise-only editions. Many search engine companies were caught up in the dot-com bubble, a speculation-driven market boom that peaked in 1999 and ended in 2001.

Around 2000, the Google search engine rose to prominence. The company achieved better results for many searches with an innovation called PageRank. This iterative algorithm ranks web pages based on the number and PageRank of the other websites and pages that link to them, on the premise that good or desirable pages are linked to more than others. Google also maintained a minimalist interface for its search engine, while many of its competitors embedded a search engine in a web portal.

By 2000, Yahoo! was providing search services based on Inktomi's search engine. Yahoo! acquired Inktomi in 2002, and Overture (which owned AlltheWeb and AltaVista) in 2003. Yahoo! used Google's search engine for its results until 2004, when it launched its own search engine based on the combined technologies of its acquisitions.

Microsoft first launched MSN Search (since re-branded Live Search) in the fall of 1998 using search results from Inktomi. In early 1999 the site began to display listings from Looksmart blended with results from Inktomi except for a short time in 1999 when results from AltaVista were used instead. In 2004, Microsoft began a transition to its own search technology, powered by its own web crawler (called msnbot).

As of late 2007, Google was by far the most popular Web search engine worldwide. A number of country-specific search engine companies have become prominent; for example Baidu is the most popular search engine in the People's Republic of China.



About search engines

A Web search engine is a tool designed to search for information on the World Wide Web. The search results are usually presented in a list and are commonly called hits. The information may consist of web pages, images, and other types of files. Some search engines also mine data available in newsgroups, databases, or open directories. Unlike Web directories, which are maintained by human editors, search engines operate algorithmically or use a mixture of algorithmic and human input.



Importance of SEO (Search Engine Optimization)

So why do people optimize their websites? In many ways, the answer to this question is the same as the answer to why people make websites in the first place: to have their work found and read by as many people as possible.

When it comes to the internet, search engines are by far the most effective tool for finding information, and for getting information found. That's why virtually everyone who uses the internet also uses search engines in some way, and why any webmaster hoping to receive significant traffic should optimize their site.

For a search engine to recognize the value and relevance of a page, it must receive help from the creator of that page. Properly optimizing your pages to make them “search engine friendly” can greatly increase your search engine rankings, traffic levels, and potential earnings from your website… And that’s the importance of SEO.

Please note that, according to commonly cited statistics:


  • 85% of search traffic comes from the top three search engines: Google, Yahoo! and MSN.

  • about 75% of visitors come from the first 10 listings in the search engine results (page 1).

  • less than 20% of visitors come from the second 10 listings (page 2).

  • less than 5% of visitors come from the third 10 listings (page 3).

  • the remaining small percentage comes from listings beyond the top 30; the sketch after this list shows the arithmetic.
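
Applying those percentages to a hypothetical keyword receiving 10,000 monthly searches gives a feel for what the list means in visits. The traffic shares are the figures quoted above, not fresh measurements, and the "less than" figures are treated as upper bounds.

    # Rough arithmetic only: the shares are the blog's quoted statistics
    # applied to an invented search volume.
    monthly_searches = 10_000
    share_by_page = {"page 1": 0.75, "page 2": 0.20, "page 3": 0.05}
    for page, share in share_by_page.items():
        print(f"{page}: up to about {int(monthly_searches * share):,} visits")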



What is SEO?

Search engine optimization (SEO) is the process of improving the volume or quality of traffic to a web site from search engines via "natural" ("organic" or "algorithmic") search results. Typically, the earlier a site appears in the search results list, the more visitors it will receive from the search engine. SEO may target different kinds of search, including image search, local search, and industry-specific vertical search engines.

As an Internet marketing strategy, SEO considers how search engines work and what people search for. Optimizing a website primarily involves editing its content and HTML coding to both increase its relevance to specific keywords and to remove barriers to the indexing activities of search engines.
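
As a sketch of what that editing involves in practice, the checks below audit a page for a few common on-page basics: a keyword-bearing title, a meta description, and a top-level heading. The thresholds and checks are widespread rules of thumb, not requirements published by any search engine, and regular expressions are used only because the sample page is tiny.

    import re

    def audit(html, keyword):
        """Flag a few common on-page SEO omissions (rules of thumb only)."""
        issues = []
        title = re.search(r"<title>(.*?)</title>", html, re.I | re.S)
        if not title:
            issues.append("missing <title> tag")
        elif keyword.lower() not in title.group(1).lower():
            issues.append("target keyword not in title")
        if not re.search(r'<meta[^>]*name=["\']description["\']', html, re.I):
            issues.append("missing meta description")
        if not re.search(r"<h1[\s>]", html, re.I):
            issues.append("no <h1> heading")
        return issues or ["no basic issues found"]

    sample = "<html><head><title>SEO Basics</title></head><body><h1>SEO</h1></body></html>"
    print(audit(sample, "SEO"))  # -> ['missing meta description']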

The acronym "SEO" can also refer to "search engine optimizers," a term adopted by an industry of consultants who carry out optimization projects on behalf of clients, and by employees who perform SEO services in-house. Search engine optimizers may offer SEO as a stand-alone service or as a part of a broader marketing campaign. Because effective SEO may require changes to the HTML source code of a site, SEO tactics may be incorporated into web site development and design. The term "search engine friendly" may be used to describe web site designs, menus, content management systems and shopping carts that are easy to optimize.

Another class of techniques, known as black hat SEO or spamdexing, uses methods such as link farms and keyword stuffing that degrade both the relevance of search results and the user experience of search engines. Search engines look for sites that employ these techniques in order to remove them from their indices.



Wednesday, November 26, 2008

Choosing the Right Domain Name for Your Website

If you already have a good domain name, don't try to purchase a new one, as some search engines use the age of the website as a ranking factor.

If you are planning a new website, try to get a domain name with your keywords included. If you target regional customers, you can base your domain on the region, for example .uk, .au or .in.

You might wonder what the name of your business has to do with search engine optimization (SEO). If you choose the right business and domain name in the first place, it makes the whole optimization process much easier.

The Right Domain Name = The Right Anchor Text

Zeus Thrones may sound great for a company selling high quality toilet seats, but it's not going to help potential online customers find your carefully crafted website via the major search engines. Unless you have a large advertising budget for branding purposes, your potential customers won't even know your business or your website exists.

If you have a site about Search Engine Optimization (SEO), for example, a good business name (for optimization reasons) would be Search Engine Optimization or SEO, and the domain name would be search-engine-optimization.tld or seo.tld respectively (tld being com, net, co.uk, etc.).
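
As a trivial illustration of that naming pattern, the helper below turns a list of target keywords into hyphenated domain candidates. The function name and the TLD list are made up for this example.

    def domain_candidates(keywords, tlds=("com", "net", "co.uk")):
        """Hypothetical helper: join keywords into hyphenated domain names."""
        slug = "-".join(word.lower() for word in keywords)
        return [f"{slug}.{tld}" for tld in tlds]

    print(domain_candidates(["Search", "Engine", "Optimization"]))
    # ['search-engine-optimization.com', 'search-engine-optimization.net',
    #  'search-engine-optimization.co.uk']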

There are other considerations to take into account when choosing a business and domain name, including branding and, of course, which domain names are actually available, so compromises have to be made. For example, when we were deciding on a business and domain name for this website, we knew we couldn't have the ideal domain names (for optimization reasons) because others already owned them.

