Indexing Certain Pages Of Your Site

By Mike Moran
Expert Author
Article Date: 2009-03-03

I've often been asked why particular pages are not indexed. Honestly, you can never be sure of the cause until you fix the problem and the page shows up. Even when you think you've isolated the problem, you may have corrected only one of several. So, it's best to take it step by step.

The first step is to be sure that your page really is missing from the search index. (Throughout, I'll be talking about the search "index" but each engine has its own index, so you must check every blessed one individually.)

Most search engines allow the use of a special operator to reveal if a page is in the search index. As an example, if I wanted to know if a page from my Web site was indexed, I could search for "site:www.mikemoran.com/aboutmike/index.htm"--the search engine would show the page in its results if it is in the index.

Assuming that your page is not found using this method, the next step is to try to figure out why. One possibility is that most of your site is missing, which you can determine by a similar search, such as "site:www.mikemoran.com"--you can see how many pages are indexed. If it's very few, your problem is bigger than that single page.
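The two checks above differ only in how much of the URL follows the "site:" operator. As a minimal sketch, the helper below builds both query strings from a page URL. The function name site_query is made up for illustration, and the result still has to be pasted into each search engine by hand, since nothing here talks to a search engine.

```python
from urllib.parse import urlsplit


def site_query(url: str, whole_site: bool = False) -> str:
    """Build a site: query for a single page, or for the whole host."""
    # Accept URLs written with or without a scheme such as http://
    parts = urlsplit(url if "://" in url else "//" + url)
    if whole_site:
        # Host only: shows how many pages of the site are indexed.
        return f"site:{parts.netloc}"
    # Host plus path: shows whether this one page is indexed.
    return f"site:{parts.netloc}{parts.path}"


page = "http://www.mikemoran.com/aboutmike/index.htm"
print(site_query(page))                   # query for the single page
print(site_query(page, whole_site=True))  # query for the whole site
```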

Major site problems include:
  • Spam penalties. If you've been caught violating the search engines' terms of service (spamming), they'll drastically scale back the pages in the index until you beg for reinclusion.

  • Hidden links. If the navigation links on your site are hidden within JavaScript, Flash, or other non-HTML methods, the search engine spiders are unlikely to be able to follow them.

  • Dynamic URLs. If your URLs are excessively long, or have many parameters, or contain ID or session parameters, the search engines might elect not to index them.

  • Incorrect robots.txt file. Your robots.txt file tells the search spider which pages it may crawl and which it must skip--if you've coded the file incorrectly, you might be excluding lots of pages you meant to include.
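A robots.txt mistake is one of the easier site-wide problems to test for yourself. The sketch below uses Python's standard urllib.robotparser module to ask whether a given URL is crawlable under a given set of rules. The rules, the www.example.com address, and the is_crawlable helper are all hypothetical, chosen to show how one stray "Disallow: /" line shuts out the whole site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the intent was to block only /private/,
# but the extra "Disallow: /" line blocks every page on the site.
BROKEN_RULES = """\
User-agent: *
Disallow: /private/
Disallow: /
"""

# The same file with the stray line removed.
FIXED_RULES = """\
User-agent: *
Disallow: /private/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the rules allow the given agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


url = "http://www.example.com/aboutmike/index.htm"  # placeholder address
print(is_crawlable(BROKEN_RULES, url))  # False: the page is excluded
print(is_crawlable(FIXED_RULES, url))   # True: the page may be crawled
```

To test a live file instead of a pasted one, RobotFileParser also offers set_url() and read(), which fetch the rules over the network.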

But what if it really is just this one pesky page that isn't being indexed? Some problems are likely to be confined to a single page:
  • Incorrect robots tagging. Just like the robots.txt file, a robots metatag tells the spider to include or exclude an individual page--you might be telling the spider to exclude the page by mistake.

  • User interaction required. If your page launches a pop-up window, or demands that a form be filled out, spiders won't be able to comply.

  • Improper redirects. If your page uses a meta refresh or JavaScript redirect, spiders ignore it and don't index the page.

  • Poor quality pages. If your page is excessively long, contains HTML coding errors, or uses frames, it's unlikely to be indexed correctly.
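Several of these single-page problems leave visible traces in the page's HTML, so a quick scan can narrow the search. The sketch below uses Python's standard html.parser module to flag a robots metatag containing "noindex", a meta refresh, and a frameset. The class name, the scan helper, and the sample markup are hypothetical. It does not detect JavaScript redirects or pop-ups, and it makes no claim about how any particular search engine treats what it finds.

```python
from html.parser import HTMLParser


class IndexabilityScanner(HTMLParser):
    """Collect markup that may keep a page out of the index."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta":
            content = attr.get("content", "").lower()
            if attr.get("name", "").lower() == "robots" and "noindex" in content:
                self.problems.append("robots metatag says noindex")
            if attr.get("http-equiv", "").lower() == "refresh":
                self.problems.append("meta refresh redirect")
        elif tag == "frameset":
            self.problems.append("page uses frames")


def scan(html: str) -> list:
    """Return a list of possible indexing problems found in the HTML."""
    scanner = IndexabilityScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.problems


# Hypothetical page with two of the problems described above.
SAMPLE = """<html><head>
<meta name="ROBOTS" content="NOINDEX, FOLLOW">
<meta http-equiv="refresh" content="0; url=/new.htm">
</head><body>Hello</body></html>"""

for problem in scan(SAMPLE):
    print(problem)
```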

Once you've identified what's wrong, you can correct the problem and wait for the spiders to come back. Good luck getting all your pages indexed. Remember, if your page isn't in the index, it can never be found.

About the Author:
Copyright Mike Moran

Mike Moran is an IBM Distinguished Engineer, expert on Internet marketing, and the author of Search Engine Marketing, Inc., the best-selling book on search marketing. Mike also writes the popular Biznology newsletter and blog.



WebProNewsDE is an iEntry, Inc. ® publication - 1998-2009 All Rights Reserved