Was looking through the newsgroup alt.internet.search-engines which used to be a great resource :-(

Anyway, a new poster asked an excellent question regarding linking to non-cached pages (from the newsgroup thread “Re: should I reject non-cached sites?” edited version):

I run a modest linking scheme on my website and have always had a policy that we would only link to sites that are cached in Google. I’m not so fussy about PageRank at the moment as our link pages are not all ranked.

I’ve had a number of link requests from sites that look pretty well set up but the link page is not cached in Google.

Am I right to reject these as they are no use to my site in terms of IBLs (Inbound Links)? I use the Google toolbar to check the site is cached and have up to now rejected those with no Google cache. Am I rejecting potentially useful sites or doing the right thing?

Although the thread is 4 days old with 10 replies, not one answers this poster’s good question. So here’s an answer from a real SEO consultant.

The whole point of setting up reciprocal link schemes is to generate text links that increase a site’s search engine rankings (Google rankings mostly). So if a link exchange doesn’t help a site’s SERPs long term, then from a ranking perspective there’s little point setting up the reciprocal link in the first place (unless you see another reason to go ahead with the link exchange, for example you see potential in the site).

With that in mind, when setting up reciprocal links your partner’s link page should at the very least be indexed and cached in Google, so you can be sure the page can potentially pass benefit through text links.

Note: there are ways, like rel=nofollow or JavaScript links, to prevent link benefit passing via a link from an indexed/cached page, so this is only one step in the process of checking out a reciprocal link partner’s site.

The reason for this is obvious: to benefit from their return link, Google must know your link exists. If Google doesn’t even spider/index the page your text link is on, you will not receive any link benefit!
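The rel=nofollow caveat in the note above is easy to check mechanically once you have the HTML of a partner’s link page. The sketch below is only an illustration: the `LinkAuditor` class, the `audit_links` function and the example.com URLs are names made up for this example, and it only inspects plain `<a href>` tags, so JavaScript links would still need checking by eye.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkAuditor(HTMLParser):
    """Collect every <a href> pointing at one host, noting rel=nofollow."""

    def __init__(self, target_host):
        super().__init__()
        self.target_host = target_host.lower()
        self.links = []  # list of (href, is_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        host = (urlparse(href).hostname or "").lower()
        if host != self.target_host:
            return
        # rel can hold several space-separated tokens, e.g. "nofollow noopener"
        rel_tokens = (attr.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def audit_links(html, target_host):
    """Return (href, is_nofollow) for each link to target_host found in html."""
    parser = LinkAuditor(target_host)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    # Hypothetical link page snippet, assumed for illustration only.
    sample = (
        '<a href="http://www.example.com/" rel="nofollow">Site A</a> '
        '<a href="http://www.example.com/page.html">Site A again</a>'
    )
    for href, nofollow in audit_links(sample, "www.example.com"):
        print(href, "nofollow" if nofollow else "followed")
```

An empty result means no plain text link to your host was found at all, which is worth knowing before you agree to the exchange.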

That leads us to how to accurately check if a page is indexed and passing link benefit (PR).

How to Use the Google Search site: Operator

The easiest way is by using the site: search operator in Google. To see if a page is indexed use this search format in Google:

site:http://www.seo-consultant-services.co.uk/seo-advice-text-links.html

You should see the following in Google for the site: search above:

[Screenshot: Google Site Search]

The example above searches Google for one highly specific page on this site. You can also search an entire site, or a section/sub-set of it, by using the right search string. Take this one:

site:http://www.seo-gold.com/seo-

This will list all indexed pages under the domain http://www.seo-gold.com whose path starts with “seo-”, that is directories directly off root starting with “seo-” and file names in root starting with “seo-”. It will miss anything under a directory like http://www.seo-gold.com/tag/ or http://www.seo-gold.com/2007/. This allows us to search specific parts of a site, narrowing a search to a limited number of indexed pages.
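To make the narrowing effect concrete, here is a rough sketch that treats the site: search as a plain URL-prefix match, which is how it’s described above. That is an assumption about Google’s behaviour, not a documented rule, and `strip_scheme`, `matches_site_prefix` and the two “seo-example” URLs are hypothetical names used only for illustration.

```python
def strip_scheme(url):
    """Drop a leading http:// or https:// so only host + path are compared."""
    for scheme in ("http://", "https://"):
        if url.lower().startswith(scheme):
            return url[len(scheme):]
    return url


def matches_site_prefix(url, prefix):
    """True if url starts with the given site: prefix (scheme ignored)."""
    return strip_scheme(url).startswith(strip_scheme(prefix))


if __name__ == "__main__":
    prefix = "http://www.seo-gold.com/seo-"
    candidates = [
        "http://www.seo-gold.com/seo-example.html",      # hypothetical file in root
        "http://www.seo-gold.com/seo-example/page.html",  # hypothetical directory
        "http://www.seo-gold.com/tag/",
        "http://www.seo-gold.com/2007/",
    ]
    for url in candidates:
        verdict = "listed" if matches_site_prefix(url, prefix) else "missed"
        print(verdict, url)
```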

By doing a site: search on a specific page you’ll quickly determine if Google has indexed it. If a page is not found using the site: search, it means one of three things: it’s a new page (it takes time for Google to spider and index a page); Google spidered the page but decided it doesn’t meet its criteria for indexing (happens a lot more often now); or the webmaster blocked spidering/indexing of the page in question (via robots.txt for example).

At this point, if you aren’t particularly SEO/web savvy and you find a page isn’t indexed, refuse the link exchange unless there’s another good reason to proceed.

How to Check if a Page is Cached in Google

So we’ve found a page is indexed; next we check its cache using the following search format:

cache:http://www.seo-consultant-services.co.uk/seo-advice-text-links.html

You can also see from the earlier example a cache text link directly from the site: search results (the “Cached” link). Click the cached link and you’ll get a page like Site: Search Cache of http://www.seo-consultant-services.co.uk/seo-advice-text-links.html which in this case should show a copy of the page in question.
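If you check a lot of link partners, typing these queries by hand gets tedious. The sketch below builds the search URL for either query so you can open it in a browser. It assumes Google’s usual results address (https://www.google.com/search with a `q` parameter); `google_query_url` is a made-up helper name, and the code only builds the URL, it doesn’t fetch or parse any results.

```python
from urllib.parse import urlencode

# Assumed results endpoint; not taken from the original post.
GOOGLE_SEARCH = "https://www.google.com/search"


def google_query_url(operator, page_url):
    """Build a Google search URL for a site: or cache: query on page_url."""
    if operator not in ("site", "cache"):
        raise ValueError("operator must be 'site' or 'cache'")
    query = f"{operator}:{page_url}"
    # urlencode percent-encodes the colon and slashes inside the query value
    return GOOGLE_SEARCH + "?" + urlencode({"q": query})


if __name__ == "__main__":
    page = "http://www.seo-consultant-services.co.uk/seo-advice-text-links.html"
    print(google_query_url("site", page))
    print(google_query_url("cache", page))
```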

Now the poster of the original SEO question didn’t use the site: search technique to generate a link to the cached page; they used the Google toolbar’s built-in cache link. If you’ve installed the Google toolbar, right click on a page whose cache you want to view and follow the “Page Info/Cached Snapshot of Page” link.

[Screenshot: Google Toolbar Cached Snapshot]

You can also check by clicking the PageRank bar on the Google toolbar for the page if you activated the PR bar.

[Screenshot: Cache via the PR Button]

When I do this for the page in question it opens up this page: Google Toolbar Cache of http://www.seo-consultant-services.co.uk/seo-advice-text-links.html

That gives us two ways to generate a cached page (note the different URLs, different servers)! Usually these are identical, but occasionally you’ll find one form has a different cache and sometimes (usually the toolbar cache) doesn’t show a cache at all! I would advise trusting the site: search cache over the toolbar cache based on my experience.

To summarise, a linking page should be indexed (site: search) and cached (site: search cache) to be reasonably confident Google is spidering, indexing AND ‘seeing’ the same page you are.

When to Link to Non-Cached Pages

There are legitimate reasons for some pages not having a Google cache. Some webmasters set a page to no-cache so search engine visitors etc. always see the very latest version of their content. Obvious candidates include frequently updated content, say a news site or stock quote site. With a site like this there should be no problem accepting link exchanges; your site will benefit from the text link.

There are several ways to specify no-cache, including adding the following to the head of a page:

<META HTTP-EQUIV="CACHE-CONTROL" CONTENT="NO-CACHE">

and/or

<META HTTP-EQUIV="PRAGMA" CONTENT="NO-CACHE">

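If a possible reciprocal link partner’s link pages are indexed (can be found via a site: search) but not cached, check if something like the above is present within the head of the page. Also look for <meta name="robots" content="noarchive">, which is the tag Google documents for suppressing its cached copy. There’s also server side no-caching that’s not so easy to spot.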
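Checking the head of a page by eye works, but a few lines of code can do it for a whole list of link pages. This is a minimal sketch that takes the page’s HTML as a string and reports any of the meta tags shown above. `CacheDirectiveFinder` and `find_cache_directives` are hypothetical names, and the extra check for a robots `noarchive` tag is my own addition rather than something from the steps above. It can’t see server side headers, so a clean result doesn’t prove there’s no server side no-cache.

```python
from html.parser import HTMLParser


class CacheDirectiveFinder(HTMLParser):
    """Record meta tags that ask for a page not to be cached or archived."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values are lowercased here
        attr = {name: (value or "").lower() for name, value in attrs}
        http_equiv = attr.get("http-equiv", "")
        name = attr.get("name", "")
        content = attr.get("content", "")
        if http_equiv in ("cache-control", "pragma") and "no-cache" in content:
            self.directives.append(f"{http_equiv}: no-cache")
        elif name in ("robots", "googlebot") and "noarchive" in content:
            # Addition beyond the original post: robots noarchive meta tag
            self.directives.append(f"{name}: noarchive")


def find_cache_directives(html):
    """Return a list of no-cache style directives found in the HTML."""
    finder = CacheDirectiveFinder()
    finder.feed(html)
    return finder.directives


if __name__ == "__main__":
    head = (
        '<head><META HTTP-EQUIV="CACHE-CONTROL" CONTENT="NO-CACHE">'
        '<META HTTP-EQUIV="PRAGMA" CONTENT="NO-CACHE"></head>'
    )
    print(find_cache_directives(head))
```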

Be a little more wary of webmasters that use no-cache, as it’s part of a few blackhat SEO techniques like cloaking: you present one page (a highly optimized page) to search engine spiders and show sales content to real visitors. A visitor checking the cache of a page like this will spot the discrepancy and maybe report the site to the search engines. By using no-cache, no one gets to see what the search engines actually see.

Cloaking like this can also be aimed at a specific visitor. If the potential reciprocal link partner knows your IP address (a static IP obtained via emails discussing a potential link exchange), they can cloak their links pages to show your links when, and only when, you visit the site. If you check the cache, though, you’ll notice the lack of your link, and so when this technique is used they tend to no-cache the pages, though they’ll use server side no-cache rather than the meta tags listed above.
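When a cached copy is available, the comparison described above (is my link in the page I see AND in the page Google sees?) can be sketched in code. The example assumes you have already saved the live page and the cached copy as HTML strings. `has_link_to` and `compare_live_and_cached` are made-up names, and the simple regex only finds quoted href values, so treat the result as a prompt to look closer rather than proof of cloaking.

```python
import re
from urllib.parse import urlparse

# Matches quoted href values only; a deliberately simple assumption.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def has_link_to(html, host):
    """True if any href in html points at the given host."""
    host = host.lower()
    return any(
        (urlparse(href).hostname or "").lower() == host
        for href in HREF_RE.findall(html)
    )


def compare_live_and_cached(live_html, cached_html, host):
    """Describe where a link to host appears: live page, cached copy, or both."""
    live = has_link_to(live_html, host)
    cached = has_link_to(cached_html, host)
    if live and cached:
        return "link present in both copies"
    if live:
        return "link only in live page: possible cloaking or stale cache"
    if cached:
        return "link only in cached copy: link may have been removed"
    return "link missing from both copies"


if __name__ == "__main__":
    # Hypothetical snippets standing in for the saved live and cached pages.
    live = '<a href="http://www.example.com/">My site</a>'
    cached = '<a href="http://other.test/">Someone else</a>'
    print(compare_live_and_cached(live, cached, "www.example.com"))
```

A stale cache can produce the same warning as cloaking, so recheck after Google has had time to respider the page before drawing conclusions.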

So be very careful with webmasters that use no-cache. Ask yourself: does that site’s content really warrant the use of no-cache? If not, I wouldn’t risk it.
