A few misc Webmaster tools questions & Robots.txt etc
-
Hi
I have a few general misc questions re robots.txt & Google Webmaster Tools (GWT):
1) In the robots.txt file, what do the lines below block? Internal search?
Disallow: /?
Disallow: /*?
2) Also, the site's feeds are blocked in robots.txt. Why would you want to block a site's feeds?
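To see what those two Disallow rules actually catch, here's a small sketch of Google-style robots.txt pattern matching ('*' matches any run of characters, everything else matches as a prefix). The example URLs are hypothetical:

```python
import re

def robots_match(pattern: str, path: str) -> bool:
    """Return True if a robots.txt Disallow pattern matches a URL path,
    using Google-style rules: '*' matches any character run, '$' anchors
    the end, and patterns otherwise match as prefixes."""
    regex = ""
    for ch in pattern:
        if ch == "*":
            regex += ".*"
        elif ch == "$":
            regex += "$"
        else:
            regex += re.escape(ch)
    return re.match(regex, path) is not None

# 'Disallow: /?' only blocks paths that BEGIN with '/?' (e.g. site search):
print(robots_match("/?", "/?s=internal+search"))   # True
print(robots_match("/?", "/category?page=2"))      # False

# 'Disallow: /*?' blocks any path containing a '?' anywhere:
print(robots_match("/*?", "/category?page=2"))     # True
print(robots_match("/*?", "/category/"))           # False
```

So together the two lines block internal search results and any other URL carrying a query string.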
3) What's the best way to deal with the below:
- an old removed page that's returning a 500 response code
- a soft 404 for an old removed page that has no current replacement
- old removed pages returning a 404
The old pages didn't have any authority or inbound links, so is it best/OK to simply create a URL removal request in GWT?
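One quick way to confirm what those old URLs are really returning (500 vs. 404 vs. something else) is to check the status codes yourself. A minimal sketch using only the Python standard library; the URL is a hypothetical example:

```python
from urllib.request import urlopen, Request
from urllib.error import HTTPError

def status_of(url: str) -> int:
    """Fetch a URL with a HEAD request and return its HTTP status code,
    including error codes like 404 and 500."""
    try:
        with urlopen(Request(url, method="HEAD")) as resp:
            return resp.status
    except HTTPError as err:
        # urlopen raises on 4xx/5xx; the code is still on the exception
        return err.code

# Example (hypothetical URL):
# status_of("https://www.example.com/old-page/")  -> e.g. 404
```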
Cheers
Dan
-
Many thanks, Stufroguk!
-
-
It depends on whether Google has indexed these 'empty' pages; you need to check. Remember that every page is also given page authority. Best practice is to redirect them before removing them. You can get Google to fetch the pages in GWT so that the crawlers follow the redirect, then remove them.
-
Your old pages: fetch them in GWT, then remove them if you already have the 301s set up. Once Google has indexed the new pages, you know the link juice has passed and you can remove them.
The blocking is used as a backup.
-
-
Thanks, Stufroguk,
1) Does this still apply if the pages had no content? They were just overview pages/folders without any copy, links or authority, hence why I think it's OK to just remove the URLs without 301'ing.
2) I do have other old content pages that I've 301'd to new replacements, but I hadn't planned to do anything else with them. Are you saying that after two weeks I should nofollow or block them? Won't that stop the link equity passing?
Cheers
Dan
-
To manage old pages, it's best practice to simply 301 redirect them, leave them for a couple of weeks, then tag them with nofollow and/or block them with robots.txt. That way you've passed on the link equity. Then you can remove them from GWT.
In answer to 1: yes. But not all search engines read the "*" wildcard in file names. You might need to tinker with this a bit.
Use this to help: http://tool.motoricerca.info/robots-checker.phtml
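On an Apache server, the 301s described above might look like this in .htaccess (the paths are hypothetical examples, not from the site in question):

```apache
# Permanently redirect removed pages to their closest replacements
Redirect 301 /old-overview-page/ /new-section/
Redirect 301 /old-article.html /new-article/
```

Once Google has followed these and indexed the targets, the old URLs can then be blocked or removed as described.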