Internal linking question
-
Hi there. Are all internal links listed in GWMT actually indexed?
-
Jonnygeekuk,
If GWT is telling you that Google is "aware" of URLs (whether indexed or not) that you do not want indexed, and you have either blocked them in the robots.txt file or with a robots meta tag or header, or the page serves a 404 or 410 response in the HTTP header, it wouldn't hurt to use the URL removal tool to remove those pages from the index, just to be sure.
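Before submitting removal requests, it can help to confirm that a URL really is covered by your robots.txt rules. The following is only a minimal sketch using Python's standard `urllib.robotparser`; the rules in `ROBOTS_TXT` and the example.com URLs are made up for illustration, not taken from the site discussed here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules (assumption): the real file for the
# site in this thread is not shown anywhere in the discussion.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /filter/
"""


def is_blocked(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules disallow crawling this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/search/?colour=red",
        "http://www.example.com/products/widget",
    ):
        print(url, "blocked" if is_blocked(url) else "crawlable")
```

This only evaluates crawl rules; it says nothing about whether a URL is currently in the index.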
-
So, sounds like you're looking for a list of indexed pages? Will this tool help?
http://www.intavant.com/tools/google-indexed-pages-extractor/
-
I'm sorry it's taking me so long to get back to you on this. You said you're using the removal tool in Google Webmaster Tools?
I want to be certain you're not using the link disavow tool as a removal tool. Is that correct?
"Google updates its entire index regularly. When we crawl the web, we automatically find new pages, remove outdated links, and reflect updates to existing pages, keeping the Google index fresh and as up-to-date as possible.
If outdated pages from your site appear in the search results, ensure that the pages return a status of either 404 (not found) or 410 (gone) in the header. These status codes tell Googlebot that the requested URL isn't valid. Some servers are misconfigured to return a status of 200 (Successful) for pages that don't exist, which tells Googlebot that the requested URLs are valid and should be indexed. If a page returns a true 404 error via the http headers, anyone can remove it from the Google index using the webpage removal request tool. Outdated pages that don't return true 404 errors usually fall out of our index naturally when other pages stop linking to them."
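To check what the quoted guidance describes, you can look at the status code a server actually sends for a page you expect to be gone. This is a rough sketch with Python's standard library only; `fetch_status` and `removal_ready` are names invented for this example, and the URL in the usage comment is a placeholder.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status code the server sends for url.

    urlopen follows redirects, so a 301 that ends on a live page is
    reported as 200. Some servers reject HEAD requests; switching the
    method to GET is a reasonable fallback in that case.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError; the code is what we want.
        return err.code


def removal_ready(status: int) -> bool:
    """True for the two status codes the quote names: 404 and 410."""
    return status in (404, 410)


# Usage (needs network access; the URL is a placeholder):
#   status = fetch_status("http://www.example.com/old-page")
#   print(status, "ok to request removal" if removal_ready(status) else "still looks valid")
```

A page that no longer exists but still returns 200 is the misconfiguration the quote warns about.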
Reincluding content in search
"Content removed using the URL removal tool will not appear in search results for a minimum of 90 days or until the content has been removed from the Google index. However, if you've updated robots.txt, added meta tags, or password-protected content to prevent it being crawled, the content should naturally have dropped out of our index, and you shouldn't need to worry about it reappearing after 90 days. You can reinclude your content at any time during the 90-day period by following the steps below.
Reinclude content:
- On the Webmaster Tools Home page, click the site you want.
- In the left-hand menu, click Optimization, and then click Remove URLs.
- Select the Removed content tab, and then click Reinclude next to the content you want to reinclude in the Google index.
Pending requests are usually processed within 3-5 business days."
-
Hi Chris, Thomas
Thanks for taking the time to reply.
Essentially, the reason I'm asking is that the site in question recently became heavily over-indexed because search filter pages and the like were being indexed. This put a ton of thin content in the index. We've since noindexed these pages, but they are taking time to drop off, so we are helping a little by using the removal tool in GWMT. A lot of these pages are hidden: it's difficult to find them in the main index, but Index Status says we still have >7k pages indexed when we really should have fewer than 2k. A site: command reports about 9k results, but only 600 are listed, and they are all valid pages. Basically, we're trying to find the URLs to remove, and we noticed that a lot of them are listed in the Internal Links tab in GWMT. I just wondered whether it was advisable to remove these too, in addition to the 2.5k we have already removed.
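One way to narrow down a long URL list (for example, one copied out of the Internal Links tab) is to pick out the URLs that carry filter parameters. This is only a sketch: the parameter names in `FILTER_PARAMS` and the example.com URLs are assumptions, since the real filter parameters of the site are not given in this thread.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical filter parameter names (assumption); replace them with
# whatever the site's search filters actually put in the query string.
FILTER_PARAMS = {"colour", "size", "sort", "price"}


def is_filter_url(url: str) -> bool:
    """True if the URL's query string contains any known filter parameter."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return any(name in FILTER_PARAMS for name in query)


def removal_candidates(urls):
    """Return the filter URLs from urls, de-duplicated, in original order."""
    seen = set()
    candidates = []
    for url in urls:
        if is_filter_url(url) and url not in seen:
            seen.add(url)
            candidates.append(url)
    return candidates


if __name__ == "__main__":
    sample = [
        "http://www.example.com/shoes",
        "http://www.example.com/shoes?colour=red&size=9",
        "http://www.example.com/shoes?sort=price_asc",
        "http://www.example.com/shoes?colour=red&size=9",
        "http://www.example.com/contact?ref=footer",
    ]
    for url in removal_candidates(sample):
        print(url)
```

The resulting list is a starting point for review, not something to submit for removal blindly.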
-
Hi Johnny, I agree with what Chris stated above, if you're looking for someone to confirm that. You also want to make sure you do not have more than 100 to 150 internal links on any one page. More than that can hurt Google's indexing of the website.
I also use a tool to build internal links, if that is what you are speaking of. You'll find it at http://scribecontent.com. You can use it not only on WordPress but on all sites. I have found it extremely useful, but please be cautious about how many links you build internally, so that you do not create a page that cannot be indexed correctly.
http://www.distilled.net/u/search-engine-basics/#crawling
I hope I've been of help,
Thomas
-
Hey JonnyG,
Be sure not to confuse links with URLs. Essentially, a link is a clickable thing on a web page that, when clicked, takes the user to another URL. A URL is an address (non-clickable). A web page is the resource that exists at a URL.
Anyway, the Internal Links tab shows how many links exist on your site that can take you to other pages on your site. However, if you click on the Health | Index Status tab, you'll get choices to see Basic and Advanced info on your indexed URLs. In the Advanced tab, you'll see the total number of pages on your site that Google has indexed. Google's Webmaster Tools Help has a page on Index Status for more info.
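The difference between links and URLs can be made concrete by counting both on a single page. The sketch below uses Python's standard `html.parser`; the sample HTML and the example.com address are made up for illustration, and `count_links` is a name invented for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the target URL of every internal <a href> on one page."""

    def __init__(self, page_url: str):
        super().__init__()
        self.page_url = page_url
        self.host = urlsplit(page_url).netloc
        self.internal_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links and drop #fragments, which do not
        # change which URL the link points to.
        target, _fragment = urldefrag(urljoin(self.page_url, href))
        if urlsplit(target).netloc == self.host:
            self.internal_links.append(target)


def count_links(page_url: str, html: str):
    """Return (number of internal links, number of distinct internal URLs)."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    links = collector.internal_links
    return len(links), len(set(links))


# Made-up page content: two links share one URL, one link is external.
SAMPLE_HTML = """
<a href="/products/">Products</a>
<a href="/products/">Shop now</a>
<a href="/about#team">About</a>
<a href="http://other.example.org/">Partner</a>
"""

if __name__ == "__main__":
    links, urls = count_links("http://www.example.com/", SAMPLE_HTML)
    print(f"{links} internal links pointing at {urls} distinct URLs")
```

Here the two "Products" links count as two links but only one URL, which is why a link count and an indexed-page count will rarely match.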
Related Questions
-
.htaccess Question
Hi, I have a website, www.contractor-accounts.co.uk, with an .htaccess file that strips .php and forces a trailing slash /. The site is now over 6 months old and still has a very low ranking, with Moz also rating the site as DA/PA = 1, which seems to indicate some sort of issue with the website. Can anyone offer any suggestions as to why this site is ranking poorly? Much of the on-page SEO has been completed to a level of 90%+ for specific key terms, so I'm probably looking at either the routing of the framework or some other technical SEO issue. Any help much appreciated...
<IfModule mod_rewrite.c>
    <IfModule mod_negotiation.c>
        Options -MultiViews
    </IfModule>
    RewriteEngine On
    # Redirect Trailing Slashes...
    # RewriteRule ^(.)/$ /$1 [L,R=301]
    RewriteCond %{REQUEST_URI} /+[^.]+$
    RewriteRule ^(.+[^/])$ %{REQUEST_URI}/ [R=301,L]
    # Redirect non-WWW to WWW...
    RewriteCond %{HTTP_HOST} ^contractor-accounts.co.uk [NC]
    RewriteRule ^(.)$ http://www.contractor-accounts.co.uk/$1 [L,R=301]
    # Handle Front Controller...
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^ index.php [L]
</IfModule>
Technical SEO | | ecrmeuro0 -
Do you still lose 15% of the value of inbound links when you redirect your site from http to https (so all inbound links to http are being redirected to the https version)?
I know that when you redesign your own website, you lose about 15% internally due to the 301 redirects (see the Moz article: https://moz.com/blog/accidental-seo-tests-how-301-redirects-are-likely-impacting-your-brand), but I'm wondering if that also applies to the value of inbound links when you redirect http://www.sitename.com to https://www.sitename.com. I appreciate your help!
Technical SEO | | JBMediaGroup0 -
Basic Redirection Question
I am doing a 301 redirect from site ABC to site XYZ. I uploaded the following .htaccess file by FTP to the ABC.com/ server: Redirect 301 / http://XYZ.com/ This was completed over 30 days ago. OSE is not showing any of the links and is failing to show that abc.com is redirected, even though the MozBar shows a successful 301 HTTP status code. Is this still just a waiting game, or is it not advised to do a redirect this way for SEO? PS: ahrefs is showing the redirect itself; however, it is not showing the links going to site ABC.com/ as passing to site XYZ.com/. Any help is appreciated.
Technical SEO | | Vspeed0 -
Links to Website Author
I'm a website developer, and in the past I have usually added a tiny backlink to the footer of my clients' websites, like this: "Website Design by MyCompanyName". I understand that Google sees this as a low-quality backlink. However, I was wondering if such links can hurt my rankings. Does Penguin see these links as spam? If so, should I add a rel="nofollow" to the links? Is there anything else I should change? I do not want to remove these links completely because they are good for marketing my business. I just want to minimize any negative SEO impact of the links. I appreciate your input. Thanks.
Technical SEO | | SiteWizard_LLC0 -
Forms and link juice
On the product listing pages of our e-commerce site, we use POST forms as 'Add To Cart' buttons. Because of that, we have dozens (~40-80) of forms on any product listing page, and two questions regarding them: Do these forms affect the link juice of other links on the page? Are there cases when forms are somehow counted by Google as links? Regards, Lucek
Technical SEO | | lucek0 -
301 Redirect Questions
I have a site I built on a WYSIWYG editing platform that will not allow a 301 redirect. The site has already been remade and I need to point it to another domain. To do the redirect, can I move it to another domain host that will allow a 301, or will that make me lose the authority of the site? I may not be able to move the content of the site. Please help.
Technical SEO | | photoseo10 -
4XX Broken Links
I am attempting to fix the issues SEOmoz found when crawling my site. I have a list of 4XX errors that I am attempting to fix. I know one option is to redirect them to another page, but I would like to have the option to remove the links completely. The only problem is that I cannot find where the links are located. Does SEOmoz show where on my site these broken links are? Or does it only provide the URL that is linked to?
Technical SEO | | ClaytonKendall0 -
Value of Twitter Links
Let's ignore the "social metric" value of Twitter links and mentions and look at it from the pure link juice point of view. Twitter accounts such as http://twitter.com/randfish used to have their own PageRank and were treated as separate URLs. Twitter changed that to http://twitter.com/#!/randfish consolidating all their content to a single URL. When I search for "randfish" in Google, however, the result is the first URL version. Some clarification on this matter would be much appreciated.
Technical SEO | | Dan-Petrovic0