Internal linking question
-
Hi there. Are all internal links listed in GWMT actually indexed?
-
Jonnygeekuk,
If GWT is telling you they are "aware" (whether indexed or not) of URLs that you do not want indexed, and you have either blocked them in the robot.txt file or the robots header tag, or the page serves a 404 or 410 response in the http header, it wouldn't hurt to use the URL removal tool to remove those pages from the index just to be sure.
-
So, sounds like you're looking for a list of indexed pages? Will this tool help?
http://www.intavant.com/tools/google-indexed-pages-extractor/
-
I'm sorry it's taking me so long to get back to you on this. However you told me you say you're using the removal tool in Google Webmaster tools?
I want to be certain you're not using the link disavow tool as a removal tool is that correct?
"Google updates its entire index regularly. When we crawl the web, we automatically find new pages, remove outdated links, and reflect updates to existing pages, keeping the Google index fresh and as up-to-date as possible.
If outdated pages from your site appear in the search results, ensure that the pages return a status of either 404 (not found) or 410 (gone) in the header. These status codes tell Googlebot that the requested URL isn't valid. Some servers are misconfigured to return a status of 200 (Successful) for pages that don't exist, which tells Googlebot that the requested URLs are valid and should be indexed. If a page returns a true 404 error via the http headers, anyone can remove it from the Google index using the webpage removal request tool. Outdated pages that don't return true 404 errors usually fall out of our index naturally when other pages stop linking to them."
"
Reincluding content in search
"Content removed using the URL removal tool will not appear in search results for a minimum of 90 days or until the content has been removed from the Google index. However, if you've updated robots.txt, added meta tags, or password-protected content to prevent it being crawled, the content should naturally have dropped out of our index, and you shouldn't need to worry about it reappearing after 90 days. You can reinclude your content at any time during the 90-day period by following the steps below.
Reinclude content:
- On the Webmaster Tools Home page, click the site you want.
- In the left-hand menu, click Optimization, and then click Remove URLs.
- Select the Removed content tab, and then click Reinclude next to the content you want to reinclude in the Google index.
Pending requests are usually processed within 3-5 business days."
-
Hi Chris, Thomas
Thanks for taking the time to reply.
Essentially, the reason i'm asking this question is recently the site in question became heavily over indexed due to search filters etc becoming indexed. This resulted in a ton of thin content being indexed. We've since no indexed these pages but they are taking time to drop off so we are helping a little by using the removal tool in GWMT. A lot of these pages are hidden, it's difficult to find them in the main index but index status says we still have >7k pages indexed when we really should have fewer than 2k. A site: command reveals about 9k but only 600 are listed and they are all valid pages. Basically we're trying to find the urls to remove and noticed that a lot of them are listed in the internal links tab on GWMT. I just wondered whether it was advisable to remove these too, in addition to the 2.5k we have already removed.
-
Hi Johnny, I want to tell you that I agree with what Chris stated above. If you're looking for someone to confirm that. You want to also make sure you do not have over 100 to 150 URLs or internal links on your site. This will hurt Google indexing of the website.
I also use a tool to make internal links. And if that is what you are speaking of. It's called http://scribecontent.com. You can use it not only on word press but on all sites. I have found it to be extremely useful please be cautious though it how many links you built internally so that you do not create a page that cannot be indexed correctly.
http://www.distilled.net/u/search-engine-basics/#crawling
I hope I've been in help,
Thomas
-
Hey JonnyG,
Be sure not to confuse links with URLs. Essentially, a link is clickable thing on a web page that, when clicked, takes the user to another URL. A URL is an address (non-clickable) . A web page is the resource that exists at a URL.
Anyway, the Internal Links tab shows how many links exist on your site that can take you to other pages on your site. However, if you click on the Health | Index Status tab, you'll get choices to see Basic and Advanced info on your indexed URLs. In the advanced tab, you'll see the total number of pages Google's index on your site. Google's Webmaster Tools Help has a page on Index Status for more info.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
General Question: Linking Root Domains 0
Hi, we have several subpages with PA 1. We try to figure out why. The link metrics show 0 Linking root domains, but shouldnt there be at least the own domain like 1 linking root domain or are we getting it already wrong here? The subpages have following link metrics: 0 External followed Links. 8416 subdomain and 8421 Root Domain, Linking Root Domains 0, 299 Subdomain, 302 Root Domain (From Mozbar). The pages seem to be crawled. We are suspecting techincal reasons. What would be the impact of the linking root domain to the PA 1? Thank you in advance
Technical SEO | | brainfruit0 -
Link Anchor Text
When we have run a Open Site Explorer analysis on our own site, it says that for all our internal links the Link Anchor Text is 'Help with logging in' I am a bit confused as to why it shows that. That text does appear in the header of the page, but is not the first piece of text. Why is it happening on our site?
Technical SEO | | MattAshby
Why do I not see this on other sites?
What affect does this have on our ranking?
What's the best fix? Example page that we ran on Open Site Explorer: www.rightboat.com/search?manufacturer=Beneteau&model=Antares+9.800 -
What do I do with these back links?
In the last two weeks, I've got 10 pingbacks from this http://caraccidentlawyer.cc/coroner-ids-berkeley-bodies-who-were-killed-in-recent-car-accident/ and sites like it. The featured attorney is a competitor of ours and, since the links aren't sex/drugs/rock&roll related, (and he's linked too) I doubt this is a negative SEO campaign, but I want it to stop. These blogs are basically pure spam. Any suggestions?
Technical SEO | | KempRugeLawGroup1 -
Links in a Flash document
How do I tell if a link in a Flash document is follow or nofollow? Or doesn't it matter? (I just found out that my company placed an advertorial in a Flash publication and I want to make sure it doesn't wind up as a paid, followed link.) Thank you!
Technical SEO | | Linda-Vassily0 -
Should I add 'nofollow' to site wide internal links?
I am trying to improve the internal linking structure on my site and ensure that the most important pages have the most internal links pointing to them (which I believe is the best strategy from Google's perspective!). I have a number of internal links in the page footer going to pages such as 'Terms and Conditions', 'Testimonials', 'About Us' etc. These pages, therefore, have a very large number of links going to them compared with the most important pages on my site. Should I add 'nofollow' to these links?
Technical SEO | | Pete40 -
Drupal Question
So on our site we have a plugin for our fan gallery. The issue is that I am getting a lot of duplication errors and it's saying the URL is too long and all the errors are coming from the Fan Gallery, which has over 8,000 errors. It seems to be pulling a long form query URL that has over 100 characters. You can't physically see it on the site, but the crawlers can. Anyway I'm trying to figure out a fix for this. One method would be to just stop those pages from being crawled, but I would hate to do that as the fan gallery for us would be a great source of links and content. So I'm wondering if anyone else has had an issue with these types of plugins before where the user can upload a photo or do a video embed and then it submits to the site. If you have a better method please let me know. I usually work on E-comm platforms so my experience with drupal is limited.
Technical SEO | | KateGMaker0 -
Can I reduce link count by no following links?
Hi, A large number of my pages contain over 100 links. This is due to a large drop down navigation which is on every page. To reduce my link count could I just no follow these navigation links or would I have to remove the navigation completely?
Technical SEO | | moesian0 -
Internal Linking: Site-wide VS Content Links
I just watched this video in which Matt Cutts talks about the ancient 100 links per page limit. I often encounter websites which have massive navigation (elaborate main menu, side bar, footer, superfooter...etc) in addition to content area based links. My question is do you think Google passes votes (PageRank and anchor text) differently from template links such as navigation to the ones in the content area, if so have you done any testing to confirm?
Technical SEO | | Dan-Petrovic0