Why are so many pages indexed?
-
We recently launched a new website and it doesn't consist of that many pages. When you do a "site:" search on Google, it shows 1,950 results. Obviously we don't want this to be happening. I have a feeling it's effecting our rankings. Is this just a straight up robots.txt problem? We addressed that a while ago and the number of results aren't going down. It's very possible that we still have it implemented incorrectly. What are we doing wrong and how do we start getting pages "un-indexed"?
-
What's to stop google from finding them? They're out there and available on the internet!
Block or remove pages using a robots.txt file
You can do this by putting:
User-agent: * Disallow: /
in the robots.txt file.
You might also want to stop humans from accessing the content too - can you put this content behind a password using htaccess or block access based on network address?
-
Sounds like you need to put a robots.txt on those subdomains (and maybe consider some type of login too).
Quick fix: put a robots.txt on the subdomains to block them from being indexed. Go into Google Webmaster Tools and verify each subdomain as its own site, then request removal of each of those subdomains (which should be approved, since you've already blocked it in robots.txt).
I took a quick look at lab.capacity.com/robots.txt and it isn't blocking the entire subdomain, though the robots.txt at fb.capacitr.com is.
-
I most certainly do not want those pages indexed, they're used for internal purposes only. That's exactly what I'm trying to figure out here. Why are those subdomains being indexed? They should obviously be private. Any insights would be great.
Thanks!
-
What are are you searching for? I notice that if you do a site:.capacitr.com you get the 1,950 results you mention above.
If you do a search for site:www.capacitr.com then you only get 29 results.
Its looks like there's a whole load of pages being indexed on other subdomains - fb.capacitr.com and lab.capacity.com. (Which has 1,860 pages!)
What are these used for, do you really want these in the index!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Redirecting homepage to internal page (2nd Tier page)
We are planning to experiment redirecting our homepage to one of the 2nd tier page. I mean....example.com to example.com/page. We need this page to rank well, but it doesn't have much internal links or external back-links, so we opt for this redirect. Advantage with this page is, it has "keyword" we want to rank for in URL. "page" in example.com/page. Will this help or hurt us in SEO? I think we are missing keyword in our root domain, so interested to highlight this page. Thanks, Satish
Intermediate & Advanced SEO | | vtmoz0 -
SEO: How to change page content + shift its original content to other page at the same time?
Hello, I want to replace the content of one page of our website (already indexeed) and shift its original content to another page. How can I do this without problems like penalizations etc? Current situation: Page A
Intermediate & Advanced SEO | | daimpa
URL: example.com/formula-1
Content: ContentPageA Desired situation: Page A
URL: example.com/formula-1
Content: NEW CONTENT! Page B
URL: example.com/formula-1-news
Content: ContentPageA (The content that was in Page A!) Content of the two pages will be about the same argument (& same keyword) but non-duplicate. The new content in page A is more optimized for search engines. How long will it take for the page to rank better?0 -
Links / Top Pages by Page Authority ==> pages shouldnt be there
I checked my site links and top pages by page authority. What i have found i dont understand, because the first 5-10 pages did not exist!! Should know that we launched a new site and rebuilt the static pages so there are a lot of new pages, and of course we deleted some old ones. I refreshed the sitemap.xml (these pages are not in there) and upload it in GWT. Why those old pages appear under the links menu at top pages by page authority?? How can i get rid off them? thx, Endre
Intermediate & Advanced SEO | | Neckermann0 -
Do I have to many internal links which is diluting link juice to less important pages
Hello Mozzers, I was looking at my homepage and subsequent category landing pages on my on my eCommerce site and wondered whether I have to many internal links which could in effect be diluting link juice to much of the pages I need it to flow. My homepage has 266 links of which 114 (43%) are duplicate links which seems a bit to much to me. One of my major competitors who is a national company has just launched a new site design and they are only showing popular categories on their home page although all categories are accessible from the menu navigation. They only have 123 links on their home page. I am wondering whether If I was to not show every category on my homepage as some of them we don't really have any sales from and only concerntrate on popular ones there like my competitors , then the link juice flowing downwards in the site would be concerntated as I would have less links for them to flow ?... Is that basically how it works ? Is there any negatives with regards to duplicate links on either home or category landing page. We are showing both the categories as visual boxes to select and they are also as selectable links on the left of a page ? Just wondered how duplicate links would be treated? Any thoughts greatly appreciated thanks Pete
Intermediate & Advanced SEO | | PeteC120 -
Effect of Removing Footer Links In all Pages Except Home Page
Dear MOZ Community: In an effort to improve the user interface of our business website (a New York CIty commercial real estate agency) my designer eliminated a standardized footer containing links to about 20 pages. The new design maintains this footer on the home page, but all other pages (about 600 eliminate the footer). The new design does a very good job eliminating non essential items. Most of the changes remove or reduce the size of unnecessary design elements. The footer removal is the only change really effect the link structure. The new design is not launched yet. Hoping to receive some good advice from the MOZ community before proceeding My concern is that removing these links could have an adverse or unpredictable effect on ranking. Last Summer we launched a completely redesigned version of the site and our ranking collapsed for 3 months. However unlike the previous upgrade this modifications does not URL names, tags, text or any major element. Only major change is the footer removal. Some of the footer pages provide good (not critical) info for visitors. Note the footer will still appear on the home page but will be removed on the interior pages. Are we risking any detrimental ranking effect by removing this footer? Can we compensate by adding text links to these pages if the links from the footer are removed? Seems irregular to have a home page footer but no footer on the other pages. Are we inviting any downgrade, penalty, adverse SEO effect by implementing this? I very much like the new design but do not want to risk a fall in rank and traffic. Thanks for your input!!!
Intermediate & Advanced SEO | | Kingalan1
Alan0 -
Adding Orphaned Pages to the Google Index
Hey folks, How do you think Google will treat adding 300K orphaned pages to a 4.5 million page site. The URLs would resolve but there would be no on site navigation to those pages, Google would only know about them through sitemap.xmls. These pages are super low competition. The plot thickens, what we are really after is to get 150k real pages back on the site, these pages do have crawlable paths on the site but in order to do that (for technical reasons) we need to push these other 300k orphaned pages live (it's an all or nothing deal) a) Do you think Google will have a problem with this or just decide to not index some or most these pages since they are orphaned. b) If these pages will just fall out of the index or not get included, and have no chance of ever accumulating PR anyway since they are not linked to, would it make sense to just noindex them? c) Should we not submit sitemap.xml files at all, and take our 150k and just ignore these 300k and hope Google ignores them as well since they are orhpaned? d) If Google is OK with this maybe we should submit the sitemap.xmls and keep an eye on the pages, maybe they will rank and bring us a bit of traffic, but we don't want to do that if it could be an issue with Google. Thanks for your opinions and if you have any hard evidence either way especially thanks for that info. 😉
Intermediate & Advanced SEO | | irvingw0 -
How to resolve Duplicate Page Content issue for root domain & index.html?
SEOMoz returns a Duplicate Page Content error for a website's index page, with both domain.com and domain.com/index.html isted seperately. We had a rewrite in the htacess file, but for some reason this has not had an impact and we have since removed it. What's the best way (in an HTML website) to ensure all index.html links are automatically redirected to the root domain and these aren't seen as two separate pages?
Intermediate & Advanced SEO | | ContentWriterMicky0 -
Should you stop indexing of short lived pages?
In my site there will be a lot of pages that have a short life span of about a week as they are items on sale, should I nofollow the links meaning the site has a fwe hundred pages or allow indexing and have thousands but then have lots of links to pages that do not exist. I would of course if allowing indexing make sure the page links does not error and sends them to a similarly relevant page but which is best for me with the SEarch Engines? I would like to have the option of loads of links with pages of loads of content but not if it is detrimental Thanks
Intermediate & Advanced SEO | | barney30120