I have more pages in my sitemap blocked by the robots.txt file than allowed to be crawled. Is Google going to hate me for this?
-
I'm using some rules to block all pages whose URLs start with "copy-of" on my website, because people have a bad habit of duplicating new product listings to create our refurbished, surplus, etc. listings for those products. To keep Google from seeing these as duplicate pages I've blocked them in the robots.txt file, but of course they are still automatically generated in our sitemap. How bad is this?
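To make the situation concrete, here is a minimal sketch of how one might check which sitemap URLs are disallowed by a robots.txt rule, using Python's standard `urllib.robotparser`. The rule, the `/products/` path, the `example.com` URLs, and the helper name `split_by_robots` are all assumptions for illustration, not the actual site's configuration. Note that this parser does plain prefix matching and does not understand Google's `*` wildcard extensions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real file's rules may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /products/copy-of
"""

# Hypothetical URLs as they might appear in an auto-generated sitemap.
SITEMAP_URLS = [
    "https://example.com/products/widget",
    "https://example.com/products/copy-of-widget",
    "https://example.com/products/copy-of-gadget-refurbished",
]


def split_by_robots(robots_txt, urls, user_agent="Googlebot"):
    """Return (allowed, blocked) lists of URLs for the given user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed, blocked = [], []
    for url in urls:
        if parser.can_fetch(user_agent, url):
            allowed.append(url)
        else:
            blocked.append(url)
    return allowed, blocked


allowed, blocked = split_by_robots(ROBOTS_TXT, SITEMAP_URLS)
print("crawlable:", allowed)
print("blocked by robots.txt:", blocked)
```

A check like this could be used to list the sitemap entries that conflict with robots.txt, so they can be dropped from the sitemap or handled another way.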
-
When you say "people," are you saying your own web team duplicates content to make their job easier? Or am I missing something?
If that's the case, you really should create unique URLs with unique page titles, product info, etc. That's the correct way to avoid getting hit for duplicate content: don't create it. What you're doing now seems like more of a band-aid solution to the problem.
Consider that even though creating unique content in situations like this can seem daunting and/or be more expensive, there are probably huge long-term gains to be made if you do it right.
-
It is not bad, just not best practice, because Google may still index the URLs if they are mentioned on other pages. Just to quote them:
"While Google won't crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web. As a result, the URL of the page and, potentially, other publicly available information..."
What I would do instead is either use rel="canonical" or 301 redirects. I hope that helps.
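As a rough sketch of what those two options could look like, the snippet below maps a duplicate listing to its original and produces either a canonical link tag or a 301 redirect target. The mapping `CANONICAL_FOR`, the paths, the `example.com` base URL, and both helper names are hypothetical; how duplicates map to originals depends on the store. Keep in mind that a crawler has to be able to fetch a page to see its canonical tag or redirect, so the robots.txt block would likely need to be lifted for those URLs.

```python
from html import escape

# Hypothetical mapping from duplicate listing paths to the original listing.
CANONICAL_FOR = {
    "/products/copy-of-widget": "/products/widget",
}


def canonical_link_tag(path, base="https://example.com"):
    """Build the <link rel="canonical"> tag for a page's <head>.

    Pages without a mapping point at themselves.
    """
    target = CANONICAL_FOR.get(path, path)
    href = escape(base + target, quote=True)
    return f'<link rel="canonical" href="{href}">'


def redirect_for(path):
    """Return (301, target_path) for a duplicate, or None if no redirect applies."""
    target = CANONICAL_FOR.get(path)
    return (301, target) if target else None


print(canonical_link_tag("/products/copy-of-widget"))
print(redirect_for("/products/copy-of-widget"))
print(redirect_for("/products/widget"))
```

A canonical tag keeps the duplicate page available to visitors, while a 301 sends both visitors and crawlers to the original, so the choice depends on whether the refurbished or surplus listing needs to stay live.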