Googlebot indexing URLs with ? queries in them. Is this Panda duplicate content?
-
I feel like I'm being damaged by Panda because of duplicate content: I have seen Googlebot on my site indexing hundreds of URLs with ?fsdgsgs strings after the .html. They were being generated by an add-on filtering module on my store, which I have since turned off. Googlebot is still indexing them hours later, and I'm at a loss what to do. Since Panda, I have lost a couple of dozen #1 rankings that I'd held for months on end, and had one drop over 100 positions.
-
Thanks for all that. Really valuable information. I have gone to Parameter Handling and there were 54 parameters listed. In total, they were generating over 20 million unnecessary URLs. I nearly died when I saw it. We have 6,000 genuine pages and 20 million shitty ones that don't need to be indexed. Thankfully, I'm upgrading next week and I have turned the feature off on the current site; the new one won't have that feature. Phew.
I have changed the settings for the parameters that were already listed in Webmaster Tools, and now I wait for the biggest re-index in history LOL!
I have submitted a sitemap now and, as I rewrite page titles and meta descriptions, I'm using the Fetch as Google tool to ask for resubmission. It's been a really valuable lesson, and I'm just thankful that I wasn't hit worse than I was. Now, it's a waiting game.
Of the 6,000 URLs in the sitemap I submitted a couple of days ago, around a third have been indexed. When I first uploaded it, only 126 of them were.
-
The guys here are all correct - you can handle these in WMT with parameter handling, but as every piece of text about parameter handling states, handle with care. You can end up messing things up big-time if you block areas of the site you do want crawled.
You'll also have to wait days or longer for Google to acknowledge the changes and reflect them in its index and in WMT.
If it's an option, look at adding a self-referencing canonical tag to each page: if the CMS then serves the same page under multiple URLs, every variant will point back to the original URL.
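As a rough sketch of what that canonical tag would contain, here is one way to derive the original URL from a filtered variant. It assumes, as in the store described above, that nothing in the query string selects different content; the function names and the example.com URL are made up for illustration.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Drop the query string and fragment so every filtered variant maps to one URL.

    Assumption: no query parameter on this site changes the page content.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link> element to place in the <head> of every variant of the page."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


print(canonical_link_tag("http://www.example.com/widgets.html?fsdgsgs"))
```

If some parameters do matter on your site (pagination, for example), you would keep those and strip only the junk ones.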
-
"They were beign generated by an add-on filtering module on my store, which I have since turned off. Googlebot is still indexing them hours later."
Google will continue to index them until you specifically tell it not to. Go to GWT and resubmit a sitemap containing only the URLs you want indexed. Additionally, do a "Fetch as Google" on the same pages as your sitemap. This can help to speed up the reindexing process.
Also, hours? LMAO it will take longer than that. Unless you are a huge site that gets crawled hourly, it can take days, if not weeks, for those URLs to disappear. I'm thinking longer, since it does not sound like you have redirected those links, just turned off the plugin that was used to create them. Depending on how your store is set up, and how many pages you have, it may be wise to 301 all the offending pages to their proper destination URL.
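To illustrate the 301 idea, here is a minimal sketch written as generic WSGI middleware in Python. Your store almost certainly runs on something else, so treat it as the logic only, not a drop-in fix. It assumes the app is mounted at the site root and that no query string is worth keeping; `strip_query_redirect` is a made-up name.

```python
def strip_query_redirect(app):
    """Wrap a WSGI app so any request with a query string is 301'd to the bare path.

    Assumption: every query string on this site is junk left by the filter module.
    """
    def middleware(environ, start_response):
        if environ.get("QUERY_STRING"):
            # Redirect /page.html?fsdgsgs to /page.html
            location = environ.get("PATH_INFO") or "/"
            start_response("301 Moved Permanently", [("Location", location)])
            return [b""]
        # No query string: serve the page as usual
        return app(environ, start_response)
    return middleware
```

A blanket rule like this would also swallow search, pagination, or tracking parameters, so on a real store you would normally limit it to the parameters the old module generated.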
-
Check out parameter exclusion options in Webmaster Tools. You can tell the search engines to ignore these appended parameters.
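Before changing any settings, it can help to know which parameter names actually show up in the URLs being crawled. A small sketch, assuming you have already pulled a list of crawled URLs out of your server logs or a crawl export (the function name and sample URLs are hypothetical):

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def count_parameters(urls):
    """Count how many URLs use each query parameter name."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values=True so bare parameters like "?fsdgsgs" are counted too
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


crawled = [
    "http://www.example.com/widgets.html?color=red&size=1",
    "http://www.example.com/gadgets.html?color=blue",
    "http://www.example.com/about.html",
]
for name, hits in count_parameters(crawled).most_common():
    print(name, hits)
```

The most frequent names are the ones to look at first in the parameter settings.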
-
Use a spidering tool, such as Screaming Frog, to check all of the links on your site.
Also check that your XML and HTML sitemaps don't contain old links.
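For the XML sitemap, a quick check along these lines would flag any entries that still carry a query string. It is only a sketch: it assumes a standard sitemaps.org urlset that you have already read into a string, and the function name is made up.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace used by standard sitemaps.org sitemap files
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parameterised_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> in the sitemap whose URL still has a query string."""
    root = ET.fromstring(sitemap_xml)
    locs = [el.text.strip() for el in root.findall(".//sm:loc", SITEMAP_NS) if el.text]
    return [url for url in locs if urlsplit(url).query]
```

An empty list means the sitemap only lists clean URLs; anything it returns should be removed before you resubmit.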
Hope this helps