Googlebot indexing URLs with ? queries in them. Is this Panda duplicate content?
-
I feel like I'm being damaged by Panda because of duplicate content, as I have seen Googlebot on my site indexing hundreds of URLs with ?fsdgsgs strings after the .html. They were being generated by an add-on filtering module on my store, which I have since turned off. Googlebot is still indexing them hours later. I'm at a loss as to what to do. Since Panda, I have lost a couple of dozen #1 rankings that I'd held for months on end, and had one drop over 100 positions.
-
Thanks for all that. Really valuable information. I went to Parameter Handling and there were 54 parameters listed, generating over 20 million unnecessary URLs in total. I nearly died when I saw it. We have 6,000 genuine pages and 20 million shitty ones that don't need to be indexed. Thankfully, I'm upgrading next week. I have turned the feature off on the current site, and the new one won't have it at all. Phew.
I have changed the settings for the parameters that were already listed in Webmaster Tools, and now I wait for the biggest re-index in history LOL!
I have submitted a sitemap now and, as I rewrite page titles and meta descriptions, I'm using the Fetch as Google tool to ask for resubmission. It's been a really valuable lesson, and I'm just thankful that I wasn't hit worse than I was. Now, it's a waiting game.
Of the 6,000 URLs in the sitemap I submitted a couple of days ago, around a third have been indexed. When I first uploaded it, only 126 of them were.
-
The guys here are all correct - you can handle these in WMT with parameter handling, but as every piece of text about parameter handling states, handle with care. You can end up messing things up big-time if you block areas of the site you do want crawled.
You'll also have to wait days, or longer, for Google to acknowledge the changes and reflect them in its index and in WMT.
If it's an option, look at using a self-referencing canonical tag: if the CMS serves the same page on several different URLs, every version will then point back to the original URL.
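To make the canonical idea concrete, here is a minimal Python sketch of how a template helper might build the tag. It assumes that every query string on the store is disposable (true for the filter module described above, but not for every site), and the function names and example.com URL are made up for illustration.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the URL with its query string and fragment removed.

    Assumption: no query parameter on this site changes the page content.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


# Hypothetical example URL, modelled on the ?fsdgsgs strings in the question.
print(canonical_link_tag("https://www.example.com/widgets.html?fsdgsgs"))
# -> <link rel="canonical" href="https://www.example.com/widgets.html">
```

Every parameterised copy of widgets.html would then carry the same tag as the clean page, which is the "all point back to the original URL" behaviour described above.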
-
"They were being generated by an add-on filtering module on my store, which I have since turned off. Googlebot is still indexing them hours later."
Google will continue to index them until you specifically tell it not to. Go to GWT and resubmit a sitemap containing only the URLs you want indexed. Additionally, do a "Fetch as Google" on the same pages as your sitemap. This can help to speed up the reindexing process.
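If you generate the sitemap yourself, a rough Python sketch like the one below would keep the parameterised URLs out of it. It assumes you already have a list of candidate URLs from somewhere (a crawl export, the store database) and that any URL with a query string is unwanted; `build_sitemap` and the example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML listing only URLs that carry no query string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        if urlsplit(url).query:
            continue  # assumption: parameterised URLs are duplicates
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical input: one genuine page and one filter-module duplicate.
xml_text = build_sitemap([
    "https://www.example.com/widgets.html",
    "https://www.example.com/widgets.html?fsdgsgs",
])
print(xml_text)
```

Only the clean widgets.html entry ends up in the output, so the file you resubmit lists just the pages you actually want indexed.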
Also, hours? LMAO, it will take longer than that. Unless you are a huge site that gets crawled hourly, it can take days, if not weeks, for those URLs to disappear. I'm thinking longer, since it doesn't sound like you have redirected those links, just turned off the plugin that created them. Depending on how your store is set up and how many pages you have, it may be wise to 301 all the offending pages to their proper destination URLs.
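As an illustration of the 301 idea only, here is a small Python WSGI middleware that redirects any request carrying a query string to the same path without it. Your store almost certainly isn't a Python WSGI app, so treat this as a sketch of the rule rather than something to paste in; the same logic usually lives in the web server's rewrite configuration. It also assumes no page on the site legitimately needs a query string.

```python
def redirect_parameterised(app):
    """Wrap a WSGI app so URLs with a query string 301 to the bare path."""

    def middleware(environ, start_response):
        if environ.get("QUERY_STRING"):
            # Assumption: every query string is an unwanted duplicate.
            location = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "/")
            start_response("301 Moved Permanently", [("Location", location)])
            return [b""]
        # No query string: hand the request to the real application.
        return app(environ, start_response)

    return middleware
```

A request for /widgets.html?fsdgsgs would get a 301 pointing at /widgets.html, while /widgets.html itself is served as normal.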
-
Check out parameter exclusion options in Webmaster Tools. You can tell the search engines to ignore these appended parameters.
-
Use a spidering tool such as Screaming Frog to check all of the links on your site.
Also check that your XML and HTML sitemaps don't contain old links.
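For the XML sitemap, a quick Python check along these lines could flag leftover parameterised links. It assumes a standard sitemaps.org `<urlset>` file and treats any URL with a query string as an "old link"; the function name and sample data are made up.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parameterised_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> in the sitemap whose URL has a query string."""
    root = ET.fromstring(sitemap_xml)
    locs = (loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text)
    return [url for url in locs if urlsplit(url).query]


# Hypothetical sitemap with one clean entry and one stale filter URL.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/widgets.html</loc></url>
  <url><loc>https://www.example.com/widgets.html?fsdgsgs</loc></url>
</urlset>"""

print(parameterised_urls(sample))
# -> ['https://www.example.com/widgets.html?fsdgsgs']
```

An empty list means the sitemap is clean by this (fairly crude) test.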
Hope this helps