Googlebot indexing URLs with ? queries in them. Is this Panda duplicate content?
-
I feel like I'm being damaged by Panda because of duplicate content, as I have seen Googlebot on my site indexing hundreds of URLs with strings like ?fsdgsgs appended after the .html. They were being generated by an add-on filtering module on my store, which I have since turned off. Googlebot is still indexing them hours later. I'm at a loss what to do. Since Panda, I have lost a couple of dozen #1 rankings that I'd held for months on end, and had one drop over 100 positions.
-
Thanks for all that. Really valuable information. I went to parameter handling and there were 54 parameters listed, in total generating over 20 million unnecessary URLs. I nearly died when I saw it. We have 6,000 genuine pages and 20 million shitty ones that don't need to be indexed. Thankfully, I'm upgrading next week; I have turned the feature off on the current site, and the new one won't have it at all. Phew.
I have changed the settings for the parameters that were already listed in Webmaster Tools, and now I wait for the biggest re-index in history, LOL!
I have now submitted a sitemap, and as I rewrite page titles and meta descriptions, I'm using the Fetch as Google tool to request re-indexing. It's been a really valuable lesson, and I'm just thankful I wasn't hit worse than I was. Now it's a waiting game.
Of the 6,000 URLs on the sitemap I submitted a couple of days ago, around a third have been indexed. When I first uploaded it, only 126 of them were.
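For reference, the sitemap I'm submitting lists only the clean URLs, none of the parameterized variants — something like this (domain and paths changed for the example):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Only the genuine pages go in here; no ?fsdgsgs-style variants -->
  <url>
    <loc>http://www.example.com/blue-widgets.html</loc>
    <lastmod>2012-05-01</lastmod>
  </url>
  <url>
    <loc>http://www.example.com/green-widgets.html</loc>
  </url>
</urlset>
```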
-
The guys here are all correct: you can handle these in WMT with parameter handling, but as every guide to parameter handling warns, handle with care. You can end up messing things up big-time if you block areas of the site you do want crawled.
You'll also have to wait days, or longer, for Google to acknowledge the changes and reflect them in its index and in WMT.
If it's an option, look at using a self-referencing canonical tag: that way, if the CMS serves the same page on multiple URLs, every copy points back to the original URL.
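For example, a product page would carry a canonical in its head pointing at its own clean URL, so any parameterized copy Google crawls declares the original (the URL here is a placeholder):

```html
<!-- Served in the <head> of every version of the page, with or without
     query parameters; href always names the clean, canonical URL -->
<link rel="canonical" href="http://www.example.com/blue-widgets.html" />
```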
-
"They were beign generated by an add-on filtering module on my store, which I have since turned off. Googlebot is still indexing them hours later."
Google will continue to index them until you tell them specifically not to. Go to GWT and resubmit a sitemap containing only the URLs you want them to index. Additionally, do a "Fetch as Google" on the same pages as your sitemap. This can help speed up the re-index process.
Also, hours? LMAO, it will take longer than that. Unless you are a huge site that gets crawled hourly, it can take days, if not weeks, for those URLs to disappear. I'm thinking longer, since it doesn't sound like you have redirected those links, just turned off the plugin that created them. Depending on how your store is set up, and how many pages you have, it may be wise to 301 all the offending pages to their proper destination URLs.
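If the store runs on Apache, a blanket rule along these lines can 301 any .html request carrying a query string back to the bare URL — a rough sketch only, and it assumes no legitimate page on the site relies on query parameters:

```apache
# .htaccess sketch: 301 any *.html request with a query string back to
# the bare URL. The trailing "?" on the target strips the query string.
# Warning: this catches ALL parameterized .html URLs, so confirm nothing
# legitimate (search, pagination, tracking) depends on them first.
RewriteEngine On
RewriteCond %{QUERY_STRING} .
RewriteRule ^(.*\.html)$ /$1? [R=301,L]
```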
-
Check out parameter exclusion options in Webmaster Tools. You can tell the search engines to ignore these appended parameters.
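A blunter alternative, if parameter handling feels too fiddly, is a wildcard rule in robots.txt — though keep in mind this only stops crawling, and URLs already in the index won't drop out just because they're blocked:

```
# robots.txt sketch: stop crawling of any URL containing a query string.
# Already-indexed URLs stay indexed, and Google can't see a canonical
# tag on a page it is forbidden to crawl, so use with care.
User-agent: *
Disallow: /*?
```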
-
Use a spidering tool, such as Screaming Frog, to check all of the links on your site.
Also check that your XML and HTML sitemaps don't contain old links.
Hope this helps