Removing duplicate &var=1 etc. parameter URLs from Google
-
Hi, I had a huge drop in traffic around the 11th of July: over 50% down with no recovery as yet, from ~5,000 organic visits per day to barely over 2,500.
I fixed up a problem that one script was introducing that had caused high bounce rates.
Now I have identified that Google has indexed the entire news section 4 times: same content but with var=0, var=1, 2, 3, etc., around 40,000 URLs in total.
Now this would have to be causing problems.
I have fixed the problem and those URLs 404 now; no need for 301s as they are not linked to from anywhere.
How can I get them out of the index? I can't do it one by one with the URL removal request, and I can't remove a directory with the URL removal tool as the regular content is still there.
If I block those URLs in robots.txt, won't Google never try to crawl them again and thus never discover they are 404ing?
These URLs are no longer linked to from anywhere, so how can Google ever reach them by crawling to find them 404ing?
-
yes
-
Hi, thanks. So if it can't find a page and finds no more links to that page, does that mean it should drop out of the index within a month?
-
The definition of a 404 page is a page which cannot be found. So in that sense, no, Google can't find the page.
Google's crawlers follow links. If there is not a link to the page, then there is no issue. If Google locates a link, they will attempt to follow that link.
-
Hi, thanks. So if a page is 404ing but not linked to from anywhere, Google will still find it?
-
Hi Adam.
The preferred method to handle this issue would have been to only offer one version of the URL. Once you realized the other versions were active, you had a couple of options to deal with the problem:
Use a 301 to redirect all the versions of the page to the main URL. This method would have allowed your existing Google links to work. Users would still find the correct page. Google would have noticed the 301 and adjusted their links.
Another option to consider IF the pages were helpful would be to keep them and use the canonical tag to indicate the URL of the primary page. This method would offer the same advantages mentioned above.
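As a rough sketch of both options above, here is how the mapping from a duplicate URL to its main URL might look. It assumes the duplicate parameter is literally named var (as in the question); example.com and the function names are hypothetical, and a real site would wire this into its own server or framework rather than use it as-is.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: the duplicates differ only by a query parameter named "var".
DUPLICATE_PARAM = "var"


def canonical_url(url: str) -> str:
    """Return the URL with the duplicate-producing parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != DUPLICATE_PARAM
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


def redirect_for(url: str):
    """Option 1: return (301, target) for a duplicate URL, or None for the main URL."""
    target = canonical_url(url)
    if target != url:
        return 301, target
    return None


def canonical_link_tag(primary_url: str) -> str:
    """Option 2: build the canonical tag to place in the <head> of each duplicate."""
    return f'<link rel="canonical" href="{escape(primary_url, quote=True)}">'


if __name__ == "__main__":
    duplicate = "http://example.com/news/story?id=7&var=1"  # hypothetical URL
    print(redirect_for(duplicate))
    print(canonical_link_tag(canonical_url(duplicate)))
```

Either way, every var=N version points at one main URL, so the existing search results keep working while Google consolidates them.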
By removing the pages and allowing them to 404, everyone loses for the next month. Users who click on a search result will be taken to a 404 page rather than finding the content they seek. Google won't be offering the search results users are seeking. You will experience a high bounce rate as many users do not like 404 pages, and it will take a month for an average site to be fully crawled and the issue corrected.
If you block the pages in robots.txt, then Google won't attempt to crawl the links, so it will never see that they now return a 404. In general, your robots.txt should not be used in this manner.
My recommendation is to fix this issue with the proper 301s. If that is not an option, be sure your 404 page is helpful and as user friendly as possible. Include a site search option along with your main navigation. Google will crawl a small percentage of your site each day. You will notice the number of 404 links diminish over time.