How to detect where Google finds the URLs it indexes
-
Google is somehow indexing links that create duplicate content. We don't understand how these links are created, so we would like to detect where Google's robots find them.
We tried:
- Moz Crawl Diagnostics, but it shows 0 as the Internal Link Count for these kinds of links.
- Looking for information in Google Analytics (site content - all content), in case there was a trace from the visitor side. There wasn't.
- Webmaster Tools, under Internal Links and HTML Improvements, but we didn't find any trace.
- Some search commands. Is there maybe a good one to search with?
- Searching for the URLs in source code with https://search.nerdydata.com.
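For searching the site's own HTML, a minimal sketch along these lines might help. It assumes you already have a page's HTML as a string; the function name `find_parameter_links` and the default parameter names are placeholders for illustration, so substitute whatever parameters show up in the indexed URLs.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qs


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_parameter_links(html, parameter_names=("id", "productId")):
    """Return hrefs whose query string uses any of the given parameter names.

    The parameter names are assumptions; use the ones seen in the indexed URLs.
    """
    collector = LinkCollector()
    collector.feed(html)
    hits = []
    for href in collector.links:
        query = parse_qs(urlsplit(href).query, keep_blank_values=True)
        if any(name in query for name in parameter_names):
            hits.append(href)
    return hits
```

Running this over the templates or rendered pages of the site would show which pages, if any, link to the parameter URLs. It only inspects `<a href>` attributes, so links built by scripts or listed in sitemaps would need a separate check.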
-
It really isn't possible for an outsider to know why your website is generating those URLs in error; you would have to talk to your developer about that.
As far as canonicals go, if your problem is that page.com is getting duplicated by added parameters (page.com/?id=1, page.com/?id=2, page.com/?id=3, etc.), then as long as you have the canonical on page.com, all of the parameter pages will have the correct canonical on them as well, because they are rendered from the same page. (But you are right, you should track down the source; your developer will know.)
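To illustrate the idea, here is a small sketch of how the canonical URL for a parameter variant could be derived. It assumes that no query parameter changes the page content, which may not hold for every site; the helper names `canonical_url` and `canonical_link_tag` are made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Drop the query string and fragment so every parameter variant maps to one URL.

    Assumption: none of the parameters select different content.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> tag to place in the page's <head>."""
    return f'<link rel="canonical" href="{canonical_url(url)}" />'
```

With this, `http://page.com/?id=1`, `http://page.com/?id=2`, and `http://page.com/?id=3` all resolve to the same canonical address, which is what the tag on page.com tells Google.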
-
Thank you for your answer, but yes, I know that these are generated by our site. The problem is that I can use a canonical tag for the ones that are indexed right now, but new ones will somehow be created later. The root of the problem isn't that we don't know how to use canonical; it's finding out where Google finds, detects, and indexes these URLs.
These kinds of URLs have been there for months, so we can't just hope that they will somehow be dropped. We need to find a solution and detect the real problem.
-
If you found those URLs by doing a site: search, then those parameters are being generated by your site. (I am surprised that Google is even indexing them; I assume that pretty soon all but one will be dropped.) Here is an article that explains more about those types of duplicate pages: http://moz.com/blog/which-page-is-canonical
You can fix this by using a canonical tag on your homepage with the version that doesn't have the parameter.
-
Our front page has almost 50 duplicate versions. They show up when we do site:oursite.com: there are /et?id=xx, /et?productId=xx, etc., where xx stands for different numbers.
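To see which patterns dominate, the URLs copied from the site: search could be grouped by path and parameter name. This is only a sketch: the function name `parameter_patterns` is hypothetical, and the input list has to be collected by hand.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def parameter_patterns(urls):
    """Count URLs per (path, sorted parameter names), ignoring parameter values."""
    counts = Counter()
    for url in urls:
        parts = urlsplit(url)
        names = tuple(sorted({name for name, _ in parse_qsl(parts.query, keep_blank_values=True)}))
        counts[(parts.path, names)] += 1
    return counts
```

Each key in the result is one pattern, such as `("/et", ("id",))`, so a short list of keys with high counts points to the few templates or features that are producing the duplicates.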
-
Where are you seeing these duplicate content links? Does Webmaster Tools say that they are duplicate content? Or does this show up in your Moz crawl? What do these URLs look like?