URL Parameters
-
Hi there, I have a magento sort by feature which has indexed loads of pages in Google with urls that have /shopby/ in them.Over 8k pages have been indexed like this. I cannot edit the robots within the page but have now disallowed the urls in robots.txt - i guess this will prevent new ones being indexed but not deindex current ones?
So I looked into URL parameters, I added 'shopby' as a parameter in webmaster tools and told Google not to crawl any urls with this in it, will this deindex the pages already indexed?
The only other way seems to be manually removing 8k urls, which i do not want to do.
Any advice much appreciated. Obviously I do not want these urls indexed as they are weak/duplicate sort by search pages, I fear the panda update would not be too kind on it long term?
-
That would be correct. What you have are "self referencing" canonical tags. That does the exact opposite of what you need it to do. It tells Google all of those pages are valid, where you need it to tell Google all of those pages are just copies of only ONE valid page.
-
Yes.
The idea of having a canonical is to point it to another page, many just don't get this
-
Hi guys. Well the site has been setup so every page has a unique canonical tag, the canonical tag being the url it is on.
I guess I need to find a way in magento to make all /shopby/ urls have the same canonical tag then it will deindex once Google recrawl?
-
Assuming you have your canonicals done correctly, the pages will disappear in time.
the pages you wont to de-index, should have a canonical tag that points to the original.
-
Hi there, the canonical tags are there but the pages are still indexed.
No links point to these pages, they are just sort by urls being generated off a widget.
-
I would not de index the page either with robots or WMT.
links in your site that point to any of these pages will now pour their link juice into un indexed pages.
use a canonical tag to fix the problem.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
URL structure for a product that belongs to several categories
Hi, We are setting up the URL structure for a big webshop and this raised the following questions regarding the URL's for products that belong to several categories (most of the products do, so the same approach would be used for products that just belong to one category). There are three options in my point of view: Use root-level product page URLs (limits trackability in Analytics software because you can not specify on product types)
Reporting & Analytics | | Mat_C
URL: example.com/product-1/ Use product type URL directories for all products (which at least offers minimum trackability of all separate product types in Analytics software)
URL: example.com/book/product-1/ Use product URLs built upon category URL structures, but ensure that each product page URL has a single, designated canonical URL.
URL: example.com/category-A/product-1/ with canonical= example.com/category-A/product-1/ URL: example.com/category-B/product-1/ with canonical= example.com/category-A/product-1/ Which option is the preferred one? Thanks!0 -
Search Console Crawl Errors/Not Found - Strange URLs
Hello, In Google Search Console under Crawl > Crawl Errors > Not found I have strange URLs like the following: https://www.domain.com//UbaOZ/
Reporting & Analytics | | chuck-layton
https://www.domain.com//UPhXZ/
https://www.domain.com//KaUpZ/WYdhZ/SnQZZ/MOcUZ/ There is no info in Linked From tab. Have you seen this type of error??
Does anyone know whats causing it??
How should it be fixed?? Thanks for reading and the help!0 -
Canonical Tags & GWT Parameters
A site I'm working on has canonical tags which I find to be accurate, regardless of tracking parameters or anything else added to the url. The tag looks like: And we have alot of parameters in Google Search Console that look like Parameter Crawl page Let Googlebot Decide destination Let Googlebot Decide filters Let Googlebot Decide Since all of our parameters follow a question mark, like http://www.examplesite.com/questions/avocados?source=ad12345 and all of our pages have canonical tags showing the representative url without the additional parameters, why wouldn't we just have the one parameter in GWT as Parameter Crawl ? Representative URL I ask because I find that Google analytics shows pages with parameters as landing pages in search, which has me concerned about Google seeing it as duplicate content. Thanks! Best... Darcy
Reporting & Analytics | | 945010 -
Regex Filter To Exlude lower case urls
Buon Pormeriggio from Wetherby 22 degrees the summer continues! I need to set up a regex filter to knock out lowecase versions of http://www.sandersonweatherall.co.uk/Sales/ Thing is Analytics is returning this lowercase version which i want to regex filter out.So if Regex filter /Sales/$ returns what i eant how do i knock out urls beginning with lowe case s. Grazie,
Reporting & Analytics | | Nightwing
David0 -
Google News traffic spike mystery; referring URLs all blank, Omniture tags didn't fire.
Our content is occasionally featured in Google News. We recently have had two episodes where this happened, but (a) nearly all the referring URLs were blank, and (b) our backend logs show 3-4x more requests for the article in question than Omniture does. In other words, hundreds of thousands of visitors requested a URL from our site (as proven by the traffic logs), but don't seem to have come from Google News (because HTTP_REFERER was blank), and didn't execute the onpage javascript tag to notify Omniture of the pageview. Perhaps this has nothing to do with Google News, but it is too strong a coincidence that the two times we were on there recently, the same thing happened: big backend traffic spike that is not seen by Omniture. It is as if Google News causes browsers to pre-fetch our article without executing the javascript on the page. And without sending a referring URL. Has anyone else seen anything like this before? Stats from the recent episode:
Reporting & Analytics | | mcglynn
- 835,000 HTTP requests for the article URL (logged by our servers) - these requests came from 280,000 distinct IP addresses (70% US) - the #1 referring URL is blank. This accounts for 99.4% of requests. Which, in itself, is hard to believe. These people had to come from somewhere. I believe browsers don't pass HTTP_REFERER when you click from an SSL page to a non-SSL page, but I think Google News doesn't bounce users to SSL by default.That said, we do see other content pages with 70-90% blank referring URLs. Rarely 99+% though.0 -
My first campaign identidied long URLs
Hello! 🙂 I've just created my first campaign, and the crawling proccess have detected posts with long URL (more than 70 characters). If I change it, i.e., alter the URL's, can some problem happens to my blog? Or do I have to disconsider this problem and just "work correctly" from now on? Thanks in advance for your help!
Reporting & Analytics | | Andarilho0 -
Count of words in all urls in a subdomian
Hi, I am trying to understand in a simple way how much content (words per url) are included in all urls under a subdomain. Is there a way to get this information from any of the tools ? thanks!
Reporting & Analytics | | picolo0 -
Duplicate content? Split URLs? I don't know what to call this but it's seriously messing up my Google Analytics reports
Hi Friends, This issue is crimping my analytics efforts and I really need some help. I just don't trust the analytics data at this point. I don't know if my problem should be called duplicate content or what, but the SEOmoz crawler shows the following URLS (below) on my nonprofit's website. These are all versions of our main landing pages, and all google analytics data is getting split between them. For instance, I'll get stats for the /camp page and different stats for the /camp/ page. In order to make my report I need to consolidate the 2 sets of stats and re-do all the calculations. My CMS is looking into the issue and has supposedly set up redirects to the pages w/out the trailing slash, but they said that setting up the "ref canonical" is not relevant to our situation. If anyone has insights or suggestions I would be grateful to hear them. I'm at my wit's end (and it was a short journey from my wit's beginning ...) Thanks. URL www.enf.org/camp www.enf.org/camp/ www.enf.org/foundation www.enf.org/foundation/ www.enf.org/Garden www.enf.org/garden www.enf.org/Hante_Adventures www.enf.org/hante_adventures www.enf.org/hante_adventures/ www.enf.org/oases www.enf.org/oases/ www.enf.org/outdoor_academy www.enf.org/outdoor_academy/
Reporting & Analytics | | DMoff0