URL Parameters
-
Hi Moz Community,
I'm working on a website that has URL parameters. After crawling the site, I added canonical tags to all of these URLs to prevent them from being indexed by Google. However, today I found out that Google has indexed plenty of these URL parameters.
1- Some of these URLs have canonical tags, yet they are still indexed and live.
2- Some can't be discovered through site crawling, and they return a 5xx server error.
Is there anything else I can do (other than adding canonical tags), and how can I discover URL parameters that are indexed but not visible through site crawling?
Thanks in advance!
-
I'm also facing the same problem with my website pages. My Blackpods pro website pages don't show the exact permalink urls.
-
Hi there,
Thanks very much for your response. I checked the sitemap and there are no URL parameters listed; only the canonical URLs are listed in the sitemap.
If you have any other suggestions, they'd be much appreciated.
Thank you!
-
Hi Rajesh,
Thank you for your response. I cannot share the website due to client confidentiality, but basically, when I search for a stockist of {brand name}, Google lists URLs similar to the ones below on the first page. The pages show a list of stockists depending on product availability:
1-website.com/find-stockist?model=10 (5xx status code)
2-website.com/find-stockist?model=11 (200 status code)
3-website.com/find-stockist?model=10 (5xx status code)
4-website.com/find-stockist?model=11 (200 status code)
Thank you!
-
Hi Gaston,
Thanks very much for your time. The canonicals were implemented around a month ago and the pages are almost identical. I discovered all the URL parameters without performing an advanced search.
Also, I came across the 5xx errors when I clicked the indexed URL parameters on the Google SERP, and I cannot discover them when I crawl the site with Screaming Frog.
I'd appreciate any other suggestions you have based on your experience!
Many thanks
-
Just so you know, if a URL returns a 5XX server error, it usually won't render your canonical tag at all! You might want to check your sitemap XML to make sure it isn't 'undoing' your canonical tags by feeding these URLs to Google. Indexation tags must be perfectly aligned with your sitemap XML, or you are sending Google mixed messages (e.g. a URL is in the sitemap XML, so Google should index it, but when it is crawled it contains a canonical tag citing itself as non-canonical, which is the opposite signal).
Everything Gaston said is right on the money.
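If it helps, here is a minimal sketch of that sitemap check in Python. It assumes a standard sitemaps.org `<urlset>` file; the function name `urls_with_parameters`, the `https://` scheme, and the sample sitemap are made up for illustration, reusing the placeholder domain from this thread.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_with_parameters(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL in the sitemap that carries a query string."""
    root = ET.fromstring(sitemap_xml)
    locs = [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in locs if urlsplit(url).query]


# Hypothetical sitemap content, for illustration only.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://website.com/find-stockist</loc></url>
  <url><loc>https://website.com/find-stockist?model=11</loc></url>
</urlset>"""

print(urls_with_parameters(EXAMPLE_SITEMAP))
# → ['https://website.com/find-stockist?model=11']
```

Any URL this prints is being submitted for indexing while its canonical tag points elsewhere, which is the mixed signal described above.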
-
I think you need to show some examples.
-
Hi there,
It's important to note that canonicals are a signal, not a directive. Google follows them only if its algorithm considers that the pages really are canonical versions of each other.
In my experience, this does not happen immediately; it usually takes Google some time to figure out whether the canonicalization is correct. Keep in mind that pages being canonicalized HAVE TO be nearly identical and refer to the same topic.
As for indexation, a page can be indexed and still show up only when you search for that specific URL or use an advanced search operator (such as site:).
More information about canonicals:
- Consolidate duplicate URLs - Google Search support
Regarding the second issue, if by "site crawling" you mean crawling with an external tool such as Screaming Frog or Moz, you are getting 5xx errors because that tool is making too many requests. Try lowering its crawl rate; I know for a fact that Screaming Frog allows you to do that.
Unfortunately, I don't know of any way to discover URL parameters in bulk other than using an external tool.
Hope it helps,
Best of luck.
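For anyone who wants to spot-check the indexed URLs by hand rather than with a crawler, below is a rough Python sketch, not a replacement for a proper crawl. It requests each URL slowly, so the server is less likely to be overloaded, and records the status code and the canonical tag, if the page renders one. The names `find_canonical` and `check_urls` and the two-second delay are my own assumptions; only the standard library is used.

```python
import time
import urllib.error
import urllib.request
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def check_urls(urls, delay_seconds=2.0):
    """Fetch each URL with a pause in between; return (url, status, canonical)."""
    results = []
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=30) as response:
                body = response.read().decode("utf-8", errors="replace")
                results.append((url, response.status, find_canonical(body)))
        except urllib.error.HTTPError as error:
            # 4xx/5xx responses land here; the error page is not parsed.
            results.append((url, error.code, None))
        except urllib.error.URLError:
            # DNS failure, refused connection, timeout and similar.
            results.append((url, None, None))
        time.sleep(delay_seconds)  # throttle so the server is not hammered
    return results
```

Calling `check_urls` with the parameter URLs copied from the SERP would show which ones answer 200 with a canonical and which ones answer 5xx, where no canonical can be read at all.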
Related Questions
-
How do I treat URLs with bookmarks when migrating a site?
I'm migrating an old website into a new one, and have several pages that have bookmarks on them. Do I need to redirect those, or how should they be treated? For example, both https://www.tnscanada.ca/our-expertise.html and https://www.tnscanada.ca/our-expertise.html#auto resolve.
Intermediate & Advanced SEO | NatalieB_Kantar
-
What would cause these ⠃︲蝞韤諫䴴SPপ� emblems in my urls?
In Search Console I am getting errors under other. It is showing urls that have this format- https://www.site.com/Item/654321~SURE⠃︲蝞韤諫䴴SPপ�.htm When clicked it shows 蝞韤諫䴴SPপ� instead of the % stuff. As you can see this is an item page and the normal item page pulls up fine with no issues. This doesn't show it is linked from anywhere. Why would google pull this url? It doesn't exist on the site anywhere. It is a custom asp.net site. This started happening in mid May but we didn't make any changes then.
Intermediate & Advanced SEO | EcommerceSite
-
URL Parameter Being Improperly Crawled & Indexed by Google
Hi All, We just discovered that Google is indexing a subset of our URLs embedded with our analytics tracking parameter. For the search “dresses” we are appearing in position 11 (page 2, rank 1) with the following URL: www.anthropologie.com/anthro/category/dresses/clothes-dresses.jsp?cm_mmc=Email--Anthro_12--070612_Dress_Anthro-_-shop
You’ll note that “cm_mmc=Email” is appended. This is causing our analytics (CoreMetrics) to mis-attribute this traffic and revenue to Email vs. SEO. A few questions:
1) Why is this happening? This is an email from June 2012 and we don’t have an email specific landing page embedded with this parameter. Somehow Google found and indexed this page with these tracking parameters. Has anyone else seen something similar happening?
2) What is the recommended method of “politely” telling Google to index the version without the tracking parameters? Some thoughts on this:
a. Implement a self-referencing canonical on the page.
- This is done, but we have some technical issues with the canonical due to our ecommerce platform (ATG). Even though page source code looks correct, Googlebot is seeing the canonical with a JSession ID.
b. Resubmit both URLs in WMT Fetch feature hoping that Google recognizes the canonical.
- We did this, but given the canonical issue it won’t be effective until we can fix it.
c. URL handling change in WMT
- We made this change, but it didn’t seem to fix the problem
d. 301 or No Index the version with the email tracking parameters
- This seems drastic and I’m concerned that we’d lose ranking on this very strategic keyword
Thoughts? Thanks in advance, Kevin
Intermediate & Advanced SEO | kevin_reyes
-
Google tagged URL an overly-dynamic URL?
I'm reviewing my campaign, and spotted the overly-dynamic URL box showing a few links. Reviewing it, they are my Google Tagged URLs (utm_source, utm_medium, utm_campaign, etc.). I've turned some internal links into Google Tagged URLs, but should these cause concern?
Intermediate & Advanced SEO | Bio-RadAbs
-
Changing a url from .html to .com
Hello, I have a client that has a site with a .html plugin, and I have read that it's best not to have this. We currently have pages ranking with this .html plugin. However, if we take the plugin out, will we lose rankings? Would we need a 301 or something?
Intermediate & Advanced SEO | SEODinosaur
-
How important is it to clarify URL parameters?
We have a long list of URL parameters in our Google Webmasters account. Currently, the majority are set to 'let googlebot decide.' How important is it to specify exactly what googlebot should do? Would you leave these to 'let googlebot decide' or would you specify how googlebot should treat each parameter?
Intermediate & Advanced SEO | nicole.healthline
-
Changing URLS - wondering about implications
We are in the process of changing our URLs from dynamic to more SEO friendly. The website is ciee.org and I'm specifically talking about ciee.org/study. While we work with the business to get approval for ciee.org/study-abroad, we are going with ciee.org/study/abroad. Can anyone foresee any difficulties or negative implications that could come if we change from study/abroad to study-abroad all within 6 months? Thank you in advance!!
Intermediate & Advanced SEO | CIEEwebTeam
-
Htaccess Redirect with %C2%A0 in URL
Below is my setup for redirects in the .htaccess file in my root WordPress installation. The www to non-www redirect works well, so no problems there. Other page redirects work well, too (example: redirect 301 /some-page/ http://mysite.com/another-page/). I didn't post those because I have a few too many : ) So here it goes...
```
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www.mysite.com$ [NC]
RewriteRule ^(.*)$ http://mysite.com/$1 [R=301,L]

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress

redirect 301 /archives/10-college- majors/ http://mysite.com/archives/10-college-majors/
redirect 301 /archives/10-college-%20majors/ http://mysite.com/archives/10-college-majors/
redirect 301 /archives/10-college-%C2%A0majors/ http://mysite.com/archives/10-college-majors/
```
I'm having a problem with the last 301 redirect: redirect 301 /archives/10-college-%C2%A0majors/ http://mysite.com/archives/10-college-majors/ is not working. As you can see, I've tried other variations of the "space", but no go. I also used a redirect in cPanel's Redirect screen, testing all the possible options plus the wildcard.
I've also tried this: http://serverfault.com/questions/201829/using-special-characters-in-apache-mod-rewrite-rule (perhaps unsuccessfully, because it caused a 500 server error and it's a different situation in my case). I also saw something here: http://www.webmasterworld.com/apache/3908682.htm but I don't know if it works or how I would implement it without compromising ALL other redirects.
Note: the URL displays with a space in the address bar of all major web browsers: http://mysite.com/10-college- majors/ and goes to a 404 page.
I have a gorgeous page / PR6 / high authority site linking to the URL on my site, but they copied the URL with a space somehow. I contacted the person responsible for the website and he claims it works fine (aka he didn't check it). Is there a clean way to redirect ONLY this problematic URL without compromising other redirects, etc.? Any ideas would be great. I'll respond with progress. Thanks in advance.
UPDATE: the redirect works, and it did work. Even so, when looking at the source of the page linking to mine, the URL looks like this:
```
http://mysite.com/archives/10-college- majors/
```
Clicking the URL in Source View in Firefox takes me to
```
http://mysite.com/archives/10-college-%C2%A0majors/
```
None of my 301 redirects should direct there. I don't have any redirect plugins either.
Intermediate & Advanced SEO | pepsimoz