Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Removing UpperCase URLs from Indexing
-
This search - site:www.qjamba.com/online-savings/automotix
gives me this result from Google:
Automotix online coupons and shopping - Qjamba
https://www.qjamba.com/online-savings/automotix
Online Coupons and Shopping Savings for Automotix. Coupon codes for online discounts on Vehicles & Parts products.and Google tells me there is another one, which is 'very simliar'. When I click to see it I get:
Automotix online coupons and shopping - Qjamba
https://www.qjamba.com/online-savings/Automotix
Online Coupons and Shopping Savings for Automotix. Coupon codes for online discounts on Vehicles & Parts products.This is because I recently changed my program to redirect all urls with uppercase in them to lower case, as it appears that all lowercase is strongly recommended.
I assume that having 2 indexed urls for the same content dilutes link juice. Can I safely remove all of my UpperCase indexed pages from Google without it affecting the indexing of the lower case urls? And if, so what is the best way -- there are thousands.
-
Hi AMHC,
It makes sense that without hardly any backlinks built up Google wont find my upper case URLS since all the page links have been changed, however, I am writing out all of the urls that are redirected into email, and from that I can tell that Google is finding them--I guess they may have a list of urls from prior indexing that they crawl independent of what their crawler comes up with.
I'll keep looking to see what they have indexed and if it turns out they just aren't crawling certain pages, will put them in a sitemap to be crawled..It's a good idea for taking care of the problem quickly--so if it progresses too slowly I'll do that.
Thanks very much for your answers!
-
Google needs to crawl the bad pages that you 301d. If there are no live links to those pages, then Google can't find them to 301. In short, if you created new lower case URLs, you just increased your duplicate content problem.
To solve this problem, build an HTML sitemap with all of the bad URLs. Have Google fetch and submit the page and all of the pages it links to. Google will crawl all of your old pages and apply the 301s.
-
Thanks AMHC. In my case, I just don't have many back links so I don't have the urgency that you faced with getting Google to see all the redirects. But, I'm still not understanding--it sounds like you believe that once google sees the redirect it removes the old uppercase from its index. It doesn't look to me like that is what happened in my case because Google is currently indexing BOTH, and so that means it has crawled my new lowercase and I know it isn't crawling any uppercase anymore (it cant--all are redirected). So, that's why I wonder if I have to remove those uppercase urls...does that make sense or am I just not understanding it still?
EDIT: I just discovered I wasn't doing a 301 direct so it wasn't considered a permanent move. That, if I understand it right, will remove the upper case from googles index permanently.
-
Canonicals still drain link juice. Canonicals aren't like a 301. The link juice still stays on the canocalized page. All a canonical does is tell Google, in the case of duplicate content, which page is primary. Canonicals handle the duplicate content issue, they do not handle the link juice issue. If I have 2 pages: /product-name/ and /product-name=?khdfpohfo/ that are duplicates, you can via canonical, tell Google to ignore the page with the variable string and rank the page without the variable string. If the page with the variable string has links, the link juice stays on the page.
The HTML Sitemap is there to tell Google about the 301s. the sitemap would look like this:
After you do the 301 redirect, as well as set up parameters in the .htaccess file (I think - not the developer on this), everything should redirect to the lower case URL. The problem is that if you do a 301 redirect for your entire site, Google may not figure it out too quickly. When it crawls your home page downward, it's only going to see the new URLs, and can't crawl the old 301 URLs because there aren't any internal links pointing at them. The only way Google will see the 301 is via an external backlink. The way we solved this was to create an HTML sitemap of all of the old upper case URLs. We then had Google fetch and index/crawl the sitemap. As it crawls the sitemap, where all of the URLs are 301 redirects, it will likewise point all of the Link Juice at the new URLs.
-
I gotcha. Yeah, different thing going on here..these urls can be really difficult! I have uppercase lowercase, https http, urls that have different content(not just formatting) for mobile as desktop and vice versa, mobile urls that dont even exist for desktop, and desktop urls that dont exist for mobile..all under the same domain. 1000s of internal pages....In the desire to create a good website for users I've created an SEO monster because I didn't realize the many consequences with regard to search indexes.
If you know a true expert in these areas I need him/her. 4 years on this site, its live finally (2 months), and now I'm discovering all of these things have to be fixed, but i can't afford thousands of dollars..I'll do the work, I just need the knowledge!
-
I see where you are coming from, and I do not have a good answer then, when I did a lowercase redirect I started by creating the new lowercase pages then setting canonical to them. After a few months I removed the uppercase versions and redirected them to the new lowercase.
-
Hutch, thanks.
The site is dynamic with thousands of pages that are now being redirected to lower case, so I'm not seeing how using canonical would work because the upper case urls aren't on the site anymore. I guess I think of canonical as being useful when you have ongoing content on the site that duplicates one or more other pages on the same site. In my case none of the upper case urls exist anymore so they don't have 'ongoing' content. I'm still new to this so if it sounds like I have it wrong, please correct me.
-
Another quick fix would be to use a canonical tag on all of your pages pointing to the full lowercase versions.
So for the URLs example.com/UPPER; example.com/Upper; and example.com/upper you would place the following into the head so Google knows that these are just variations of the same page, and if will point search to the desired page example.com/upper
-
AMHC, thank you for your response. I'm in the middle of quite a mess, as this is one of several issues, so really appreciate your help. I must confess to not following everything you wrote exactly:
In your situation, I think i understand the redirect -- it is the same reason I am doing a redirect--it is so that anyone coming from to this site with uppercase in it will end up on the lower case page, and in the case of google will then index the page as a lower case page. BTW, for me that has been easy as I am doing it via php -- if the url doesn't equal its strtolower of the url , then I redirect to strtolower.
I think I get what you are saying about the sitemap -- it speeds up google crawling the site and seeing that all those upper cases should be lowercase from your redirect. In my case, i don't have the concern about Google discovering them as you did because my site is only a couple months old. And, I never have given Google a sitemap so many of my pages aren't crawled yet (I am trying to clean up my entire url structure before i submit a sitemap to them--however they have already crawled perhaps 20% of the site, so I'm now trying to examine what google has crawled and how it has been indexed to figure out what needs to be done).
What I'm not understanding is this: It seems to me that what you described should succeed for going forward to getting both Google and your users to the right ending page, but I don't see how it removes the prior uppercase urls from Google's index. What is it that tells Google your prior upper case urls should no longer be in their index? Is it the fact that they aren't in the sitemap you provide now? Or, do they literally have to be removed using some kind of removal or disavow tool? I discovered this (as you see in the op) because Google appears to never have removed the Uppercase ones even though they are indexing the lower case now.
Ted
-
We had the same issue. Boy, was it an education. I had no idea that URLs were case sensitive for Google, and neither did my SEO buddies. I bet if you asked 100 SEOs if URLs were case sensitive for Google, 95 would answer "No". We discovered the problem in GWT and GA when they had different statistics for the mixed case and all lower case versions of the URL. We believed that we had both a duplicate content issue as well as a link juice splitting issue, with backlinks being pointed at both URLs.
We solved the problem by doing a 301 redirect, but as we are an ecommerce site with thousands of products, it was a messy process. We had to redirect pretty much every page on the site since the mixed case categories contaminated subcategories and products.
The 301 went pretty smoothly, and we saw a minor bump up in some of our Rankings. I would strongly suggest that you create an HTML sitemap for every upper case URL that you are going to 301. Here were our thoughts - we could be wrong on this. If we just 301 a page, and don't tell Google, then Google won't know about it unless it tries to crawl the page. We felt like we needed to show Google that all of the pages are being redirected asap. Create an HTML sitemap with all of your upper case URLs. After you do the 301, have Google fetch and index the sitemap page and all of the pages that it links to. Leave the map up for a few days, and then you can take it down. This will expedite moving the link juice to the correct pages as Google will index the 301 for every page in the sitemap.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Changing Url Removes Backlink
Hello MOZ Community, I have question regarding Bad Backlink Removal. My Site's Post's Image got 4 to 5k backlinks from unknown sites and also their is no contact details on their site so that i can contact them to remove. So, I have an idea for which i want suggestion " If I change the url that receieves backlinks" does this will remove backlinks? For Example: https://example.com/test/ got 5k backlinks if I change this url to https://examplee.com/test-failed/ does this will remove those 5k backlinks? If not then How Can I remove those Backlinks? I Know about disavow but this takes time.
Intermediate & Advanced SEO | | Jackson210 -
How can I make a list of all URLs indexed by Google?
I started working for this eCommerce site 2 months ago, and my SEO site audit revealed a massive spider trap. The site should have been 3500-ish pages, but Google has over 30K pages in its index. I'm trying to find a effective way of making a list of all URLs indexed by Google. Anyone? (I basically want to build a sitemap with all the indexed spider trap URLs, then set up 301 on those, then ping Google with the "defective" sitemap so they can see what the site really looks like and remove those URLs, shrinking the site back to around 3500 pages)
Intermediate & Advanced SEO | | Bryggselv.no0 -
Removing duplicate content
Due to URL changes and parameters on our ecommerce sites, we have a massive amount of duplicate pages indexed by google, sometimes up to 5 duplicate pages with different URLs. 1. We've instituted canonical tags site wide. 2. We are using the parameters function in Webmaster Tools. 3. We are using 301 redirects on all of the obsolete URLs 4. I have had many of the pages fetched so that Google can see and index the 301s and canonicals. 5. I created HTML sitemaps with the duplicate URLs, and had Google fetch and index the sitemap so that the dupes would get crawled and deindexed. None of these seems to be terribly effective. Google is indexing pages with parameters in spite of the parameter (clicksource) being called out in GWT. Pages with obsolete URLs are indexed in spite of them having 301 redirects. Google also appears to be ignoring many of our canonical tags as well, despite the pages being identical. Any ideas on how to clean up the mess?
Intermediate & Advanced SEO | | AMHC0 -
Does Google Read URL's if they include a # tag? Re: SEO Value of Clean Url's
An ECWID rep stated in regards to an inquiry about how the ECWID url's are not customizable, that "an important thing is that it doesn't matter what these URLs look like, because search engines don't read anything after that # in URLs. " Example http://www.runningboards4less.com/general-motors#!/Classic-Pro-Series-Extruded-2/p/28043025/category=6593891 Basically all of this: #!/Classic-Pro-Series-Extruded-2/p/28043025/category=6593891 That is a snippet out of a conversation where ECWID said that dirty urls don't matter beyond a hashtag... Is that true? I haven't found any rule that Google or other search engines (Google is really the most important) don't index, read, or place value on the part of the url after a # tag.
Intermediate & Advanced SEO | | Atlanta-SMO0 -
Strange URLs, how do I fix this?
I've just check Majestic and have seen around 50 links coming from one of my other sites. The links all look like this: http://www.dwww.mysite.com
Intermediate & Advanced SEO | | JohnPeters
http://www.eee.mysite.com
http://www.w.mysite.com The site these links are coming from is a html site. Any ideas whats going on or a way to get rid of these urls? When I visit the strange URLs such as http://www.dwww.mysite.com, it shows the home page of http://www.mysite.com. Is there a way to redirect anything like this back to the home page?0 -
XML Sitemap index within a XML sitemaps index
We have a similar problem to http://www.seomoz.org/q/can-a-xml-sitemap-index-point-to-other-sitemaps-indexes Can a XML sitemap index point to other sitemaps indexes? According to the "Unique Doll Clothing" example on this link, it seems possible http://www.seomoz.org/blog/multiple-xml-sitemaps-increased-indexation-and-traffic Can someone share an XML Sitemap index within a XML sitemaps index example? We are looking for the format to implement the same on our website.
Intermediate & Advanced SEO | | Lakshdeep0 -
Where to put a page ID in a URL?
Hello, My company is going to change URLs to example.com/category or example.com/product. When we will change the URLs to product or category pages somehow we have to check whether the requested page is from category table in DB or from products table (this gives much speed to page load time). So we have to choose how to make the different product and category pages.
Intermediate & Advanced SEO | | komeksimas
Programmers said that we need to insert id to URL. So the question is: Which is the better way to place an id to an URL? example.com/product-name?id=111 example.com/product-name/111 example.com/product_name-111 Or maybe we should use some other punctuation mark to separate id from product name? p.s. I have read Dynamic URLs vs. static URLs by Google and it still didn't answered which is the best for all of the pages. Somehow others solve this problem by typing only the names to the URL, but could anyone tell what that technology should be?0 -
URL Structure for Directory Site
We have a directory that we're building and we're not sure if we should try to make each page an extension of the root domain or utilize sub-directories as users narrow down their selection. What is the best practice here for maximizing your SERP authority? Choice #1 - Hyphenated Architecture (no sub-folders): State Page /state/ City Page /city-state/ Business Page /business-city-state/
Intermediate & Advanced SEO | | knowyourbank
4) Location Page /locationname-city-state/ or.... Choice #2 - Using sub-folders on drill down: State Page /state/ City Page /state/city Business Page /state/city/business/
4) Location Page /locationname-city-state/ Again, just to clarify, I need help in determining what the best methodology is for achieving the greatest SEO benefits. Just by looking it would seem that choice #1 would work better because the URL's are very clear and SEF. But, at the same time it may be less intuitive for search. I'm not sure. What do you think?0