Magento Core_URL_Rewrite Problems
-
Hi Everyone,
We are currently caught between a rock and a hard place with Magento and are wondering if anyone else had similar problems and could share their advice.
Our Core_URL_Rewrite now containt 1.3 million records for an account that has 12000 products on 4 different store views. This has ballooned past the point that we are no longer able to reindex our URL Management.
The option that is being suggested to us is to truncate the table and start over, though this will essentially kill our SEO for those pages.(Which as there are duplicates, I can only imagine how much they are going to be penalized by it)
Would anyone have any advice other than truncating and starting over?
Any advice would be greatly appreciated.
Thanks!
-
Hi,
I found the exact problem you are facing with a solution on this link
http://magento.stackexchange.com/questions/17553/magento-core-url-rewrite-table-excessively-large
There are patch codes available on this link, however do read this reply on this page
Bugs in earlier (and possibly current) versions of Magento is one. Another is there's logic in this table that tries to track changes to the URL key value so that 301/302 rewrites are setup for old products. Because of this, and complicating things, truncating the table and regenerating may make existing URL rewrites go away, and this will have an unknown effect on your search engine listing (not necessity bad, just hard to predict).
My general advice to clients who ask is
-
Leave the giant growing table as is if you don't have a good handle on your URL/SEO situation
-
Until the table size starts being a problem (generating site maps, for example). When that happens, get a handle on your URL/SEO situation.
-
Once you have a handle on your URL/SEO situation, backup the table, then truncate the table and regenerate. Address any URL/SEO problems caused by the truncating.
-
Automate step 3
Trying to fix this on the Magento code level is admirable, but you'll be swimming upstream. Sometimes it's better to accept that "That's just Magento being Magento", and to solve the problem with and external process.
I hope this helps, if you have further questions, then post a response, I will be happy to answer.
Regards,
Vijay
-
-
I'm not sure the answers previously presented are related to the issues you're having. Having worked with Magento for a long time, this can be an issue that occurs over and over again.
To answer your initial question, truncating your core_url_rewrite table will remove all of these URLs, but it'll only delay the problem until it reoccurs again in the future (unless you've had a problem in the past which has been rectified). You're also correct in that any rewrites in the system previously there will disappear, so you'll probably end up with a lot of crawl issues appearing in Search Console.
Your best move would be to find out why you have so many URLs in there in the first place. Do you have a huge product catalog with multiple stores? Or is this something to do with an issue in your Magento version or some setup issues. The most common time this usually occurs is if two products get added to your site with the same URL Key. Every time the reindex process runs, your core_url_rewrite table will grow. You could check this by looking at the number of rows in the table, reindexing the site and if it grows further, then it's likely to be the problem. The quickest way to fix this is to ensure all URL key are unique.
There's also an article here about duplicate keys - https://firebearstudio.com/blog/magento-url-reindex-core_url_rewrite-duplicates-patch.html - this should hopefully clear the issue.
I hope this helps! If it doesn't solve the problem, then sending over a little more information around the number of stores, catalog site and the split between system generated URL rewrites and custom URL rewrites would be great so we can try to help further!
Thanks,
Lewis -
This is an issue to to set-up. When you set up multiple ecommerce websites on Magento as 'Stores', then all SKUs will load on other domains. if they were set-up as 'Websites' then this would alleviate the issue. However, with Stores you are able to share shopping carts (i.e. Add a product from website A and checkout on website B).
What I did was turn off the XML cron jobs and set-up cross-domain canonicals. Also make sure your session IDs (/?SID=) are working properly. Not sure if this solves the technical issues, but should help clear up dupe content.
-
Is it creating a new url for each option (size, color, etc) as well as what page it shows up on or other various sort orders (by price, by size, etc.) and session id's that you could exclude? Are you sure they are truly duplicates?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Old product URLs still indexed and maybe causing problems?
Hi all, Need some expertise here: We recently (3 months ago) launched a newly updated site with the same domain. We also added an SSL and dropped the www (with proper redirects). We went from http://www.mysite.com to https://mysite.com. I joined the company about a week after launch of the new site. All pages I want indexed are indexed, on the sitemap and submitted (submitted in July but processes regularly). When I check site:mysite.com everything is there, but so are pages from the old site that are not on the sitemap. These do have 301 redirects. I am finding our non-product pages are ranking with no problem (including category pages) but our product pages are not, unless I type in the title almost exactly. We 301 redirected all old urls to new comparable product, or if the product is not available anymore to the home page. For better or worse, as it turns out and prior to my arrival, in building the new site the team copied much of the content (descriptions, reviews, etc) from the old site to create the new product pages. After some frustration and research I am finding the old pages are still indexed and possibly causing a duplicate content issue. Now, I gather there is supposedly no "penalty", per se, for duplicate content but a page or site will simply not show in the SERPs. Understandable and this seems to be the case. We also sell a lot of product wholesale and it turns out many dealers are using the same descriptions we have (and have had) on our site. Some are much larger than us so I'd expect to be pushed down a bit but we don't even show in the top 10 pages...for our own product. How long will it take for Google to drop the old and rank the new as unique? I have re-written some pages but much is technical specifications and tough to paraphrase or re-write. I know I could do this in Search Console but I don't have access to the old site any longer. Should I remove the 301s a few at a time and see if the old get dropped faster? Maybe just re-write ALL the content? Wait? As a site note, I'm also on a Drupal CMS with a Shopify ecommerce module so maybe the shop.mysite.com vs mysite.com is throwing it off with the products(?) - (again the Drupal non-product AND category pages rank fine). Thoughts on this would be much appreciated. Thx so much!
Intermediate & Advanced SEO | | mcampanaro0 -
Sitemap Indexed Pages, Google Glitch or Problem With Site?
Hello, I have a quick question about our Sitemap Web Pages Indexed status in Google Search Console. Because of the drastic drop I can't tell if this is a glitch or a serious issue. When you look at the attached image you can see that under Sitemaps Web Pages Indexed has dropped suddenly on 3/12/17 from 6029 to 540. Our Index status shows 7K+ indexed. Other than product updates/additions and homepage layout updates there have been no significant changes to this website. If it helps we are operating on the Volusion platform. Thanks for your help! -Ryan rou1zMs
Intermediate & Advanced SEO | | rrhansen0 -
What are Soft 404's and are they a problem
Hi, I have some old pages that were coming up in google WMT as a 404. These had links into them so i thought i'd do a 301 back to either the home page or to a relevant category or page. However these are now listed in WMT as soft 404's. I'm not sure what this means and whether google is saying it doesn't like this? Any advice welcomed.
Intermediate & Advanced SEO | | Aikijeff0 -
Preserving URL Structure from Os Commerce to Magento
I have a website that is built on OS Commerce and I am planning to transition to Magento. I was told that the transition to Magento would change my url structure. How do I preserve my current url structure while migrating to the Magento platform so that I do not lose my backlink profile.
Intermediate & Advanced SEO | | WebServiceConsulting.com0 -
Duplicate Title - Magento Products / Kunena Forum - Nofollow vs. Follow
Hello Mozzers! This is my first post so hopefully I can explain this in a way that is clear! I just signed up for my PRO membership and wanted to finally start hacking away at this companies site that I work for. I ran my report and noticed my Crawl Diagnostics: 715 Dupliate Page Titles I noticed that a lot those were coming from the customer login in magento. So I added the: User-agent: *Disallow:/customer/ as mentioned in this link: http://stackoverflow.com/questions/10742052/duplicate-content-issues-for-login-page Any suggestions on that? Also, We have a Kunena forum on our site and have noticed all of the nofollow links. People say I should change them to follow links, but other people say leave them or I will get penalized. Any advice would be greatly appreciated!
Intermediate & Advanced SEO | | NRMWEBWORKS0 -
Penguin Recovery Problem - Weird
I had an old URL and the link profile of this URL wasn't good - I had been using article syndication and Penguin threw me to the wolves. I decided to start over with a new URL and build a new natural link profile. I specifically did NOT do a 301 redirect to the new URL and did not make any request to Google to transfer domain as I didn't want old site being associated to the new one. To redirect our old users, I put a link on the old URL index page (nofollowed) that say that we have moved. I was very surprised to find that in GWT all the links of the old URL have now been associated to the new URL....why is that? I started over to have a clean natural profile and follow Google guidelines.Has anyone heard of this before? All I can guess is that Google itself "decided" to do its own pseudo-301, since the site was the same, page for page.This has Major implications for anyone attempting a "clean start" to recover from Penguin.
Intermediate & Advanced SEO | | veezer0 -
Magento Hidden Products & Google Not Found Errors
We recently moved our website over to the Magento eCommerce platform. Magento has functionality to make certain items not visible individually so you can, for example, take 6 products and turn it into 1 product where a customer can choose their options. You then hide all the individual products, leaving only that one product visible on the site and reducing duplicate content issues. We did this. It works great and the individual products don't show up in our site map, which is what we'd like. However, Google Webmaster Tools has all of these individual product URLs in its Not Found Crawl Errors. ! For example: White t-shirt URL: /white-t-shirt Red t-shirt URL: /red-t-shirt Blue t-shirt URL: /blue-t-shirt All of those are not visible on the site and the URLs do not appear in our site map. But they are all showing up in Google Webmaster Tools. Configurable t-shirt URL: /t-shirt This product is the only one visible on the site, does appear on the site map, and shows up in Google Webmaster Tools as a valid URL. ! Do you know how it found the individual products if it isn't in the site map and they aren't visible on the website? And how important do you think it is that we fix all of these hundreds of Not Found errors to point to the single visible product on the site? I would think it is fairly important, but don't want to spend a week of man power on it if the returns would be minimal. Thanks so much for any input!
Intermediate & Advanced SEO | | Marketing.SCG0 -
Problem of indexing
Hello, sorry, I'm French and my English is not necessarily correct. I have a problem indexing in Google. Only the home page is referenced: http://bit.ly/yKP4nD. I am looking for several days but I do not understand why. I looked at: The robots.txt file is ok The sitemap, although it is in ASP, is valid with Google No spam, no hidden text I made a request for reconsideration via Google Webmaster Tools and it has no penalties We do not have noindex So I'm stuck and I'd like your opinion. thank you very much A.
Intermediate & Advanced SEO | | android_lyon0