Magento Core_URL_Rewrite Problems
-
Hi Everyone,
We are currently caught between a rock and a hard place with Magento and are wondering if anyone else had similar problems and could share their advice.
Our core_url_rewrite table now contains 1.3 million records for an account that has 12,000 products on 4 different store views. It has ballooned to the point that we are no longer able to reindex our URL Management.
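For a rough sense of scale, here is a back-of-the-envelope check. It assumes a baseline of one rewrite row per product per store view; category-path URLs would raise that baseline, so treat the ratio as an upper bound rather than an exact figure.

```python
products = 12_000
store_views = 4
rows_in_table = 1_300_000

# Baseline assumption: one rewrite row per product per store view.
baseline_rows = products * store_views
rows_per_baseline_url = rows_in_table / baseline_rows

print(baseline_rows)                    # 48000
print(round(rows_per_baseline_url, 1))  # 27.1
```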
The option being suggested to us is to truncate the table and start over, though this will essentially kill our SEO for those pages. (And since there are duplicates, I can only imagine how much they are going to be penalized for it.)
Would anyone have any advice other than truncating and starting over?
Any advice would be greatly appreciated.
Thanks!
-
Hi,
I found the exact problem you are facing, along with a solution, at this link:
http://magento.stackexchange.com/questions/17553/magento-core-url-rewrite-table-excessively-large
There are patches available at that link; however, do read this reply on the page first:
Bugs in earlier (and possibly current) versions of Magento are one cause. Another is that there's logic in this table that tries to track changes to the URL key value so that 301/302 rewrites are set up for old products. Because of this, and complicating things, truncating the table and regenerating may make existing URL rewrites go away, and this will have an unknown effect on your search engine listing (not necessarily bad, just hard to predict).
My general advice to clients who ask is
1. Leave the giant growing table as is if you don't have a good handle on your URL/SEO situation
2. Until the table size starts being a problem (generating site maps, for example). When that happens, get a handle on your URL/SEO situation.
3. Once you have a handle on your URL/SEO situation, back up the table, then truncate the table and regenerate. Address any URL/SEO problems caused by the truncating.
4. Automate step 3
Trying to fix this on the Magento code level is admirable, but you'll be swimming upstream. Sometimes it's better to accept that "That's just Magento being Magento", and to solve the problem with an external process.
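As a minimal sketch of what such an external process might start with, the helper below only builds the MySQL statements for the back-up-then-truncate step; it does not run them. The function name and the backup naming scheme are my own assumptions, and it assumes the table has no prefix.

```python
from datetime import date

def backup_and_truncate_sql(table="core_url_rewrite", today=None):
    """Return the MySQL statements for: back up the table, then truncate it."""
    today = today or date.today()
    # Hypothetical naming scheme: original table name plus a date stamp.
    backup = f"{table}_backup_{today:%Y%m%d}"
    return [
        f"CREATE TABLE `{backup}` LIKE `{table}`;",
        f"INSERT INTO `{backup}` SELECT * FROM `{table}`;",
        f"TRUNCATE TABLE `{table}`;",
    ]

for statement in backup_and_truncate_sql(today=date(2016, 1, 31)):
    print(statement)
```

On Magento 1.x the regenerate step would then normally be the stock indexer script (`php shell/indexer.php --reindex catalog_url`), run from the Magento root. Try the whole sequence on a copy of the database first.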
I hope this helps. If you have further questions, post a response and I will be happy to answer.
Regards,
Vijay
-
-
I'm not sure the answers previously presented are related to the issues you're having. Having worked with Magento for a long time, I've seen this issue occur over and over again.
To answer your initial question, truncating your core_url_rewrite table will remove all of these URLs, but it will only delay the problem until it recurs (unless it was caused by a past issue that has since been rectified). You're also correct that any rewrites previously in the system will disappear, so you'll probably end up with a lot of crawl issues appearing in Search Console.
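One way to see which URLs would disappear before the crawl errors show up is to compare the request paths from a backup against the regenerated table. This is only a sketch: the function name and sample rows are made up, and each entry is assumed to be a (store_id, request_path) pair exported from the table.

```python
def lost_request_paths(before, after):
    """Return (store_id, request_path) pairs present before the rebuild but not after."""
    return sorted(set(before) - set(after))

# Hypothetical sample data standing in for exports taken before and after regenerating.
before = [(1, "blue-widget.html"), (1, "blue-widget-old.html"), (2, "blue-widget.html")]
after = [(1, "blue-widget.html"), (2, "blue-widget.html")]

print(lost_request_paths(before, after))  # [(1, 'blue-widget-old.html')]
```

Anything in that list is a candidate for a manual redirect if it still gets traffic or has links pointing at it.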
Your best move would be to find out why you have so many URLs in there in the first place. Do you have a huge product catalog with multiple stores? Or is it down to an issue in your Magento version or in your setup? This most commonly occurs when two products are added to your site with the same URL key: every time the reindex process runs, your core_url_rewrite table grows. You can check this by noting the number of rows in the table and then reindexing the site; if the count grows, duplicate URL keys are likely to be the problem. The quickest way to fix this is to ensure all URL keys are unique.
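If you can export product IDs and URL keys (from a product export, for example), a duplicate check could look something like the sketch below. The function name and sample data are hypothetical, and it ignores store-view-level URL key overrides for simplicity.

```python
from collections import defaultdict

def duplicate_url_keys(products):
    """Map each URL key used by more than one product to the product IDs sharing it."""
    ids_by_key = defaultdict(set)
    for product_id, url_key in products:
        # Compare keys case-insensitively and without stray whitespace.
        ids_by_key[url_key.strip().lower()].add(product_id)
    return {key: sorted(ids) for key, ids in ids_by_key.items() if len(ids) > 1}

# Hypothetical (product_id, url_key) pairs.
sample = [(101, "blue-widget"), (102, "red-widget"), (103, "blue-widget")]
print(duplicate_url_keys(sample))  # {'blue-widget': [101, 103]}
```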
There's also an article here about duplicate keys - https://firebearstudio.com/blog/magento-url-reindex-core_url_rewrite-duplicates-patch.html - this should hopefully clear the issue.
I hope this helps! If it doesn't solve the problem, then a little more information about the number of stores, the catalog size, and the split between system-generated URL rewrites and custom URL rewrites would be great, so we can try to help further!
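To get that split, a GROUP BY on store_id and is_system is usually enough. The sketch below runs the query against an in-memory SQLite stand-in with made-up rows so that it is self-contained; the same SELECT should work against the real MySQL table. As far as I know, is_system = 0 covers custom rewrites as well as the redirects Magento creates when a URL key changes, so treat that reading as an assumption.

```python
import sqlite3

# In-memory stand-in for the real table, with only the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE core_url_rewrite ("
    "url_rewrite_id INTEGER PRIMARY KEY, store_id INTEGER, "
    "request_path TEXT, is_system INTEGER)"
)
conn.executemany(
    "INSERT INTO core_url_rewrite (store_id, request_path, is_system) VALUES (?, ?, ?)",
    [
        (1, "blue-widget.html", 1),
        (1, "blue-widget-old.html", 0),
        (2, "blue-widget.html", 1),
        (2, "red-widget.html", 1),
    ],
)

# Count rows per store view, split into system-generated (1) and other (0) rewrites.
split = conn.execute(
    "SELECT store_id, is_system, COUNT(*) FROM core_url_rewrite "
    "GROUP BY store_id, is_system ORDER BY store_id, is_system"
).fetchall()
print(split)  # [(1, 0, 1), (1, 1, 1), (2, 1, 2)]
```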
Thanks,
Lewis
-
This is an issue due to set-up. When you set up multiple ecommerce websites on Magento as 'Stores', all SKUs will load on the other domains. If they were set up as 'Websites', this would alleviate the issue. However, with Stores you are able to share shopping carts (i.e. add a product on website A and check out on website B).
What I did was turn off the XML cron jobs and set up cross-domain canonicals. Also make sure your session IDs (/?SID=) are working properly. Not sure if this solves the technical issues, but it should help clear up dupe content.
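For anyone unsure what a cross-domain canonical looks like in practice, here is a small sketch that builds the tag. The function name and the example domains are hypothetical, and in a real store the tag would be emitted by the theme or an extension rather than by a script like this.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

def cross_domain_canonical(url, primary_host):
    """Build a canonical <link> tag pointing the same path at the primary host."""
    parts = urlsplit(url)
    # Drop the query string and fragment so SID and sort parameters do not leak in.
    canonical = urlunsplit((parts.scheme, primary_host, parts.path, "", ""))
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}" />'

tag = cross_domain_canonical(
    "http://store-b.example/blue-widget.html?SID=abc123", "store-a.example"
)
print(tag)  # <link rel="canonical" href="http://store-a.example/blue-widget.html" />
```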
-
Is it creating a new URL for each option (size, color, etc.), for each page a product shows up on, and for the various sort orders (by price, by size, etc.)? Are there session IDs that you could exclude? Are you sure they are truly duplicates?
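A quick way to test that is to strip the parameters that should not change the content and see how many distinct URLs remain. This is a rough sketch: the parameter list (session ID plus the Magento 1 catalog toolbar parameters) is my assumption, so adjust it to whatever actually appears in your crawl.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed noise parameters: session ID plus catalog toolbar sorting/display options.
NOISE_PARAMS = {"sid", "dir", "order", "mode", "limit"}

def normalize(url):
    """Return the URL with noise parameters and any fragment removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k.lower() not in NOISE_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

# Hypothetical crawled URLs.
urls = [
    "http://example.com/shirts.html?SID=abc123",
    "http://example.com/shirts.html?dir=asc&order=price",
    "http://example.com/shirts.html?color=12",
]
print(sorted({normalize(u) for u in urls}))
# ['http://example.com/shirts.html', 'http://example.com/shirts.html?color=12']
```

URLs that collapse to the same normalized form are the ones worth handling with canonicals or parameter exclusions; whatever is left over is genuinely distinct.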