Joomla to Wordpress site migration - thousands of 404s
-
I recently migrated a site from Joomla to Wordpress. In advance I exported the HTML pages from Joomla using Screaming Frog and did 301 redirects on all those pages.
However Webmaster Tools is now telling me (a week after putting the redirects in place) that there are >7k 404s. Many of them aren't HTML pages, just index.php files but I didn't think I would have to export these in my Screaming Frog crawl.
We have since done a blanket 301 redirect for anything with index.php in it but Webmaster Tools is still picking them up as 404s.
So my question is, what should I have done with Screaming Frog re exporting to ensure I captured all pages to redirect and what should I now do to fix the 404s that Webmaster Tools is picking up?
-
Hi There
Generally those types of 404's won't be too harmful - they sound like they may have been somewhat artificial WordPress pages.
What I would do is get your list now from Analytics or Webmaster Tools - this way you will capture URLs that actually got traffic or Impression in Google and redirect those.
So run a landing pages report, and an top pages report in webmaster tools - maybe for the last 6 months. Create a text file of all the URLs, and run them in list mode through Screaming Frog. Redirect any that 404.
If you were to go back in time, what I would have done with Screaming Frog is - let it crawl everything - you have to allow it to "follow redirects" and "ignore robots.txt" etc - I know Google is not supposed to crawl anything in robots.txt - but basically you'd be letting Screaming Frog get to everything, that way you don't miss any URLs.
-
I know it doesn't create redirects but I wanted to use it to figure out the list of files / pages to create 301 redirects for and then add these to the HTAccess file. However was I incorrect to just export the HTML files from Screaming Frog as there were only 500 of these but there are now 7000 404s in Webmaster Tools of PHP files.
-
Hi,
Screaming frog doesn't create redirects. You need to use a mod_redirect or something similar.
Maybe, the best option for your problem it's creating a database of old pages -> new pages, and redirect all connections for unknown pages to these page.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to jumpstart a new Ecommerce site
Hello, I've got a new Ecommerce site I'm jumpstarting. It's one of those sites that takes a while to rank for. Here's what we're doing: 1. Creating a beautiful, mobile friendly site. 2. Adding a long detailed home page answering all the questions that people come to our industry keyword results with. 3. Adding detailed, beautiful cateogy pages. 4. Adding detailed, beautiful product pages. 5. Adding beautiful, long About Us & Resource Sites list pages. 6. Offering straight up obvious free shipping and no tax even though that's taking a hit in our industry. 7. We're going after the 2 main informational terms (keyword explorer) in the industry with a vengance - 20X as good as the competition for the main term. 8. We're adding 20-30 pages of articles to help our customers and hit major keyword search terms, although there's not much in our industry. What else would you recommend doing to jumpstart a new Ecommerce site that has difficulty being in the top 50? Thanks.
Intermediate & Advanced SEO | | BobGW0 -
Site been plagiarised - duplicate content
Hi, I look after two websites, one sells commercial mortgages the other sells residential mortgages. We recently redesigned both sites, and one was moved to a new domain name as we rebranded it from being a trading style of the other brand to being a brand in its own right. I have recently discovered that one of my most important pages on the residential mortgages site is not in Google's index. I did a bit of poking around with Copyscape and found another broker has copied our page almost word-for-word. I then used copyscape to find all the other instances of plagiarism on the other broker's site and there are a few! It now looks like they have copied pages from our commercial mortgages site as well. I think the reason our page has been removed from the index is that we relaunced both these sites with new navigation and consequently new urls. Can anyone back me up on this theory? I am 100% sure that our page is the original version because we write everything in-house and I check it with copyscape before it gets published, Also the fact that this other broker has copied from several different sites corroborates this view. Our legal team has written two letters (not sent yet) - one to the broker and the other to the broker's web designer. These letters ask the recipient to remove the copied content within 14 days. If they do remove our content from our site, how do I get Google to reindex our pages, given that Google thinks OUR pages are the copied ones and not the other way around? Does anyone have any experience with this? Or, will it just happen automatically? I have no experience of this scenario! In the past, where I've found duplicate content like this, I've just rewritten the page, and chalked it up to experience but I don't really want to in this case because, frankly, the copy on these pages is really good! And, I don't think it's fair that someone else could potentially be getting customers that were persuaded by OUR copy. Any advice would be greatly appreciated. Thanks, Amelia
Intermediate & Advanced SEO | | CommT0 -
Micro Site Penalty?
I have been carrying out On-Page optimisation only for a client www.shade7.co.nz. After three months or so I have been getting some great results, improving to the top three positions for at least 30 of 45 keywords targeted. Couple of more tweaks and I would be a very happy camper. Disaster overnight! Rankings CRASH! Unbeknown to me the client a month or so back decided to link just about every product/link on a micro site he owns (www.shademakers.com/ ) plus one other site he owns. Explorer I think discovered over 350 back-links (follow) from these sites! As this is a site he owns and it is targeting the same keywords I presume this falls into the EVIL bucket of SEO. Two part question do you believe I am correct that this is the reason for this rankings crash and what would be the best way to resolve this! server-side 301 redirect for the micro site? Delete the micro site (drastic measure) Remove all the links other than maybe one in the contact page saying visit our other site shade7 other options? The client or I have not received any bad link Emails from Google.
Intermediate & Advanced SEO | | Moving-Web-SEO-Auckland0 -
Site structure from an SEO standpoint
I am fortunate enough to be working with a client who is still building their website. From a site structure standpoint, what can I look for with my SEO hat as they build their wire frames and storyboard their site? I want to make sure I don't miss any components that might be helpful short and long term
Intermediate & Advanced SEO | | StreetwiseReports0 -
Niche sites: how to optimize them?
Dear SEOmozzers, I am focusing on those keywords I find using the "Keyword Analysis" tool that are not too competitive (among 20%-30% competitiveness). I then buy domains that include those keywords. The sites are in Italian and targeting the Italian search engines, so the competition is lower than it would be in the US. Basically, I'd like to build niche sites and I'd like to ask a few questions that I hope somebody with a good experience in this field can answer: How do I optimize a niche site. Specifically, how do I go about link building? How many backlinks should I get to see some results? How long does it take for a typical niche site to start appearing in the search engines for a certain keyword, after launching an effective link building campaign? Please kindly provide any recommendations you believe to be important when building niche sites. For example, is there a company/professional you know that specializes in this field and who is trustworthy/reliable? Thank you very much for your help. All best, Sal
Intermediate & Advanced SEO | | salvyy0 -
Does Google punish sites for Backlinks?
Here is Matt Cutts video, for those of you who have not seen it already. http://www.youtube.com/watch?v=f4dAWb5jUws (Very Short) In this Video Matt explains that Google does not look at backlinks. Many link spamming sites have detected, there have been many website receiving warning messages in their Google web tools to deindex these links, etc.. My theory is that Google will not punish sites for backlinks. However, they manually check for "link farming sites" and warn anyone affiliated with them, just in case these links were built from a competitor. This way they can eliminate all the "Bad Link Farm" sites and not hurt anyone who does not deserve to be hurt. Google is not going to give us all their information to rank, they dont want us to rank. They want us to PPC. However, they do want to have the best SERPs available. I call it Google juggling! Thoughts?
Intermediate & Advanced SEO | | SEODinosaur0 -
My site links have gone from a mega site links to several small links under my SERP results in Google. Any ideas why?
A site I have currently had the mega site links on the SERP results. Recently they have updated the mega links to the smaller 4 inline links under my SERP result. Any idea what happened or how do I correct this?
Intermediate & Advanced SEO | | POSSIBLE0 -
How to see which site Google views as a scraper site?
If we have content on our site that is found on another site, what is the best way to know which site Google views as the original source? If you search for a line of the content such as "xyz abc etc" and the other site shows before yours in search results, does that mean that Google views that site as the original source?
Intermediate & Advanced SEO | | nicole.healthline0