URL mapping for site migration
-
Hi all! I'm currently working on a migration for a large e-commerce site. The old site has around 2.5k URLs, the new one 7.5k. I now need to sort out the redirects from one to the other.
This is proving pretty tricky, as the URL structure has changed site-wide. There don't seem to be any consistent rules either, so using regex doesn't really work.
By and large, the copy appears to be the same though. Does anybody know of a tool I can crawl the sites with that will export each crawled URL and its related copy into a spreadsheet? That way I can crawl both sites and compare the copy to match them up.
Thanks!
-
Just to confirm mosquitohawk's comments, there's not a great way to do this other than sorting through the spreadsheet.
Hopefully URLs have distinct enough subfolders that you can break them out into sections easily.
-
Darn!
Another alternative would be to use Screaming Frog to get a full list of URLs from each site, then use a scraping tool like Mozenda to scrape that list from each site and pull the content area. It will create the data structure you want and make it available for export. Then you can basically do what I suggested in my previous reply: compare the two spreadsheets.
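For what it's worth, the "compare the two spreadsheets" step could be scripted rather than done by eye. Below is a rough sketch in Python using only the standard library. It assumes each export is a CSV with `url` and `copy` columns (those column names, the sample rows and the 0.8 threshold are all made up for illustration, not something any of the tools above produce by default), and it uses `difflib.SequenceMatcher` as one possible similarity measure.

```python
import csv
import io
from difflib import SequenceMatcher


def load_pages(csv_text):
    """Read rows with 'url' and 'copy' columns (assumed column names)."""
    return [(row["url"], row["copy"]) for row in csv.DictReader(io.StringIO(csv_text))]


def normalise(text):
    """Lower-case and collapse whitespace so formatting changes don't hurt the score."""
    return " ".join(text.lower().split())


def match_urls(old_pages, new_pages, threshold=0.8):
    """Pair each old URL with the new URL whose copy is most similar.

    Returns (old_url, new_url or None, score) tuples. Pairs scoring below
    the (arbitrary) threshold get None so they can be reviewed by hand.
    """
    new_norm = [(url, normalise(copy)) for url, copy in new_pages]
    results = []
    for old_url, old_copy in old_pages:
        old_norm = normalise(old_copy)
        best_url, best_score = None, 0.0
        for new_url, new_copy in new_norm:
            # autojunk=False keeps the ratio meaningful for long page copy
            score = SequenceMatcher(None, old_norm, new_copy, autojunk=False).ratio()
            if score > best_score:
                best_url, best_score = new_url, score
        if best_score < threshold:
            best_url = None
        results.append((old_url, best_url, round(best_score, 2)))
    return results


if __name__ == "__main__":
    # Hypothetical sample data standing in for the two exported spreadsheets
    old_csv = "url,copy\n/shop/item?id=12,Red wool scarf with tassels\n/about.php,We are a family business\n"
    new_csv = "url,copy\n/scarves/red-wool,Red wool scarf with tassels\n/new-page,Brand new landing page\n"
    for row in match_urls(load_pages(old_csv), load_pages(new_csv)):
        print(row)
```

Bear in mind that 2.5k old pages against 7.5k new ones is close to 19 million comparisons, so a brute-force pass like this would be slow. Matching identical copy first (for example via a dictionary keyed on the normalised text) and only fuzzy-matching the leftovers would probably cut that down a lot.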
-
Thank you for taking the time to answer. I did think of Screaming Frog, but the problem is that it only records the instances of custom parameters, not the contents. I tweeted the SF team to check and they said it wasn't possible. I've also tried InSite Inspyder, but that doesn't do it either.
-
Screaming Frog SEO Spider could do that for you. You'd need to set up a custom filter to look for a copy identifier (i.e. a div that always contains the main copy) and have it scrape that for you while it's crawling. Do the same for the other site and then you could match them up pretty easily, I think.
Here is a good resource on different ways of using the tool: http://www.seerinteractive.com/blog/screaming-frog-guide
We use it almost daily for a variety of tasks and find it to be pretty flexible. Good luck!
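To illustrate the "copy identifier" idea outside of any particular crawler, here is a small standalone Python sketch that pulls the text out of one div from a page's HTML. It is not how Screaming Frog works internally, just the same principle done by hand with the standard library. The div id `main-copy` and the function name `extract_copy` are hypothetical; you would swap in whatever wrapper your templates actually use.

```python
from html.parser import HTMLParser


class DivTextExtractor(HTMLParser):
    """Collect the text inside the first <div> with a given id."""

    def __init__(self, div_id):
        super().__init__()
        self.div_id = div_id
        self.depth = 0      # nesting depth of <div>s within the target; 0 = outside it
        self.done = False   # True once the target div has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div" or self.done:
            return
        if self.depth:
            self.depth += 1                       # nested div inside the target
        elif dict(attrs).get("id") == self.div_id:
            self.depth = 1                        # entering the target div

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.done = True                  # left the target div

    def handle_data(self, data):
        # Note: this also picks up inline <script>/<style> text inside the div
        if self.depth:
            self.chunks.append(data)


def extract_copy(html, div_id="main-copy"):
    """Return the whitespace-collapsed text of the div with the given id ('' if absent)."""
    parser = DivTextExtractor(div_id)
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


if __name__ == "__main__":
    page = (
        '<div id="nav">Menu</div>'
        '<div id="main-copy"><h1>Red scarf</h1><div><p>Warm   wool.</p></div></div>'
        "<div>Footer</div>"
    )
    print(extract_copy(page))
```

Running something like this over the fetched HTML of every URL on both sites would give a URL-plus-copy table for each site, which is the spreadsheet the original question was after.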