URL mapping for site migration
-
Hi all! I'm currently working on a migration for a large e-commerce site. The old site has around 2.5k URLs, the new one 7.5k. I now need to sort out the redirects from one to the other.
This is proving pretty tricky, as the URL structure has changed site-wide. There don't seem to be any consistent rules either, so using regex doesn't really work.
By and large, the copy appears to be the same though. Does anybody know of a tool I can crawl the sites with that will export each crawled URL and its related copy into a spreadsheet? That way I can crawl both sites and compare the copy to match them up.
Thanks!
-
Just to confirm mosquitohawk's comments, there's not a great way to do this other than sorting through the spreadsheet.
Hopefully URLs have distinct enough subfolders that you can break them out into sections easily.
-
Darn!
Another alternative would be to use Screaming Frog to get a full list of URLs from each site, then use a scraping tool like Mozenda to scrape each list and pull the content area. It will create the data structure you want and make it available for export. Then you can basically do what I said in my previous reply: compare the two spreadsheets.
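For the spreadsheet comparison step, here is a minimal Python sketch of how the matching could be automated once both exports exist. It assumes each export is a CSV with `url` and `copy` columns (those column names, the sample rows, and the 0.8 threshold are all made up for illustration, not something any of the tools above produce by default). It tries an exact copy match first and falls back to a fuzzy match with the standard library's `difflib`.

```python
import csv
import io
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so formatting differences don't matter."""
    return " ".join(text.lower().split())


def load_pages(csv_file):
    """Read rows with 'url' and 'copy' columns (assumed names) into a dict."""
    return {row["url"]: normalize(row["copy"]) for row in csv.DictReader(csv_file)}


def match_urls(old_pages, new_pages, threshold=0.8):
    """For each old URL, find the new URL whose copy is most similar.

    Returns {old_url: (new_url or None, score)}. None means no candidate
    reached the threshold, so that URL needs a manual look.
    """
    # Exact matches are cheap, so index the new pages by their copy first.
    by_copy = {}
    for url, copy in new_pages.items():
        by_copy.setdefault(copy, url)

    redirects = {}
    for old_url, old_copy in old_pages.items():
        if old_copy in by_copy:
            redirects[old_url] = (by_copy[old_copy], 1.0)
            continue
        best_url, best_score = None, 0.0
        for new_url, new_copy in new_pages.items():
            score = SequenceMatcher(None, old_copy, new_copy).ratio()
            if score > best_score:
                best_url, best_score = new_url, score
        if best_score < threshold:
            best_url = None
        redirects[old_url] = (best_url, round(best_score, 2))
    return redirects


# Tiny inline sample; real exports would be opened with open(path, newline="").
old_csv = io.StringIO(
    "url,copy\n"
    "/old/red-shoes,Red running shoes with cushioned sole\n"
    "/old/about,About our company\n"
)
new_csv = io.StringIO(
    "url,copy\n"
    "/shop/shoes/red,Red running shoes with a cushioned sole\n"
    "/shop/hats/blue,Blue wool hat for winter\n"
)
redirects = match_urls(load_pages(old_csv), load_pages(new_csv))
for old_url, (new_url, score) in redirects.items():
    print(old_url, "->", new_url, score)
```

Comparing every old page against every new page this way gets slow at 2.5k x 7.5k pages, so it is probably worth running the exact-match pass first and only fuzzy-matching what is left over. Anything that comes back as `None` still needs sorting by hand.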
-
Thank you for taking the time to answer. I did think of Screaming Frog, but the problem is that it only records the instances of custom parameters, not the contents. I tweeted the SF team to check and they said it wasn't possible either. I've also tried InSite Inspyder, but that doesn't do it either.
-
Screaming Frog SEO Spider could do that for you. You'd need to set up a custom filter to look for a copy identifier (i.e. a div that always contains the main copy) and have it scrape that for you while it's crawling. Do the same for the other site and then you could match them up pretty easily, I think.
Here is a good resource on different ways of using the tool - http://www.seerinteractive.com/blog/screaming-frog-guide We use it almost daily for a variety of tasks and find it to be pretty flexible. Good luck!
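If the crawler can't export the contents of that div, the same idea can be sketched in a few lines of Python using only the standard library. This is not how Screaming Frog does it, just an illustration of pulling the text out of one identifying div. The id `main-copy` and the sample HTML are hypothetical; you'd swap in whatever wrapper your templates actually use.

```python
from html.parser import HTMLParser


class CopyExtractor(HTMLParser):
    """Collect the text inside the first <div> whose id matches target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # how many <div>s deep we are inside the target
        self.chunks = []
        self.done = False   # set once the target div has been closed

    def handle_starttag(self, tag, attrs):
        if tag != "div" or self.done:
            return
        if self.depth > 0:
            self.depth += 1  # nested div inside the target
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1   # entering the target div

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_copy(html, target_id="main-copy"):
    """Return the target div's text with whitespace collapsed."""
    parser = CopyExtractor(target_id)
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


# Hypothetical page markup, standing in for a fetched product page.
sample_html = (
    '<html><body><div id="nav">Home</div>'
    '<div id="main-copy"><h1>Red shoes</h1>'
    '<div class="desc">Cushioned   sole.</div></div>'
    '<div id="footer">Contact</div></body></html>'
)
copy_text = extract_copy(sample_html)
print(copy_text)
```

Run over the URL list from each crawl, this would give one row of URL plus copy per page, which is the spreadsheet structure the original question asks for.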