How to spider a WordPress site
-
I have an old WordPress site that I want to move to a new server and take off WordPress (too many hacks). I am trying to spider the site to get static, non-WordPress pages.
I am having trouble doing this. When I spider the site, it changes the URLs. For instance, if the URL is www.domain.com/page/, the URL I get out of the spider is /page/index.html, and those are not the URLs in the search engine indices. There are about 2000 pages on this site, so it is not feasible to set up 301 redirects.
I tried these spidering programs: WinHTTrack Website Copier and PageNest.
Does anyone know of another method of turning a WordPress site into a non-WordPress site?
-
Hi Dan
Hmm, that's a little strange. Two things:
- Is WordPress updated? Do you get the normal URLs when viewing in your browser?
- Have you tried Screaming Frog SEO Spider? It's free to crawl up to 500 pages. Although it won't get the actual HTML on the pages, it could perhaps solve the URL issue.
This Black Hat World thread has a few options too.
-Dan
-
Hi Dan, I'm not very experienced in migrating a WordPress site to a non-WordPress one, but I understand that the issue you're having is that the spider is returning index.html files for URLs like domain/page/.
That's normal: whichever spider you use, you'll always get an index.html file. Every directory has its own index.html, which is the default file served unless you set up something different with rewrite rules.
If you request /page/, the server will serve the index.html file in that directory. What you have to make sure of is that you set up a 301 redirect so that any index.html URL is redirected to the corresponding / URL (with wildcards it's a one-line rule), and that your internal links all point to the / URLs and not to the index.html versions. You can just find and replace the /index.html" string in the HTML code with /" (Dreamweaver or any HTML editor will do that in bulk).
One comment on your idea: you may find it useful to build a PHP-driven site, using includes for the header, footer and nav/sidebar. Thinking ahead, if you want to change a portion of the page that repeats throughout the site, you would otherwise have to change every page and upload them all, which is a lot of work and leaves room for many human/machine errors.
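If you'd rather script the find-and-replace than do it in an editor, here is a minimal Python sketch of the same idea. The function names and the "mirror" folder name are just placeholders I made up, and it assumes the spidered pages are .html files sitting under one folder. It works on raw bytes so it doesn't have to guess each file's encoding. Try it on a copy of the site first.

```python
from pathlib import Path


def strip_index_links(html: bytes) -> bytes:
    """Replace every /index.html" with /" so links point at the directory URL."""
    return html.replace(b'/index.html"', b'/"')


def rewrite_site(root: Path) -> int:
    """Apply strip_index_links to each .html file under root.

    Returns the number of files that were actually modified.
    """
    changed = 0
    for path in root.rglob("*.html"):
        original = path.read_bytes()
        updated = strip_index_links(original)
        if updated != original:
            path.write_bytes(updated)
            changed += 1
    return changed


# Hypothetical usage, where "mirror" is the folder the spider wrote into:
# print(rewrite_site(Path("mirror")), "files updated")
```

Note that this only catches double-quoted links with a slash in front of index.html, exactly like the editor find-and-replace. A bare href="index.html" or a single-quoted link would be left alone.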
Hope that helped you out!