Way to spider Wordpress site
-
I have an old Wordpress site and I want to move it to a new server and take it off Wordpress (too many hacks). I am trying to spider the site so as to get static, non-Wordpress, pages.
I am having trouble doing this. When I spider the site, it changes the URLs. For instance, if the URL is www.domain.com/page/ the URL I get out of the spider is /page/index.html And those are not the URLs in the search engine indices. There are about 2000 pages on this site, so it is not feasible to set up 301 redirects.
I tried using these spidering programs: WinHTTack Website Copier and PageNest
Does anyone know of another method of turning a Wordpress site into a non Wordpress site?
-
Hi Dan
Hmm that's a little strange. Two things;
- is WordPress updated? Do you get the normal URLs when viewing in your browser?
- have you tried Screaming Frog SEO Spider? It's free to crawl up to 500 pages Although it won't get the actual HTML on the pages, it could solve the URL issue perhaps.
This blackhat world thread has a few options too.
-Dan
-
Hi Dan, I'm not so experienced in migrating a WP to non -wp but I understand that the issue you're having is that the spider is returning index.htmlfiles for urls like domain/page/.
IT's normal, any spider you will use you'll always have and index.html file. Every directory has it's index.html which is the default file to show if you're not establishing something different with rewrite rules.
If you write /page/ the browser will read the index.html file. What you have to be sure is that you'll set up a 301 redirect to avoid any index.html url to show and have it redirected to the main / page (with wildcards is a one line rule) and that your internal links are pointing all to / pages and not to index.html version of it. You can jsut find and replace the /index.html" string into the html code with the /" text (dreamweaver or any html editor will do that in bulk.
Only one commentary on you idea is that you may consider useful to build a php driven site, using includes for header, footer and nav/sidebar, jsut because thinking ahead if you're willing to make changes to a portion of the page repeating throughout the site you'll have to make changes in all pages and uplaod them all which is quite huge to do and also let space for many human/machine errors.
Hope that helped you out!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Does anyone know the linking of hashtags on Wix sites does it negatively or postively impact SEO. It is coming up as an error in site crawls 'Pages with 404 errors' Anyone got any experience please?
Does anyone know the linking of hashtags on Wix sites does it negatively or positively impact SEO. It is coming up as an error in site crawls 'Pages with 404 errors' Anyone got any experience please? For example at the bottom of this blog post https://www.poppyandperle.com/post/face-painting-a-global-language the hashtags are linked, but they don't go to a page, they go to search results of all other blogs using that hashtag. Seems a bit of a strange approach to me.
Technical SEO | | Mediaholix0 -
On our site by mistake some wrong links were entered and google crawled them. We have fixed those links. But they still show up in Not Found Errors. Should we just mark them as fixed? Or what is the best way to deal with them?
Some parameter was not sent. So the link was read as : null/city, null/country instead cityname/city
Technical SEO | | Lybrate06060 -
Wordpress Hatom problem
Hi, in Webmaster Tools i receive the following warnings: hatom-feedhatom-entry:Warning: At least one field must be set for HatomEntry.Warning: Missing required field "entry-title".Warning: Missing required field "updated".Warning: Missing required hCard "author".I googled a few strategies how to solve this problem but is it for SEO purpose really necessary to edit Theme core code to satisfy google's warnings?
Technical SEO | | reisefm0 -
If you are organizing the site structure for an ecommerce site, how would you do it?
Should you use not use slashes and use all dashes or use just a few slashes and the rest with dashes? For example, domain.com/category/brand/product-color-etc OR domain.com/anythinghere-color-dimensions-etc Which structure would you rather go for and why?
Technical SEO | | Zookeeper0 -
Not sure which way to go or what to do?
Hi there, I have been a pro member of SEOmoz for a while now but this is my question in the forum and although I have looked through so much helpful information I was wondering if someone could give me some further advice and guidance? I have a 3 year old ecommerce website personalisedmugs.co.uk which until May 2012 had some excellent growth, we then lost around 50% of traffic due to reduced organic rankings in google. We then noticed a further drop again in September. From researching information I believe this drop was from the penguin update and EMD update? Since these updates we have: *Stopped working with a company in India whom was looking after SEO for us for 18 months redeveloped/designed website and upgraded software version constantly refreshed website with content as we always have done Modified internal anchor text (this did seem keyword rich) My next steps I believe before giving up 😞 is checking our links coming into website? Is anybody able to please help me with regards to our links or point me in the right direction. I have no idea where to start or what do now? Someone may see something really obvious so any help or guidance is greatly appreciated to assist me in gaining some UK organic rankings back. Kind Regards, Mark
Technical SEO | | SparkyMarky0 -
Internal Links on eCommerce sites
I have been working on an eCommerce site; www.pretavoir.co.uk over the past year. Improvements in SERPs have been good with many top three positions. However, there are other important keywords of similar difficulty which refuse to behave in a similar way.... The site is PR4 and has a homepage PA 52. The homepage includes links to internal brand pages eg Prada, Gucci etc. Q Would it be worthwhile creating footer anchor text with eaxct text eg Prada sunglasses, Gucci Sunglasses?? Thanks
Technical SEO | | seanmccauley0 -
Site Disappeared off of Search
A friend of mine has a site (http://bit.ly/q4iWkM ) that was ranking number one for their key word (Drimnagh() and has now completely disappeared off of the ranking. I did some checking and can't see a problem. She does have duplicate meta and titles throughout but this shouldn't be a punishable offence that I know of and is something that I am going to correct with a quick plugin install. I couldn't see any redirects or code stopping search either. When you do site:URL it shows up OK as well. She is client of mine (for website not for SEO) and she is really upset about it so any help from the forum would be appreciated. This isn't even a site I did but you couldn't get a better person to work with so I am eager to help where and if possible. Guinness all round if someone solves it next time you are in Ireland
Technical SEO | | kdaly1000 -
Too many links on my site
Hi there everybody, I am a total SEO newbie and i am burning with questions. I had my site crawled and found out that it contains too many links. The reason is that it is a site where I constantly write news and articles and each one of them is a new Joomla item, thus a new link. I actually thought lots of content is good for SEO. How am I supposed to reduce the link amount?
Technical SEO | | polyniki0