Export Website into XML File
-
Hi,
I am having an agency optimize the content on my sites. I need to create an XML Schema before I export the content into XML.
What is the best way to export the content, including meta tags, for an entire site, and what are the steps to do it?
-
I don't know if it does anything more than make an offline copy. I haven't encountered your use case before, so I haven't looked for that. You might check whether that program or others have options that could help you.
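If the tool only produces an offline copy, one possible way to bridge the gap is a short script that walks the mirrored HTML files and writes each page's title, meta tags, and visible text to a single XML file. This is only a rough sketch using the Python standard library. The element names (`site`, `page`, `title`, `meta`, `content`) and the function names are made up for illustration and are not a standard schema, so the agency may want a different layout.

```python
import os
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collects the <title>, <meta> tags, and visible text of one HTML page."""

    SKIP = {"script", "style"}  # tags whose text is not on-page content

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title = ""
        self.meta = []  # list of (name, content) pairs
        self.text = []  # visible text fragments, in document order
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Covers name="description" as well as property="og:..." tags.
            name = attrs.get("name") or attrs.get("property")
            content = attrs.get("content")
            if name and content is not None:
                self.meta.append((name, content))
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth and data.strip():
            self.text.append(data.strip())


def page_to_element(path, html):
    """Build one <page> element (hypothetical layout) from an HTML string."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    page = ET.Element("page", {"path": path})
    ET.SubElement(page, "title").text = parser.title.strip()
    for name, content in parser.meta:
        ET.SubElement(page, "meta", {"name": name}).text = content
    ET.SubElement(page, "content").text = "\n".join(parser.text)
    return page


def export_site(mirror_dir, out_file):
    """Walk a local mirror folder and write every .html/.htm page to one XML file."""
    root = ET.Element("site")
    for folder, _dirs, files in os.walk(mirror_dir):
        for fname in sorted(files):
            if fname.lower().endswith((".html", ".htm")):
                full = os.path.join(folder, fname)
                rel = os.path.relpath(full, mirror_dir)
                # Assumes UTF-8; undecodable bytes are replaced, not fatal.
                with open(full, encoding="utf-8", errors="replace") as fh:
                    root.append(page_to_element(rel, fh.read()))
    ET.ElementTree(root).write(out_file, encoding="utf-8", xml_declaration=True)
```

Calling something like `export_site("mirror_folder", "site.xml")` (both paths are placeholders) would produce one `<page>` entry per mirrored file. Pages built with unusual markup or non-UTF-8 encodings may need extra handling.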
-
Will this software be able to export the site in XML, or is it basically just an offline copy?
-
I've used HTTrack Website Copier (http://www.httrack.com/) before. Searching for "website copy software" is one way to start finding tools like this.
-
That would probably work, Keri. Which tools are you referring to?
-
There are tools that will crawl and scrape your entire site and make a local copy of it. Would that work as something you could hand off to the agency?
-
I want a copy of the site content (on-page content and metadata) to give to an agency to optimize. It's a regular site hosted on an Apache server.
-
Are you talking about a WordPress blog? What are you trying to do by exporting the site content and metadata into an XML file? Are you trying to use it as a backup, or something else?