Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Export Website into XML File
-
Hi,
I am having an agency optimize the content on my sites. I need to create an XML Schema before I export the content into XML.
What is the best way to export the content of an entire site, including meta tags, and what are the steps to do it?
-
I don't know if it does anything more than make an offline copy. I haven't encountered your use case before, so I haven't looked for that. You might check whether that program or others have options that could help you.
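If the copier only produces an offline copy, the saved HTML files can still be converted to XML afterward. Below is a minimal sketch in Python (standard library only), assuming the offline copy is a folder of .html files. The element names (site, page, title, meta, content) and the function names are made up for illustration; they are not a schema the agency requires or anything HTTrack produces, so adjust them to whatever XML Schema you settle on.

```python
from html.parser import HTMLParser
from pathlib import Path
import xml.etree.ElementTree as ET


class PageParser(HTMLParser):
    """Collects the <title>, named <meta> tags, and visible text of one page."""

    SKIP = {"script", "style"}  # text inside these tags is not on-page content

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.text = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            # e.g. <meta name="description" content="...">
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth and data.strip():
            self.text.append(data.strip())


def page_to_element(path_label, html):
    """Builds one <page> element (hypothetical layout) from an HTML string."""
    parser = PageParser()
    parser.feed(html)
    parser.close()
    page = ET.Element("page", {"path": path_label})
    ET.SubElement(page, "title").text = parser.title.strip()
    for name, content in sorted(parser.meta.items()):
        ET.SubElement(page, "meta", {"name": name}).text = content
    ET.SubElement(page, "content").text = "\n".join(parser.text)
    return page


def export_site(mirror_dir, out_file):
    """Walks the mirrored folder and writes every .html page into one XML file."""
    root = ET.Element("site")
    base = Path(mirror_dir)
    for path in sorted(base.rglob("*.html")):
        html = path.read_text(encoding="utf-8", errors="replace")
        root.append(page_to_element(path.relative_to(base).as_posix(), html))
    ET.ElementTree(root).write(out_file, encoding="utf-8", xml_declaration=True)
    return len(root)  # number of pages exported
```

Calling export_site("path/to/offline-copy", "site.xml") would write one page element per HTML file. This only captures the title, named meta tags, and plain text; it drops the markup inside the body, so if the agency needs headings or links preserved, the parser would have to be extended.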
-
Will this software be able to export the site in XML, or basically just make an offline copy?
-
I've used HTTrack Website Copier (http://www.httrack.com/) before. Searching for "website copy software" is one way to start finding tools like this.
-
That would probably work, Keri. What are the tools you speak of?
-
There are tools that will crawl and scrape your entire site and make a local copy of it. Would that work as something you could hand off to the agency?
-
I want a copy of the site content (on-page content and metadata) to give to an agency to optimize. It's a regular site hosted on an Apache server.
-
Are you talking about a WordPress blog? What are you trying to do by exporting the site content and metadata into an XML file? Are you trying to use it as a backup, or something else?