Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
How would you create and then segment a large sitemap?
-
I have a site with around 17,000 pages and would like to create a sitemap and then segment it into product categories.
Is it best to create a single sitemap and then edit it in something like XMLSpy, or is there a way to silo sitemap creation from the outset?
-
Thanks Saijo,
We are trying to silo product types/categories and break them into different sitemaps. I'm familiar with SF but I don't think it will create sitemaps with the granularity that we are looking for.
I'm using XMLSpy but I'm finding it hard to break out blocks of content.
-
To my knowledge, Screaming Frog doesn't allow you to create an XML sitemap. Perhaps Excel allows you to format the output from SF, but I'm not sure. I did find a utility called XMLSpy which, though pricey, allows me to do some of the sorting I was looking for. Once sorted, I can manually pull out sections to segment my sitemap. It is a pain in the neck because I can't define a silo and have it split out automatically. That being said, I think I can develop a sitemap template and have our new web programmer develop a way to auto-generate a group of segmented sitemaps.
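For anyone going the "template plus a bit of programming" route, here is a minimal sketch of what the generated files could look like, using only Python's standard library. It assumes you already have the URLs for each category in a list; the function names and the example.com URLs are made up for illustration and are not from any particular tool. One function writes a per-category sitemap and the other writes a sitemap index that points at the per-category files.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>\n'


def build_sitemap(urls):
    """Return a <urlset> document (as a string) listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        # ElementTree escapes characters such as & in the URL text for us.
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return XML_DECLARATION + ET.tostring(urlset, encoding="unicode")


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document pointing at each per-category sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = sitemap_url
    return XML_DECLARATION + ET.tostring(index, encoding="unicode")


# Hypothetical example data: one category and the index that references it.
widgets_xml = build_sitemap([
    "https://www.example.com/widgets/blue-widget",
    "https://www.example.com/widgets/green-widget",
])
index_xml = build_sitemap_index(["https://www.example.com/sitemap-widgets.xml"])
```

You would write each returned string to its own file (for example sitemap-widgets.xml) and submit only the index file, so each category can be checked separately in the webmaster tools reports.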
Anyone know if there is a canned solution that works with IIS?
-
If your site is structured such that the URLs contain the categories you wish to sort by, you can use something like Screaming Frog ( http://www.screamingfrog.co.uk/seo-spider/ ) to export all the URLs, sort them into categories in Excel, and go that way.
NOTE: the free version has a 500 URL limit, so you might want to look at the paid version ( ask them if it can handle 17,000 URLs before getting it ) or look at http://home.snafu.de/tilman/xenulink.html ( I haven't used it myself, so I don't know if you can export from there to Excel ).
Good luck mate , sounds like you have a big job ahead of you.
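If Excel gets unwieldy at 17,000 rows, the same sort can be scripted. This is only a rough sketch and it assumes the category is the first folder in the URL path (e.g. /widgets/blue-widget); the function name, the "uncategorised" bucket, and the example.com URLs are placeholders, so adjust the rule to match how your URLs are actually built.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_urls_by_category(urls, default="uncategorised"):
    """Bucket URLs by the first folder in their path."""
    groups = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        # Assumption: only URLs with at least two path segments sit inside
        # a category folder; everything else goes to the default bucket.
        category = segments[0] if len(segments) > 1 else default
        groups[category].append(url)
    return dict(groups)


# Hypothetical crawl export, one URL per entry.
exported_urls = [
    "https://www.example.com/widgets/blue-widget",
    "https://www.example.com/gadgets/red-gadget",
    "https://www.example.com/widgets/green-widget",
    "https://www.example.com/about",
]
groups = group_urls_by_category(exported_urls)
for category, members in groups.items():
    print(category, len(members))
```

Each bucket can then be saved as its own list and turned into a separate sitemap per category.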