How to create site map for large site (ecommerce type) that has 1000's if not 100,000 of pages.
-
I know this is kind of a newbie question but I am having an amazing amount of trouble creating a sitemap for our site Bestride.com. We just did a complete redesign (look and feel, functionality, the works) and now I am trying to create a site map. Most of the generators I have used "break" after reaching some number of pages. I am at a loss as to how to create the sitemap. Any help would be greatly appreciated!
Thanks
-
I agree with Chris. With such large websites it would be advisable having a sitemap index and then splitting the index into various individual indexes such as Pages, Products, Categories, images, media, tags etc.
-
The easiest thing i can think of is to write a script that works with your dispatcher to create a site map. The format I would use is add the page and all of the "product images" on the page to the map and move to the next. At the same time I would use an auto increment variable to keep track of how many lines you have written. When you get around 50k, write out the name of the next site map file that the program will create and have them chained together this way.
-
That's a great help Chris, thank you! And thanks to all for your help!
-
Typically, a sitemap is going to include every page on the site. As Francesca said, each sitemap can be up to 50K urls and if you need multiple sitemaps then you create a sitemap index that points to the rest of the sitemaps.
-
Thanks for the feedback!
I will look into screamingfrog for sure.
@Lesley - we are using a custom platform (in house) so we don't have that functionality. The issue is that we have a lot of inventory (millions) of cars. We have built (and are releasing new functionality today) to provide internal links so that Google can crawl all the inventory easily (users can too :). My question about sitemaps has boiled down to this: Do we need to build the sitemap to include every single page (all the inventory) or do we provide a "map" so that google can find the top pages and then crawl the inventory from there. Again the site is bestride.com. If anyone wants to take a look at the site, that would be fantastic!
Thanks
-
Are you using a custom platform or an off the shelf e-commerce package? Most off the shelf packages actually have a module that can create a site map and a lot have it where you can cron it too.
-
Of course, you can also use the moz's crawl test report at http://pro.moz.com/tools/crawl-test
-
Hi Kristin,
Each sitemap.xml can support maximum 50.000 URLs. So, If you have a site with more than 100K, It'd be better to create 2 or 3 o 4 etc sitemaps.xml in order to contain all URLs. Hope it is useful.
Kind regards!
Francesca
-
You can use screamingfrog to create your sitemap. You just need to license it for crawl more than 500 URI.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Site's meta description is not being shown in Google Search results. Instead our privacy policy is getting indexed.
We re-launched our new site and put in the re-directs. Our site is https://www.fico.com/en. When I search for "fico" in Google. I see the privacy policy getting indexed as meta descriptions instead of our actual meta description. I have edited the meta description, requested Google to re-index our site. Not sure what to do next? Thanks for your advise.
Technical SEO | | gosheen0 -
How can a keyword placed on a page with the Moz page optimization score of 100 be ranked #51+?
Hi, Please help me figure out why this is happening and what goes wrong. This is the example of the poor ranked keyword - 'viking cooktop repair' with page optimization score of 100 (http://www.yourappliancerepairla.com/blog/viking-cooktop-repair/) Yet it's ranking is #51+. I've got many like these: Page Optimization Score for 'kitchenaid oven repair' is 100 (http://www.yourappliancerepairla.com/blog/kitchenaid-oven-repair/) yet its ranking is #51+ And so on. According to Google Search Console, I have 266 of links to my site with variety of root domains. While building backlinks, I paid attention to relevancy and DA.What else do I have to do to get those keywords ranked higher? And why don't they rank well if the pages are 100% optimized, not keywords stuffed and I have quality backlinks? What am I missing out on? Please help!
Technical SEO | | kirupa1 -
Why are only PDFs on my client's site being indexed, and not actual pages?
My client has recently built a new site (we did not build this), which is a subdomain of their main site. The new site is: https://addstore.itelligencegroup.com/uk/en/. (Their main domain is: http://itelligencegroup.com/uk/) This new Addstore site has recently gone live (in the past week or so) and so far, Google appears to have indexed 56 pdf files that are on the site, but it hasn't indexed any of the actual web pages yet. I can't figure out why though. I've checked the robots.txt file for the site which appears to be fine: https://addstore.itelligencegroup.com/robots.txt. Does anyone have any ideas about this?
Technical SEO | | mfrgolfgti0 -
When i type site:jamalon.com to discover number of pages indexed it gives me different result from google web master tools
when i type site:jamalon.com to discover number of pages indexed it gives me different result from google web master tools
Technical SEO | | Jamalon0 -
One page of the site disappeared from serp for a month now
Im working on a clients site and been promoting a specific page to a keyword. started to move up the ranks and exactly a month ago on the 19/5 ( on the same day of the last update) updated the main page im working on with new content and published some other new pages on related subjects that all are linking to the main page im working on ( without the same anchor text in the links ) on the same day i found out that because of a technical error the new content was published on 5 other pages of the site and obviously created a duplicate content issue and i removed all the duplicates on the same day , i assume G caught this thing and punished the site for the duplicate content issue but : when i search the page directly with site:...i can find it. its been a month since i fixed all issues that i thought could impact the page..no duplicate content on the site. no KW stuffing. no spammy links to the page. everything seems fine now my question : why is my page not showing ? how long should i wait before giving up and creating a new page .? how come my site has not lost any organic traffic ( apart from that specific page ) ? is it possible to penalize only one page ? can i recover from this at all ? thanks
Technical SEO | | nira0 -
Duplicate page titles on Ecommerce
Hi, My question is in reference to an E-commerce site- Our SEO MOZ scan is showing many errors for Duplicates- such as Duplicate titles - The majority of these are on the products map- and the page titles are Products Map :: Company Name How do we get correct this or does Google not penalize for it? Thanks.
Technical SEO | | frankrizzo0 -
Internal Links on eCommerce sites
I have been working on an eCommerce site; www.pretavoir.co.uk over the past year. Improvements in SERPs have been good with many top three positions. However, there are other important keywords of similar difficulty which refuse to behave in a similar way.... The site is PR4 and has a homepage PA 52. The homepage includes links to internal brand pages eg Prada, Gucci etc. Q Would it be worthwhile creating footer anchor text with eaxct text eg Prada sunglasses, Gucci Sunglasses?? Thanks
Technical SEO | | seanmccauley0 -
Should I create mini-sites with keyword rich domain names pointing to my main site?
Hi, I'm new to seomoz (and seo in general) and loving it so far. My main domain name is more of a brandname than a search engine friendly list of keywords. I rank well for some keywords I optimized for, and less so for the more competitive keywords. I was wondering if making one page minisites hosted on keyword rich domain names could help in this respect? What I want to do is just have a single page with a few paragraphs of content and links to the main site. I am not looking for links to boost the main site, just for the minisites to do better for several keywords. Will this help? Is this ok, or against some Google policy? Can this hurt the main site rankings? Thank you! **Edit: **I noticed that sites ranking above me on the first page for some keywords have much less on-page elements than my page, have about the same domain trust and also very little inbound links. The only factor I can see is the exact match of keywords in the domain name.
Technical SEO | | Eladla1