XML Sitemap
-
Hi Mozzers,
I am about to submit a sitemap for one of my clients via Webmaster Tools. The issue is that the site has far too many URLs that I don't want Google to index, such as testing pages and auto-generated pages.
Is there a way to remove certain URLs from the XML sitemap, or is this impossible?
If it is impossible, is the only way to control these URLs to "noindex" all the pages that I don't want the search engine to see?
Thanks Mozzers,
-
That is correct; you just submit it as you would normally. There are two ways to submit the file:
-
Via the Webmaster Tools interface: Optimization -> Sitemaps -> Add Sitemap. Have you created your Webmaster Tools account yet?
-
By referencing it in your robots.txt. Just add the following on a new line: Sitemap: http://www.yourdomain.com/sitemap.xml
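As a rough sanity check for the robots.txt option, the sketch below parses a robots.txt body with Python's standard `urllib.robotparser` and lists the sitemap URLs it declares. The robots.txt content and the helper name `declared_sitemaps` are assumptions made for illustration; only the `Sitemap:` line comes from the answer above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; the domain is the placeholder used in the answer.
ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: http://www.yourdomain.com/sitemap.xml
"""


def declared_sitemaps(robots_txt):
    """Return the sitemap URLs declared in a robots.txt body (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present (Python 3.8+).
    return parser.site_maps() or []


print(declared_sitemaps(ROBOTS_TXT))
```

If the list comes back empty, the `Sitemap:` line is probably missing or misspelled.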
-
-
Hi Greg,
Since I am not an expert on sitemaps yet: once I finish removing the URLs I don't want, should I just save the document in the text editor? And how do I then submit it in Webmaster Tools?
Is it just "Add Sitemap" and entering the name of the document, "www.example.com/sitemap.xml", or is there another step I should be aware of?
Thank you,
-
Great question! You can manually remove pages from the sitemap by opening it in a text editor of your choice, removing the offending entries, and saving the file. Make sure you use a simple text editor like Notepad on Windows or TextWrangler on Mac.
It is best to do this before you submit, rather than adding pages which you know you don't want indexed and then having to ask Google to remove them.
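If there are too many entries to delete by hand, the same clean-up can be scripted. This is only a minimal sketch using Python's standard `xml.etree.ElementTree`; the sample sitemap, the `/testing/` path, and the function name `remove_urls` are made up for illustration, and it assumes a standard sitemaps.org `urlset` file.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical sitemap with one page to keep and one testing page to drop.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/products/</loc></url>
  <url><loc>http://www.example.com/testing/page1.html</loc></url>
</urlset>
"""


def remove_urls(sitemap_xml, unwanted_substrings):
    """Drop every <url> entry whose <loc> contains one of the given substrings."""
    # Keep the default namespace unprefixed when the tree is written back out.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    # Copy the list first so removing entries does not disturb the iteration.
    for url in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = (url.findtext(f"{{{SITEMAP_NS}}}loc") or "").strip()
        if any(part in loc for part in unwanted_substrings):
            root.remove(url)
    return ET.tostring(root, encoding="unicode")


cleaned = remove_urls(SAMPLE_SITEMAP, ["/testing/"])
print(cleaned)
```

Save the result as sitemap.xml and submit that file, so the unwanted URLs never reach Google in the first place.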
-
You should absolutely be able to exercise complete control over which URLs are contained in your sitemap. How you do it depends on your sitemap software, and there are hundreds of software solutions available.
Regardless of the sitemap, you should definitely noindex the pages you do not wish to appear in search results. A robots.txt entry is definitely not the best solution, because it blocks crawling rather than indexing: a blocked URL can still appear in results if other pages link to it.
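To confirm that a page really carries the noindex directive, something like the following sketch could be run against its HTML. It uses Python's standard `html.parser`; the class and function names are hypothetical, and it only looks at `<meta name="robots">` tags, not at crawler-specific meta names or the `X-Robots-Tag` HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma separated, e.g. "noindex, follow".
            self.directives.update(part.strip().lower() for part in content.split(","))


def has_noindex(html):
    """Return True if the page's robots meta tag includes noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
```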
-
As you mentioned, you could either use robots.txt or submit a URL removal request through Webmaster Tools. I've used both methods.
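For the robots.txt route, a rule can be tested before it goes live. The sketch below is only an illustration with Python's standard `urllib.robotparser`; the `/testing/` directory, the domain, and the helper name `is_crawlable` are assumptions. Keep in mind that a Disallow rule stops crawling and does not by itself remove a URL from the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules that keep crawlers out of a testing directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /testing/
"""


def is_crawlable(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt body lets the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(is_crawlable(ROBOTS_TXT, "http://www.yourdomain.com/products/"))
print(is_crawlable(ROBOTS_TXT, "http://www.yourdomain.com/testing/page.html"))
```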