How to determine if my site map needs work?
-
I recently spoke to a consultant at a search conference who took a look at my sitemap and said it looked like Google would have a hard time crawling the site and indexing new pages and changes. I manage an ecommerce site with a bunch of products, but I am not an XML expert by any means, so I'd appreciate any advice on what to look for in the sitemap that could be affecting Google's ability to crawl and index the site.
-
I think so. You can always run it through a sitemap validator like this one: http://www.validome.org/google/ That should tell you most of what you need to know. The 404s are pretty normal as long as you are monitoring them and making sure that they are 404s you intend. For pages that should not return a 404, you might create 301 redirects, etc.
Hope that helps!
-
Thanks for the response. The sitemap is loaded in Webmaster Tools, and Google says there are only a handful of errors (fewer than 25, and all look like 404 responses due to a page not existing). Based on this, it sounds like all is okay, right?
-
Have you loaded the sitemap into Google Webmaster Tools? That's a good starting point because Google will tell you where most of the problems are and give you suggestions on how to fix them. One of the most common problems I have seen is product names or descriptions containing characters that XML doesn't like, e.g. unescaped HTML markup or a bare ampersand.
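To illustrate the point about characters XML doesn't like, here is a minimal Python sketch (it is not something Webmaster Tools provides). The product URL on example.com and the helper names check_sitemap, build_entry and wrap_urlset are made up for illustration. It only checks that the sitemap is well-formed XML using the standard library; it does not validate against the full sitemap schema.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def check_sitemap(xml_text):
    """Return (True, list of <loc> URLs) if the XML parses,
    or (False, parser error message) if it is not well-formed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    locs = [el.text.strip() for el in root.iter("{%s}loc" % SITEMAP_NS) if el.text]
    return True, locs


def build_entry(url):
    """Build one <url> entry, escaping &, < and > in the URL."""
    return "<url><loc>%s</loc></url>" % escape(url)


def wrap_urlset(entries):
    """Wrap already-built <url> entries in a <urlset> root element."""
    return '<urlset xmlns="%s">%s</urlset>' % (SITEMAP_NS, "".join(entries))


if __name__ == "__main__":
    # Hypothetical product URL with a bare ampersand in the query string.
    raw_url = "http://www.example.com/shop?cat=shirts&size=m"

    # Pasting the URL in unescaped breaks the XML.
    broken = wrap_urlset(["<url><loc>%s</loc></url>" % raw_url])
    print(check_sitemap(broken))

    # Escaping it first keeps the sitemap parseable.
    fixed = wrap_urlset([build_entry(raw_url)])
    print(check_sitemap(fixed))
```

If the first check fails and the second passes on your own product data, unescaped characters are a likely culprit, and the fix usually belongs in whatever generates the sitemap rather than in the file itself.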
Related Questions
-
Do WordPress sites outrank Squarespace sites?
I was a big fan of WordPress. I used it for 10 years. However, because I run a very small business, the constant upkeep needed on WP started to frustrate me in the end, so I moved to Squarespace. However, I am beginning to question my decision, as one of my sites is struggling really badly, and I mean badly. The other sites are okay. So I started asking around, and most people are saying there shouldn't be a difference. A few people have said their WordPress sites always outrank their Squarespace sites. Then I read what Rand Fishkin said in the Twitter thread below, and now I am even more confused. I am very reluctant to move back to WordPress; it's just so much hassle. But at the same time, if a site doesn't get much traffic then it's useless. https://twitter.com/drew_pickard/status/991659074134556673 https://twitter.com/randfish/status/991974456477278209 Please let me know your thoughts and experience.
Web Design | | RyanUK0 -
Needs clarification: How "Disallow: /" works?
Hi all, I need clarification on this. I have noticed we have "Disallow: /" in one of our sub-directories as well as at the homepage level. How is this going to work now? Will this "Disallow: /" at the sub-directory level disallow only that directory or the entire website? If it applies to the entire website: we already have one more "Disallow: /" at the homepage level blocking a few folders. How will the two "Disallow: /" commands be handled? Thanks
Web Design | | vtmoz0 -
How much copy do you need on the homepage?
The general rule used to be around 300 words of copy on the homepage, but so many new websites now have very little copy, if any, on the homepage. Has the best practice changed here? If you include keywords in the title and header tags, is that enough to support strong SEO on the homepage...or do you need a few hundred words of copy still? Would love to hear what others think.
Web Design | | KevinBloom1 -
Web Hosting and CDN for Wordpress Site Load Speed - Suggestions Needed
We all know that website load speed is more important than ever. While I love the look and feel of parallax and WordPress, I want to do everything I can to keep the load time down. I see a lot of conflicting information regarding web hosting services, CDN services and other services (Cloudflare, for example). I am looking to hear from those with their own experience about what they think the ideal setup for a parallax WordPress site is, as far as which services to use, including:
1. Web hosting
2. CDN
3. Any other service or product that would help to provide an extremely fast site load time.
Thank you!
Web Design | | Gauge1230 -
Duplicate Content? Designing new site, but all content got indexed on developer's sandbox
An ecommerce site I'm helping is getting a complete redesign. Their developer had a sandbox version of their new site for design & testing. Several thousand products were loaded into the sandbox site. Then Google/Bing crawled and indexed the site (because the developer didn't have a robots.txt), picking up and caching about 7,200 pages. There were even 2-3 orders placed on the sandbox site, so people were finding it. So what happens now?
When the sandbox site is transferred to the final version on the proper domain, is there a duplicate content issue?
How can the developer fix this?
Web Design | | trafficmotion0 -
URLs appear in Google Webmaster Tools that I can't find on my own site?!?
Hi, I have a Magento e-commerce site (clothing), and when I had a look through some of the sections in Google Webmaster Tools I found URLs that I can't find on my site. For example, a product URL may be http://www.example.co.uk/product-url/ which is fine. That product may come in three sizes (Small, Medium, Large), and for some reason Googlebot is sometimes finding a URL like http://www.example.co.uk/product-url/1202/ which, when clicked on, is a live URL (status code: 200) for one of the sizes (Medium). However, I have run a site crawl in Screaming Frog and other crawl tests and can't seem to find where Googlebot is finding these URLs. I think I need to: 1. Find out how Googlebot is finding these URLs. 2. Find out how to keep them out of the index (e.g. robots.txt, canonical, etc.). Any help would be much appreciated, and I'm happy to share the URL with members if they think they can have a look and help with this problem. I can share specific URLs, which might make the issue clearer. Let me know. Thanks, Darrell
Web Design | | clickyleap0 -
Need some advice on choosing categories.
We have a website where we do a daily one-minute video about our son, who was born with Down syndrome. When I started the site I was going to do a daily video and put all of those in a category called "Noah's Minute." Cute title, but it doesn't really tell anyone what it's about. (Oh, what I've learned in the last year.) I was going to put no text on those video posts, just the daily one-minute video. I wanted it to tell a story, in order, without me adding to it with words. Then I was going to have some other categories where I wrote information about Down syndrome: therapy posts, medical posts, best toys, parenting tips/encouragement, etc. I've been running this site for almost a year and I now have a much better idea of the type of content I'll be posting, what people are interested in, etc. I now write content for each of the videos, and no longer group them under "Noah's Minute." If you check out some of the posts, you'll see I try to be very intentional with each post, and try to make each one centered on a specific topic / keywords. I'm now having to go through almost a year of posts that were under "Noah's Minute" and reorganize them; however, I'm having a problem coming up with categories for the posts. I have some of them under the category of "Therapy" since a lot of our readers are interested in checking out the different posts we do with Noah for his developmental therapy. But the other posts are much more "general," I guess. For instance, a lot of our posts are just me telling our story and giving general parenting advice / encouragement. But having a category called "Parenting" seems too vague, and also every post I write could be considered "parenting." I'm wondering if someone would mind checking out some of our content and giving me some advice on how to organize the posts. There is a lot of great info on our site, and many people ask me questions about things that are on the site but that they just didn't know were there.
So I want people to be able to find it more easily. Also, how "detailed" do I have to be in the naming of my categories for SEO purposes? For instance, the category called "Therapy" is great for people who find our site, since it's a given that the category will be dealing with "Down Syndrome Therapy," but do I have to name the category "Down Syndrome Therapy" in order for people to find it via search? If so, that would get old quickly for my readers: "Down Syndrome Therapy," "Down Syndrome Toys," "Down Syndrome Books," etc. Anyway, I'm not sure where to go from here. Thoughts, feedback, suggestions?
Web Design | | NoahsDad0 -
Duplicate Content Problem on Our Site?
Hi, Having read the SEOMOZ guide, and having already been worried about this, I have decided to look further into it. Our site is 4-5 years old and was poorly built by a rogue firm, so we have to stick with what we have for now. Where I think we might be getting punished is duplicate content across various pages. We have a Brands page, linked at the top of the page. Here we are meant to enter each brand we stock and a little write-up on that brand. What we put in these write-ups is then used on each brand's item page when we click a brand name on the left nav bar, or when we click a Product Type (e.g. Footwear) and then click on a brand filter on the left. So this in theory is duplicate content. The SEO title and meta description for each brand are then used on the Brands page and also on each page with that brand's products on it. As we have entered this brand info, you will notice that the page www.designerboutique-online.com/all-clothing/armani-jeans/ has the same brand description in the scroll box at the top as the page www.designerboutique-online.com/shirts/armani-jeans/ and all the other product type pages, with the same SEO title and the same meta description. Only the products change from one page to the next. This then applies to each brand we have (at least 15) across about 8 pages, all with different URLs but the same text. I'm not sure how a 301 or rel canonical would work for this, as each URL needs to point at specific pages (e.g. shirts, shorts, etc.). Some brands, such as Creative Recreation and Cruyff, only sell footwear, so technically, I think, we could 301 to the Footwear/ URL rather than having both all-clothing and footwear file paths? This surely must be down to the bad design? Could we be losing valuable rank and juice because of this issue? And how would I go about fixing it? I want a new site, but funds are tight. But if this issue is so big that only a new site would fix it, then maybe the money would need to come forward. What do people make of this? Cheers, Will
Web Design | | YNWA0