Crawl diagnostics: how important are these 2 types of errors, and what should I do?
-
Hi,
I am trying to SEO-optimize my website, dreamestatehuahin.com. When I saw the Moz crawl diagnostics for the site, I got a big surprise due to the high number of errors. I don't know whether these are the kind of errors that need to be taken very seriously in my particular case.
When I look at the details, I can see the errors are caused by the way my WordPress theme is put together. I don't know how to resolve this, but if it is important I might hire a programmer.
- DUPLICATE ERRORS (40 ISSUES HIGH PRIORITY ACCORDING TO MOZ)
They are all the same as this one:
http://www.dreamestatehuahin.com/property-feature/restaurent/page/2/
is equal to this one:
http://www.dreamestatehuahin.com/property-feature/restaurent/page/2/?view=list
This one exists:
http://www.dreamestatehuahin.com/property-feature/car-park/
while one level up doesn't exist:
http://www.dreamestatehuahin.com/property-feature/
- DUPLICATE PAGE TITLE (806 ISSUES MEDIUM PRIORITY ACCORDING TO MOZ)
This is related to search results and pagination.
E.g., the title for each of these pages is the same:
http://www.dreamestatehuahin.com/property-search/page/1
http://www.dreamestatehuahin.com/property-search/page/2
http://www.dreamestatehuahin.com/property-search/page/3
http://www.dreamestatehuahin.com/property-search/page/4
- TITLE ELEMENT IS TOO LONG (405 ISSUES)
http://www.dreamestatehuahin.com/property-feature/fitness/?view=list
These are not what I consider real pages, but maybe they actually are pages to Google.
The title from the source code is auto-generated, and in this case it makes no sense:
<title>Fitness Archives - Dream Estate Hua Hin | Property For Sale And RentDream Estate Hua Hin | Property For Sale And Rent</title>
I know at the moment there are probably more important things for our website, like content, titles, meta descriptions, and internal and external links. We are looking into this and taking the whole optimization seriously. For instance, we have just hired a content writer to rewrite and create new content based on keyword research.
I WOULD REALLY APPRECIATE SOME EXPERIENCED PEOPLE'S FEEDBACK ON HOW IMPORTANT IT IS THAT I FIX THESE ISSUES, IF AT ALL POSSIBLE.
Best regards,
Nicolaj
-
Hi Nicolaj,
I am happy I could be of help. By the way, GetFlywheel can put you in a Singapore data center.
In the crawl results you can click on the arrow next to each link for more information. For instance, you are pointing Google at a non-indexed page with the canonical tag; this is just one example of what you can see.
All best,
Thomas
-
Thanks Thomas, this was so helpful. I really appreciate it.
You have shared some good knowledge and pointed me in the right direction. Great tips on articles and tools as well.
Best regards,
Nicolaj
-
Hi Nicolaj,
I have done a separate crawl of your site and have posted information and links below, along with the answers to your questions.
#1
In terms of duplicate content, Google knows which page you intend it to index because of the canonical tag pointing to it; you can see that here:
Many of the issues you are having are answered by Dan Shure in this excellent post summing up best practices for WordPress SEO:
http://moz.com/blog/setup-wordpress-for-seo-success
http://www.dreamestatehuahin.com/property-feature/restaurent/page/2/?view=list
Change in your .htaccess file:
RewriteRule ^(.*)$ /index.php?/$1 [L]
To:
RewriteRule ^(.*)$ /index.php/$1 [L]
http://www.webconfs.com/url-rewriting-tool.php
This will fix huge problems on large sites that can be caused by having that "?".
(I am talking about very large sites.)
If you are using Nginx (in my opinion a faster alternative to Apache), you can use this tool to convert any .htaccess rules: http://winginx.com/en/htaccess
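To make the fix above concrete, here is a minimal sketch of the relevant .htaccess section, assuming a standard Apache mod_rewrite setup (the surrounding rules on your own site may differ):

```apacheconf
# Enable the rewrite engine (usually already present in the file)
RewriteEngine On

# Before: the "?" turns the whole path into a query string,
# which can spawn duplicate parameterized URLs on large sites
# RewriteRule ^(.*)$ /index.php?/$1 [L]

# After: pass the path straight to index.php instead
RewriteRule ^(.*)$ /index.php/$1 [L]
```

The [L] flag stops processing further rules once this one matches, so keep it last among your rewrite rules.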
#1B
This one exists
http://www.dreamestatehuahin.com/property-feature/car-park/
while one level up doesn't exist:
http://www.dreamestatehuahin.com/property-feature/
This is an issue where your /property-feature/ URL is returning a 404.
#2
DUPLICATE PAGE TITLE (806 ISSUES MEDIUM PRIORITY ACCORDING TO MOZ)
This is related to search results and pagination.
E.g., the title for each of these pages is the same:
http://www.seobythesea.com/2011/11/google-granted-patent-hostname-mirrors/
"Paginated pages aren’t pages that contain duplicate content, but will sometimes contain duplicated titles and duplicated meta descriptions based upon things like a content management system that you might be using." Bill Slawski
However, they can become a huge issue on larger sites.
Yes, they can be a very large problem on big sites; if you think about it, Google does not get the right signals. I have two clients with sites of over half a million pages, and this is one of the largest issues I have ever run across for very big sites.
http://www.slideshare.net/ericenge/pagination-and-seo-making-it-easy
This is a very complicated issue. If it is something affecting your site and your crawl budget, then I would create single, non-paginated pages.
http://googlewebmastercentral.blogspot.com/2011/09/pagination-with-relnext-and-relprev.html
Many times it is better not to use pagination and instead create a single page with its own unique title. It depends on your website.
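If you do keep pagination, the rel="next"/rel="prev" markup described in the Google post above looks like this in the page's <head> (a sketch for a hypothetical page 2 of the property search; your theme or an SEO plugin would output these tags):

```html
<!-- On /property-search/page/2/ (URLs are illustrative) -->
<link rel="prev" href="http://www.dreamestatehuahin.com/property-search/page/1/" />
<link rel="next" href="http://www.dreamestatehuahin.com/property-search/page/3/" />
```

The first page of a series gets only rel="next", the last page only rel="prev"; this tells Google the pages form one sequence rather than duplicates.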
#3
TITLE ELEMENT IS TOO LONG (405 ISSUES)
http://www.dreamestatehuahin.com/property-feature/fitness/?view=list
"These are not what I consider real pages, but maybe they actually are pages to Google.
The title from the source code is auto-generated, and in this case it makes no sense:
<title>Fitness Archives - Dream Estate Hua Hin | Property For Sale And RentDream Estate Hua Hin | Property For Sale And Rent</title>"
You should handwrite your title tags and take extreme care in their creation. They are a very strong signal to Google.
You have too many words, and the words "property" and "estate" each appear twice; this looks spammy.
For better results use this guide to writing title tags. Do not allow them to be auto generated.
Please read this http://moz.com/learn/seo/title-tag
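As an illustration only (not a keyword-researched title; the wording here is a made-up example), a handwritten title for that page could drop the duplicated brand string and stay under roughly 60 characters:

```html
<!-- Before: brand name duplicated, far too long -->
<title>Fitness Archives - Dream Estate Hua Hin | Property For Sale And RentDream Estate Hua Hin | Property For Sale And Rent</title>

<!-- After (illustrative sketch): one brand mention, about 51 characters -->
<title>Properties with Fitness Room - Dream Estate Hua Hin</title>
```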
Your title tag is too long; regardless, it is not a wise practice to use as long a title as you have there. You use the word "property" far too much in the title tag. You are okay as far as the URL goes:
http://www.dreamestatehuahin.com/property-feature/fitness/?view=list
as it does use a canonical tag, it is okay.
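For reference, the canonical tag that makes the ?view=list variant okay looks like this in the page's <head> (a sketch; your theme or plugin generates the actual tag):

```html
<!-- On /property-feature/fitness/?view=list, pointing Google at the clean URL -->
<link rel="canonical" href="http://www.dreamestatehuahin.com/property-feature/fitness/" />
```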
As far as hiring a programmer, that is up to you, but your site is deeply in need of better coding and hosting: your page load time is over 10 seconds. It took me a long time to do any research on your site at all. This will kill your conversions; I would have left if I were browsing for any reason other than to help.
Just trying to troubleshoot your page takes forever. I thought it was my browser, then my second browser, and then I did a speed test. Not to get off subject, but get that fixed ASAP:
http://tools.pingdom.com/fpt/#!/dMSPsX/http://www.dreamestatehuahin.com/property-feature/fitness/
I would use guides like this to clean up a lot of what is wrong:
http://www.feedthebot.com/pagespeed/
In addition, I would host it with a managed WordPress hosting company.
GetFlywheel, WP Engine, Pagely, Pressable, PressLabs & WebSynthesis are all great companies.
GetFlywheel is a fantastic deal at USD 15 a site, and every site gets its own fully WordPress-optimized SSD VPS. I have accounts with every company above, and that is my opinion.
How are you doing overall with getting traffic? And with converting that traffic into leads?
It would be wise in my opinion to hire a company to help you with the development / programming and SEO.
Let me know if you have any other questions.
Please remember page speed is about pleasing the end user; they will click the back button if your site will not load in under 15 seconds (something it has yet to do in my testing below):
http://tools.pingdom.com/fpt/#!/dugdXo/http://www.dreamestatehuahin.com/
I know speed is a small part of Google's algorithm; the reason I am bringing it up is that the site is far too slow for normal users to browse without leaving. I am certain that if you sped your site up, your conversion results would be a lot better.
You wrote: "I know at the moment there are probably more important things for our website, like content, titles, meta descriptions, and internal and external links. We are looking into this and taking the whole optimization seriously. For instance, we have just hired a content writer to rewrite and create new content based on keyword research."
I believe in taking care of the entire site; obviously you want to start with the most critical items and do things a section at a time so you can see the results. It is fantastic that you have somebody creating content; that is very important. What methods did you use to obtain these keywords?
External and internal links are extremely serious; without a proper backlink profile, your site simply will not be seen as important by Google.
Part of what you are talking about is your title. Title tags that are too long will affect you because Google will only show a certain number of pixels. I would pay close attention to what this page says, along with what the tool shows you in this photo (larger version here: http://imgur.com/KFtAY6m.png).
Title tags are critical, backlinks are critical, and the content you create must be something people will share, like enough to +1, and, more importantly, link to in order to have real value.
I would suggest a complete site audit
http://www.feedthebot.com/titleandalttags.html
I understand this is the second time I have linked it, but use this:
http://moz.com/learn/seo/title-tag
I would recommend using these fantastic tools
Deep Crawl
An incredible tool for finding issues with sites like yours, or any site up to any size; great for huge websites. Because it is hosted on DeepCrawl's cloud servers and not your local computer, you do not have to worry about your computer's RAM (this only becomes a problem with extremely large sites, over 1 million URLs in my experience, but it all depends on your computer, of course).
It starts at USD 80 a month and will crawl 100,000 URIs; however, they must be crawled within that month. Packages go up in price by quite a bit after that, but you get a lot more crawls as well.
Another extremely similar, but not cloud-based, tool called Screaming Frog has a free version as well as a paid version. The free version will crawl up to 500 pages, and the tool can be used on Mac, PC & Ubuntu.
The cost is a one-time fee of 100 British pounds (approximately 170 USD). You do have to renew the license to get updates, but that is only once a year, and it is worth every cent.
The only thing you have to worry about with a local installation is your computer's specifications, most importantly RAM (this only becomes a problem with extremely large sites, but it all depends on your computer, of course).
You can crawl unlimited URIs, and your license to update the tool expires after 365 days; it is a true bargain.
This is a fantastic guide by SEER Interactive to doing almost anything with Screaming Frog. In fact, because of the two tools' similarities, I found this guide applies to both:
http://www.seerinteractive.com/blog/screaming-frog-guide
http://www.screamingfrog.co.uk/seo-spider/
I use that in combination with the tools below. Before people think I am a tools-only type of person, believe me, I am not.
Tools are not designed to do the work for you, simply to make some of it easier; most of the work is done by learning. You can get much more out of Moz's learning resources, Distilled U, and other great material than you can by hitting a button, but the combination is a synergy.
For complete audits I recommend
Ahrefs, Moz (all tools), Majestic SEO, AuthorityLabs, SERPs, DeepCrawl, Screaming Frog, BrightEdge, AnalyticsSEO, Searchmetrics, Raven, SEMrush & Marin Software.
It is kind of overkill, but I believe they all add something:
http://deepcrawl.co.uk/use-cases/architecture-optimisation
Make sure that the keyword research is done by somebody who knows what they are doing. You have a site that needs a lot of love and care, but it is definitely salvageable.
You have to fix your XML sitemap. Considering you are using Yoast, I would use its sitemap over the one you are using now. You have very few links in your XML sitemap,
shown here:
http://www.dreamestatehuahin.com/sitemap.xml
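For comparison, a Yoast-generated sitemap is actually a sitemap index that links out to one sitemap per post type, along these lines (a sketch; the filenames depend on the post types registered on your site):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.dreamestatehuahin.com/post-sitemap.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://www.dreamestatehuahin.com/page-sitemap.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://www.dreamestatehuahin.com/property-sitemap.xml</loc>
  </sitemap>
</sitemapindex>
```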
This is a summary of the crawl:
https://blueprintseo.sharefile.com/d/sb3882a2f46646d49
All links below are to give you insight into the crawl.
I hope I have been of help,
Thomas