Where are the crawled URLs in Webmaster Tools coming from?
-
When looking at the crawl errors in Webmaster Tools/Search Console, where is Google pulling these URLs from? Sitemap?
-
Just to be complete: Google Search Console will list errors for URLs it found in four general locations.
-
Crawling links on your website: starting somewhere on your site and following link to link.
-
Crawling links in your sitemap.
-
Crawling URLs that used to be on your site but are no longer linked from your site or listed in your sitemap. I have seen Google keep URLs in memory and come back to hit pages that no longer come from option 1 or option 2. If you used to have a bunch of 301 redirects in place for an old version of your website, and your developer then changes something that deletes all those 301s so they become 404s, you will find those pages showing up as errors again. This is really useful, because it helps you diagnose the issue and fix it.
-
Crawling links from other sites. Sometimes this is how the URLs in #3 get crawled.
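To make the four sources above concrete, here is a minimal sketch of how you might sort a list of error URLs by where Google probably found them. It assumes you already have the sitemap XML as a string and a list of link targets from your own crawl of the site; the function names and the `internal_link_targets` input are hypothetical, not anything Search Console provides.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> URLs listed in a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text}


def bucket_error_urls(error_urls, sitemap_xml, internal_link_targets):
    """Group error URLs by their most likely source.

    internal_link_targets is an assumption: a list of URLs your own
    crawler found linked from pages on your site.
    """
    in_sitemap = sitemap_urls(sitemap_xml)
    internal = set(internal_link_targets)
    buckets = {"sitemap": [], "internal_links": [], "legacy_or_external": []}
    for url in error_urls:
        if url in in_sitemap:
            buckets["sitemap"].append(url)
        elif url in internal:
            buckets["internal_links"].append(url)
        else:
            # Neither in the sitemap nor linked internally: likely an old
            # URL Google remembers, or one linked from another site.
            buckets["legacy_or_external"].append(url)
    return buckets
```

Anything that lands in the last bucket is a candidate for sources 3 and 4, which is where deleted 301s tend to show up.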
Here is what really sucks about Search Console, and I mean sucks big bananas, if you are trying to diagnose an issue. On the Search Console error page you can click a URL in the report, and a box pops up where you can click the "Linked From" tab to see which pages link to the URL in question. That is good! But if you then download the CSV, all of that info is lost. If you have more than 20 errors to deal with, you have no practical way to manage them and see whether there is a trend. You are left clicking a lot of links in the report, taking lots of notes, and going a little insane.
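The CSV will not give you the "Linked From" data back, but you can at least look for trends in the URLs themselves. A rough sketch, assuming the export has a column named `URL` (check your own file's header; the column name here is a guess):

```python
import csv
import io
from collections import Counter
from urllib.parse import urlsplit


def errors_by_section(csv_text, url_column="URL"):
    """Count error URLs per top-level path segment, most frequent first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        path = urlsplit(row[url_column]).path
        segments = [s for s in path.split("/") if s]
        # Use the first path segment as the "section"; "/" for the homepage.
        section = "/" + segments[0] if segments else "/"
        counts[section] += 1
    return counts.most_common()
```

If most of the errors sit under one section, such as an old blog or product directory, that usually points to a single cause like a removed redirect rule.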
Good luck!
-
-
I agree 100% with Dirk!
Google is going to crawl your site as long as your domain's robots.txt file and robots meta tag allow the bot to crawl it. By not telling Google anything, you are saying, "Welcome to my site, please index it."
Google Webmaster Tools is doing you a favor by saying: look, this is a problem the bot encountered while indexing your site, so look into it.
Submitting an XML sitemap to Google will definitely help show them where to look, and you can request that they index a page using Fetch as Google.
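If you want to check what your robots.txt actually allows, here is a small sketch using Python's standard `urllib.robotparser`. The helper name is made up for illustration, and the standard-library parser is not guaranteed to match Googlebot's own matching rules in every case (wildcards in particular), so treat the result as a first approximation. It also does not look at the robots meta tag.

```python
from urllib.robotparser import RobotFileParser


def can_googlebot_fetch(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


rules = "User-agent: *\nDisallow: /private/\n"
print(can_googlebot_fetch(rules, "https://example.com/private/page"))
print(can_googlebot_fetch(rules, "https://example.com/public"))
```

An empty robots.txt allows everything, which matches the point above: say nothing and the bot assumes it is welcome.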
Some good advice on how to fix the issues found:
https://www.distilled.net/blog/seo/indexation-problems-diagnosis-using-google-webmaster-tools/
A great resource on indexing & robots.txt:
https://www.distilled.net/blog/seo/advanced-seo-troubleshooting-why-isnt-this-page-indexed/
https://varvy.com/robottxt.html
I hope this helps,
Tom
-
These errors are the problems Googlebot encounters while crawling your site. A sitemap can help Googlebot crawl your site better, but it isn't strictly necessary.
rgds
Dirk