Understanding the actions needed from a Crawl Report
-
I joined SEOMOZ just last week and have not even received my first full crawl yet, but as you know, I do get the re-crawl report. It shows I have 50 301s and 20 rel canonicals. I'm still very confused as to what I'm supposed to fix. All the rel canonicals are my site's main pages, so I am equally confused about what the canonical is doing and how to set up my site properly. I'm a technical person and can grasp most things fairly quickly, but on this one the light bulb is taking a little longer to fire up.
If my question wasn't total gibberish and you can help shed some light, I would be forever grateful.
Thank you.
-
Thanks Charles I'm really happy with him
-
Thanks Woj - it helps..a little :). SEO is definitely a journey...
On another note, I just read the post on your company website regarding your process of developing the Kwasi robot logo - very interesting read, I enjoyed it.
-
The 301s are warnings and could be in place for a reason. You can also download a spreadsheet with all the crawl findings; it's really useful.
Generally, fix all the errors (in red) if there are any, fix warnings as required, and examine the notices.
For example, I have a site that has 100+ canonicals, all fine, and a couple of warnings (titles too long, but only over by 1 or 2 characters).
Hope that helps a little
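To make the triage order above concrete (errors first, then warnings as required, then notices), here is a minimal Python sketch. The row layout and the `severity`, `issue`, and `url` field names are assumptions for illustration only; they are not the actual columns of the crawl spreadsheet, and the sample rows are made up.

```python
from collections import defaultdict

# Assumed priority order: errors first, then warnings, then notices.
SEVERITY_ORDER = {"error": 0, "warning": 1, "notice": 2}


def triage(findings):
    """Group crawl findings by severity and return them in fix-first order.

    `findings` is a list of dicts with the hypothetical keys
    'severity', 'issue', and 'url'.
    """
    grouped = defaultdict(list)
    for row in findings:
        grouped[row["severity"].lower()].append((row["issue"], row["url"]))
    # Unknown severities sort last instead of raising an error.
    ordered = sorted(
        grouped, key=lambda s: SEVERITY_ORDER.get(s, len(SEVERITY_ORDER))
    )
    return [(severity, grouped[severity]) for severity in ordered]


# Made-up sample rows, only to show the ordering.
sample = [
    {"severity": "notice", "issue": "rel canonical", "url": "/"},
    {"severity": "warning", "issue": "301 redirect", "url": "/old-page"},
    {"severity": "error", "issue": "404 not found", "url": "/missing"},
]

for severity, items in triage(sample):
    print(severity, items)
```

The point is only the ordering: anything flagged as an error gets looked at first, and notices such as rel canonicals are reviewed last rather than "fixed" by default.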
Related Questions
-
Understanding Redirects and Canonical Tags in SEO: A Complex Case
Hi everyone, nothing serious here, I'm just playing around doing my experiments 🙂 but if any of you understand this chaos and what the issue was, I'd appreciate it if you tried to explain it to me.
I had a page "Linkaufbau" on my website at https://chriseo.de/linkaufbau. My .htaccess file contains only basic SEO stuff:
# removed ".html" using htaccess
RewriteCond %{THE_REQUEST} ^GET\ (.*)\.html\ HTTP
RewriteRule (.*)\.html$ $1 [R=301,L]
# internally added .html if necessary
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteCond %{REQUEST_URI} !/$
RewriteRule (.*) $1\.html [L]
# removed "index" from directory index pages
RewriteRule (.*)/index$ $1/ [R=301,L]
# removed trailing "/" if not a directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} /$
RewriteRule (.*)/ $1 [R=301,L]
# Here’s the first redirect:
RedirectPermanent /index /
My first three questions:
Why do I need this rule?
Why must this rule be at the top?
Why isn't this handled by mod_rewrite?
Now to the interesting part: I moved the Linkaufbau page to the SEO folder, https://chriseo.de/seo/linkaufbau, and set up the redirect accordingly:
RedirectPermanent /linkaufbau /seo/linkaufbau.html
I deleted the old /linkaufbau page. I requested indexing for /seo/linkaufbau in the Google Search Console. Once the page was indexed, I set a canonical to the old URL:
<link rel="canonical" href="https://chriseo.de/linkaufbau">
Then I resubmitted the sitemap and requested indexing for /seo/linkaufbau again, even though it was already indexed. Due to the canonical tag, the page quickly disappeared. I then requested indexing for /linkaufbau and /linkaufbau.html in GSC (the old, deleted page). After two days, both URLs were back in the SERPs:
https://chriseo.de/linkaufbau
https://chriseo.de/linkaufbau.html
This is the new page /seo/linkaufbau: b14ee095-5c03-40d5-b7fc-57d47cf66e3b-grafik.png
This is the old page /linkaufbau: 242d5bfd-af7c-4bed-9887-c12a29837d77-grafik.png
Both URLs are now in the search results and all rankings are significantly better than before for keywords like:
organic linkbuilding
linkaufbau kosten
linkaufbau service
natürlicher linkaufbau
hochwertiger linkaufbau
organische backlinks
linkaufbau strategie
linkaufbau agentur
Interestingly, both URLs (with and without .html) redirect to the new URL https://chriseo.de/seo/linkaufbau, which in turn has a canonical pointing to https://chriseo.de/linkaufbau (without .html). In the SERPs, when https://chriseo.de/linkaufbau is shown, my new, updated snippet is displayed. When /linkaufbau.html is shown, it displays the old, deleted page that had already disappeared from the index. I have now removed the canonical tag. I don't fully understand the process of what happened and why. If anyone has any ideas, I would be very grateful.
Best regards,
Chris
Technical SEO | | chueneke
-
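In the setup described in the post above, the old URL redirects to the new one while the new one names the old URL as its canonical, so the two signals point in opposite directions. A minimal Python sketch for spotting that pattern in a list of known redirects and canonicals could look like this. The function name `find_conflicts` and the two dictionaries are hypothetical; the URLs are the ones quoted in the post, and the sketch says nothing about how a search engine resolves the conflict.

```python
def find_conflicts(redirects, canonicals):
    """Return (source, target) pairs where `source` redirects to `target`
    while the canonical declared on `target` points back at `source`.

    `redirects` maps a URL to its final redirect destination.
    `canonicals` maps a URL to the canonical URL declared on that page.
    """
    conflicts = []
    for source, target in redirects.items():
        if canonicals.get(target) == source:
            conflicts.append((source, target))
    return conflicts


# Redirects and canonical as described in the post.
redirects = {
    "https://chriseo.de/linkaufbau": "https://chriseo.de/seo/linkaufbau",
    "https://chriseo.de/linkaufbau.html": "https://chriseo.de/seo/linkaufbau",
}
canonicals = {
    "https://chriseo.de/seo/linkaufbau": "https://chriseo.de/linkaufbau",
}

for source, target in find_conflicts(redirects, canonicals):
    print(f"{source} redirects to {target}, whose canonical points back to {source}")
```

Only /linkaufbau (without .html) is reported, because that is the URL the canonical tag names; /linkaufbau.html redirects to the new page but is not referenced by it.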
Google crawling but not indexing for no apparent reason
Client's site went secure about two months ago and chose the root domain as rel canonical (so the site redirects to https://rootdomain.com, with no "www"). The client is seeing the site recognized and indexed by Google about every 3-5 days and then not indexed until they request a "Fetch". They've been going through this annoying process for about 3 weeks now. Not sure if it's a server issue or a domain issue. They've done work to enhance .htaccess (i.e., the redirects) and robots.txt. If you've encountered this issue and have a recommendation, or have a tech site or person to recommend as a resource, please let me know. Google search engine results are respectable. One option would be to do nothing, but would SERPs then start to fall without requesting a new Fetch? Thanks in advance, Alan
Technical SEO | | alankoen1230 -
Need to de-index certain pages fast
I need to de-index certain pages as fast as possible. These pages are already indexed. What is the fastest way to do this? I have added the noindex meta tag and run a few of the pages through Search Console/Webmaster Tools (Fetch as Google) earlier today; however, nothing has changed yet. The Fetch as Google service does see the noindex tag, but it hasn't changed the SERPs yet. I know I should be patient, but if there is a faster way to get Google to de-index these pages, I want to try that. I am also considering the removal tool, but I'm unsure whether that is risky to do. And even if it's not, I understand it's not a permanent solution anyway. What to do?
Technical SEO | | WebGain0 -
Www vs non www - Crawl Error 902
I have just taken over admin of my company website and I have been confronted with crawl error 902 on the existing campaign that has been running for years in Moz. This seems like an intermittent problem. I have searched and tried many of the other solutions, and none of them seem to help. The campaign is currently set up with the URL http://companywebsite.co.uk. When I tried to do a Moz manual crawl using this URL, I got an error message. I changed the link to crawl to http://www.companywebsite.co.uk and the crawl went off without a hitch, and I'm currently waiting on the results. From testing I now know that if I go to the non-www version of my company's website, nothing happens; it never loads. But if I go to the www version, it loads right away. I know that for SEO you only want one of these URLs so you don't have duplicate content. But I thought the non-www version should redirect to the www version, not just be completely missing. I tried to set up a new campaign with the default URL being the www version, but Moz automatically changed it to the non-www version. It seems I cannot set up a new campaign that automatically crawls the www version. Does it sound like I'm on the right path to finding the cause? Or can somebody else offer up a solution? Many thanks,
Ben
Technical SEO | | ATP
-
Salvaging links from WMT “Crawl Errors” list?
When someone links to your website, but makes a typo while doing it, those broken inbound links will show up in Google Webmaster Tools in the Crawl Errors section as “Not Found”. Often they are easy to salvage by just adding a 301 redirect in the htaccess file. But sometimes the typo is really weird, or the link source looks a little scary, and that's what I need your help with.
First, let's look at the weird typo problem. If it is something easy, like they just lost the last part of the URL (such as www.mydomain.com/pagenam), then I fix it in htaccess this way:
RewriteCond %{HTTP_HOST} ^mydomain.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.mydomain.com$
RewriteRule ^pagenam$ "http://www.mydomain.com/pagename.html" [R=301,L]
But what about when the last part of the URL is really screwed up? Especially with non-text characters, like these:
www.mydomain.com/pagename1.htmlsale
www.mydomain.com/pagename2.htmlhttp://
www.mydomain.com/pagename3.html"
www.mydomain.com/pagename4.html/
How is the htaccess Rewrite Rule typed up to send these oddballs to the individual pages they were supposed to go to without the typo?
Second, is there a quick and easy method or tool to tell us if a linking domain is good or spammy? I have incoming broken links from sites like these:
www.webutation.net
titlesaurus.com
www.webstatsdomain.com
www.ericksontribune.com
www.addondashboard.com
search.wiki.gov.cn
www.mixeet.com
dinasdesignsgraphics.com
Your help is greatly appreciated. Thanks! Greg
Technical SEO | | GregB1230 -
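All four oddball URLs in the question above share one shape: a valid path ending in ".html" followed by stray characters. A minimal Python sketch of a pattern that keeps everything up to the first ".html" and drops the rest is shown below. The names `TRAILING_JUNK` and `repair` are hypothetical, and this only illustrates the regular expression; whether the same pattern behaves identically inside a RewriteRule is an assumption that would need testing on the server.

```python
import re

# Everything up to and including the first ".html", followed by at least
# one stray character. The lazy ".+?" stops at the first ".html".
TRAILING_JUNK = re.compile(r"^(?P<clean>.+?\.html).+$")


def repair(url):
    """Strip anything that follows the first '.html' in `url`.

    URLs without trailing junk are returned unchanged.
    """
    match = TRAILING_JUNK.match(url)
    return match.group("clean") if match else url


# The examples from the question.
for broken in [
    "www.mydomain.com/pagename1.htmlsale",
    "www.mydomain.com/pagename2.htmlhttp://",
    'www.mydomain.com/pagename3.html"',
    "www.mydomain.com/pagename4.html/",
]:
    print(broken, "->", repair(broken))
```

A URL that already ends in ".html" does not match, because the pattern requires at least one extra character after the extension, so a rule built on it would not touch correct links.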
HTTP 500 Internal Server Error, Need help
Hi, For a few days now, Google's crawlers have been getting 500 errors from our dedicated server whenever they try to crawl the site. Using the "Fetch as Google" tool under Health in Webmaster Tools, I get "Unreachable page" every time I fetch the homepage. Here is exactly what the Google crawler is getting:
HTTP/1.1 500 Internal Server Error
Date: Fri, 21 Jun 2013 19:52:27 GMT
Server: Apache/2.2.15 (CentOS)
X-Powered-By: PHP/5.3.3
X-Pingback: http://www.communityadvocate.com/xmlrpc.php
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
My URL is http://www.communityadvocate.com and here's the screenshot from Google Webmaster Tools: http://screencast.com/t/FoWvqRRtmoEQ
How can I fix that? Thank you
Technical SEO | | Vmezoz0 -
Odd URL errors upon crawl
Hi, I see this in Google Webmasters, and am now also seeing it here...when a crawl is performed on my site, I get many 500 server error codes for URLs that I don't believe exist. It's as if it sees a normal URL but adds this to it: %3Cdiv%20id= It's like this for hundreds of URLs. Good URL that actually exists http://www.ffr-dsi.com/food-retailing/supplies/ URL that causes error and I have no idea why http://www.ffr-dsi.com/food-retailing/supplies/%3Cdiv%20id= Thanks!
Technical SEO | | Matt10 -
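The suffix `%3Cdiv%20id=` in the question above is the percent-encoded form of `<div id=`, which suggests (though the post does not confirm it) that a piece of HTML markup is being picked up as if it were a relative link. A minimal Python sketch for flagging such URLs in a crawl list follows; the function name `has_markup_fragment` is hypothetical, and the two URLs are the ones quoted in the question.

```python
from urllib.parse import unquote


def has_markup_fragment(url):
    """Return True if the percent-decoded URL contains '<' or '>',
    which normally only happens when HTML leaks into a link."""
    decoded = unquote(url)
    return "<" in decoded or ">" in decoded


good = "http://www.ffr-dsi.com/food-retailing/supplies/"
bad = "http://www.ffr-dsi.com/food-retailing/supplies/%3Cdiv%20id="

print(unquote(bad))
print(has_markup_fragment(good), has_markup_fragment(bad))
```

Running the crawl export through a check like this would at least show whether every failing URL carries the same leaked fragment, which narrows the search to the template or script that produces it.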
Help needed please with 301 redirects in htaccess file.
In summary, we're currently having issues with our htaccess file. 301 redirects are going through to the new described URL, but in addition the new URL is followed by a ? and the old URL. How can we get rid of the ? and previous URL so they don't appear as an ending? None of the examples we've found online regarding this issue appear to work. Can anyone please offer some advice? Can we use a RewriteRule to stop this happening? Here's a summary of the htaccess file:
REDIRECT CODE BEGINS HERE
LONG LIST OF REDIRECTS, which appear to be set up perfectly fine.
REDIRECT CODE ENDS
DirectoryIndex index.php
<IfModule mod_rewrite.c>
RewriteEngine On
Options +FollowSymLinks
DirectoryIndex index.php
RewriteEngine On
RewriteCond $1 !^(images|system|themes|pdf|favicon.ico|robots.txt|index.php) [NC]
RewriteRule ^.htaccess$ - [F]
RewriteRule ^favicon.ico - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?/$1 [L]
</IfModule>
DirectoryIndex index.php
Technical SEO | | petersommertravels