Strange Webmaster Tools Crawl Report
-
Until recently, my robots.txt blocked indexing of my PDF files, all of which are manuals for products we sell. Last week I changed it to allow indexing of those files, and now my Webmaster Tools crawl report lists all of my PDFs as not found (404).
What is really strange is that Webmaster Tools lists an incorrect link structure: "domain.com/file.pdf" instead of "domain.com/manuals/file.pdf".
Why is Google indexing these particular pages incorrectly? My robots.txt contains nothing besides a disallow for an entirely different folder on my server, and my htaccess is not redirecting anything related to my manuals folder either. Even where the crawl report shows outside links supposedly pointing to the 404 URL, when I visit those third-party pages they have the correct link structure.
I hope someone can help, because my not-found errors are now up in the 500s and that can't be good.
Thanks in advance!
-
Hello,
Did you check the "Linked from" tab? Click on each error and see which pages link to it.
-
Thanks for the help Wissam!
What I have done is change all relative paths to absolute ones. Then I ran Screaming Frog and it did not pick up any 404s at all. That was last Thursday. Unfortunately, Webmaster Tools is still reporting the same style of 404s as having been discovered since then. Is there a reason why Screaming Frog and Webmaster Tools would see different crawl results?
-
Every link reported in GWT is based on a crawl, so there is either an external or an internal link pointing to these domain.com/file.pdf URLs.
So what I would do is fire up Screaming Frog or Xenu, do a full site crawl, and check the reports. You might find some pages linking with relative URLs in their a href elements.
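To see how a relative href could produce the URLs in the report, here is a minimal Python sketch using the standard library's `urljoin`. The page URLs (`products.html`, `manuals/index.html`) are hypothetical and only stand in for whatever pages actually link to the manuals.

```python
from urllib.parse import urljoin


def resolve(page_url: str, href: str) -> str:
    """Resolve an href against the page it appears on, roughly as a crawler would."""
    return urljoin(page_url, href)


# Hypothetical pages; substitute the pages that link to your manuals.
root_page = "http://domain.com/products.html"
manuals_page = "http://domain.com/manuals/index.html"

# A bare relative href resolves against the folder of the linking page.
print(resolve(root_page, "file.pdf"))
print(resolve(manuals_page, "file.pdf"))

# A root-relative href resolves the same way from any page.
print(resolve(root_page, "/manuals/file.pdf"))
```

If a page outside the manuals folder uses href="file.pdf", the first call shows why a crawler would request the file at the site root.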
If you end up in a situation where external links point to the wrong URLs, I would recommend either contacting those sites or simply adding a 301 redirect from /file.pdf to /manuals/file.pdf.
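With around 500 affected URLs, writing the redirects by hand is tedious. The sketch below is one possible way to generate them in Python; it assumes you have exported the not-found URLs from the crawl report and know which filenames exist in the manuals folder. The function name `redirect_rules` and the sample data are made up for illustration, and the output uses Apache's `Redirect 301` directive form, which you should check against your own server setup before using.

```python
import posixpath
from urllib.parse import urlparse


def redirect_rules(not_found_urls, manual_files, folder="/manuals/"):
    """Build one 'Redirect 301' line per root-level 404 that matches a real manual.

    not_found_urls: full URLs (with scheme) exported from the crawl report.
    manual_files:   set of filenames that actually exist in the manuals folder.
    """
    rules = []
    for url in not_found_urls:
        path = urlparse(url).path
        name = posixpath.basename(path)
        # Only handle files requested at the site root that exist under /manuals/.
        if path == "/" + name and name in manual_files:
            rules.append(f"Redirect 301 {path} {folder}{name}")
    return rules


# Hypothetical sample data.
errors = ["http://domain.com/file.pdf", "http://domain.com/other/page.html"]
manuals = {"file.pdf"}

for rule in redirect_rules(errors, manuals):
    print(rule)
```

Filenames containing spaces or special characters would need quoting or escaping, which this sketch does not handle.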