Strange Webmaster Tools Crawl Report
-
Up until recently I had robots.txt blocking the indexing of my PDF files, which are all manuals for products we sell. Last week I changed this to allow indexing of those files, and now my Webmaster Tools crawl report is listing all my PDFs as Not Found errors.
What is really strange is that Webmaster Tools is listing an incorrect link structure: "domain.com/file.pdf" instead of "domain.com/manuals/file.pdf".
Why is Google indexing these particular pages incorrectly? My robots.txt has nothing else in it besides a disallow for an entirely different folder on my server, and my .htaccess is not redirecting anything related to my manuals folder either. Even where the crawl report lists outside links that supposedly point to the 404 file, when I visit those third-party pages they have the correct link structure.
Hope someone can help, because right now my Not Found errors are up in the 500s and that can't be good.
Thanks in advance!
-
Hello,
Did you check the "Linked from" tab? Click on each error and see which pages are linking to that URL.
-
Thanks for the help Wissam!
What I have done is change all relative paths to absolute ones. Then I ran Screaming Frog last Thursday and it did not pick up any 404s at all. Unfortunately, Webmaster Tools is still reporting the same style of 404s as having been discovered since then. Is there a reason why Screaming Frog and Webmaster Tools would be seeing different crawl results?
-
Every link reported in GWT is based on a crawl, so there is either an external or an internal link pointing to these domain.com/file.pdf URLs.
So what I would do is fire up Screaming Frog or Xenu, do a full site crawl, and check the reports. You might find some pages using relative URLs in their a href elements.
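To illustrate how a relative href can produce the wrong URL, here is a minimal Python sketch. It assumes a bare href="file.pdf" sits on a page outside the manuals folder; the page names products.html and index.html are hypothetical, and this only mimics how a crawler resolves links, not how Googlebot itself is implemented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class HrefCollector(HTMLParser):
    """Collects the raw href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def resolve_links(page_url, html):
    """Resolve each href against the URL of the page it appears on."""
    collector = HrefCollector()
    collector.feed(html)
    return [urljoin(page_url, href) for href in collector.hrefs]


# Hypothetical markup: the same relative link placed on two different pages.
snippet = '<a href="file.pdf">Manual</a>'

# On a page at the site root, the link resolves to the root.
print(resolve_links("http://domain.com/products.html", snippet))

# On a page inside /manuals/, the same link resolves correctly.
print(resolve_links("http://domain.com/manuals/index.html", snippet))
```

The same href resolves to domain.com/file.pdf on the first page and to domain.com/manuals/file.pdf on the second, which is why a relative link that works in one template can 404 when it is reused in another.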
If you end up in a situation where external links are pointing to the wrong URLs, I would recommend either contacting those sites or just adding a 301 redirect from /file.pdf to /manuals/file.pdf.
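With hundreds of reported URLs, the redirects could be generated rather than typed by hand. The sketch below is only an illustration: it assumes an Apache server where mod_alias Redirect lines are allowed in .htaccess, that every bad URL is a PDF sitting directly under the site root, and that the file names contain no spaces. The function name redirect_rules and the sample file names are hypothetical.

```python
from pathlib import PurePosixPath


def redirect_rules(wrong_paths, target_dir="/manuals"):
    """Build one 'Redirect 301' line per root-level PDF path.

    wrong_paths: URL paths as reported in the crawl errors, e.g. "/file.pdf".
    target_dir:  folder where the PDFs actually live (assumed "/manuals").
    """
    rules = []
    for path in wrong_paths:
        p = PurePosixPath(path)
        # Only handle PDFs reported directly under the site root.
        if p.parent != PurePosixPath("/") or p.suffix.lower() != ".pdf":
            continue
        rules.append(f"Redirect 301 /{p.name} {target_dir}/{p.name}")
    return rules


# Hypothetical sample of paths copied from the crawl error report.
reported = ["/file.pdf", "/another-manual.pdf", "/about.html"]
for line in redirect_rules(reported):
    print(line)
```

The printed lines can be pasted into .htaccess after checking them against the real file names. Non-PDF paths and anything not at the root are skipped, so they never receive a redirect by accident.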