404s in Google Search Console and JavaScript
-
At the end of April, we made the switch from HTTP to HTTPS, and I was prepared for a surge in crawl errors while Google sorted out our site. However, I wasn't prepared for the surge in impossibly incorrect URLs and partial URLs that I've seen since then.
I have learned that as Googlebot grows up, he/she is now attempting to read more JavaScript and will occasionally try to parse out and "read" a URL in a string of JavaScript code where no URL is actually present. So, I've "marked as fixed" hundreds of bits like
/TRo39,
category/cig
etc., etc. But Google is also reporting hundreds of otherwise correct URLs with a .html extension, when our CMS generates URLs with a .uts extension. For example, it reports this:
https://www.thompsoncigar.com/thumbnail/CIGARS/90-RATED-CIGARS/FULL-CIGARS/9012/c/9007/pc/8335.html
when it should be:
https://www.thompsoncigar.com/thumbnail/CIGARS/90-RATED-CIGARS/FULL-CIGARS/9012/c/9007/pc/8335.uts
Worst of all, when I look at them in GSC and check the "linked from" tab, it shows they are linked from themselves, so I can't backtrack and find a common source of the error.
Is anyone else experiencing this? Got any suggestions on how to stop it from happening in the future? Last month it was 50 URLs, this month 150, so I can't keep creating redirects and hoping it goes away.
Thanks for any and all suggestions!
Liz Micik
-
Hi Liz,
What I would do as well is add rules to your robots.txt, so that crawlers respect them and don't go on a hunch trying to find new URLs that are embedded somewhere else. Usually it's not something you should worry about too much; it's just the crawler doing a good job of trying to find more content/URLs on your site.
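To get a feel for what such a robots.txt rule would and would not catch, here is a rough Python sketch. It assumes a hypothetical rule, `Disallow: /*.html$`, and a simplified reading of the `*` and `$` wildcards that Google documents for robots.txt paths. The function names are made up for illustration, and the rule would also block any legitimate .html pages on the site, so treat it as a way to test patterns rather than a recommendation.

```python
import re

def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    Simplified assumption: '*' matches any run of characters and a
    trailing '$' anchors the match at the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_disallowed(path: str, disallow_patterns: list[str]) -> bool:
    """Return True if any Disallow pattern matches the start of the path."""
    return any(robots_pattern_to_regex(p).match(path) for p in disallow_patterns)

# Hypothetical rule: block every path that ends in .html.
rules = ["/*.html$"]

bad = "/thumbnail/CIGARS/90-RATED-CIGARS/FULL-CIGARS/9012/c/9007/pc/8335.html"
good = "/thumbnail/CIGARS/90-RATED-CIGARS/FULL-CIGARS/9012/c/9007/pc/8335.uts"

print(is_disallowed(bad, rules))   # True
print(is_disallowed(good, rules))  # False
```

Running the reported 404 paths through a check like this before editing robots.txt shows whether a single pattern covers them without touching the real .uts URLs.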
Martijn.
-
Google Search Console 404s can be very irritating.
What does your robots.txt file currently say?
You can remove all instances of .html in your .htaccess file to help solve this issue. More on that here: https://alexcican.com/post/how-to-remove-php-html-htm-extensions-with-htaccess/
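Whatever mechanism ends up handling it, the underlying mapping is one pattern rather than 150 individual redirects. The Python sketch below only illustrates that mapping. It is not the .htaccess approach from the linked article, and the function name and defaults are assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit

def redirect_target(url: str, wrong_ext: str = ".html", right_ext: str = ".uts"):
    """Return the URL a 301 redirect should point to, or None if no redirect applies.

    Assumption for this sketch: every path ending in wrong_ext has a valid
    twin ending in right_ext, which should be verified before redirecting.
    """
    parts = urlsplit(url)
    if not parts.path.endswith(wrong_ext):
        return None
    # Swap only the extension at the end of the path; keep scheme, host, and query.
    new_path = parts.path[: -len(wrong_ext)] + right_ext
    return urlunsplit(parts._replace(path=new_path))

reported = "https://www.thompsoncigar.com/thumbnail/CIGARS/90-RATED-CIGARS/FULL-CIGARS/9012/c/9007/pc/8335.html"
print(redirect_target(reported))
# https://www.thompsoncigar.com/thumbnail/CIGARS/90-RATED-CIGARS/FULL-CIGARS/9012/c/9007/pc/8335.uts
```

A single server-side rewrite rule expressing the same swap would cover next month's batch as well, instead of adding redirects by hand.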