404s in GWT - Not sure how they are being found
-
We have been getting multiple 404 errors in GWT that look like this: http://www.example.com/UpdateCart.
The problem is that this is not a complete URL in our site structure; it is only part of one. The actual URL has a query string on the end, and without the query string the page does not work.
I can't figure out how Google is finding these pages. Could it be removing the query string?
Thanks.
-
Kelli - the first thing I thought was what garfield_disliker asks: have you set up Google Webmaster Tools to ignore these parameters that are important for the cart page to load?
That said, Google Webmaster Tools is run by a team that's separate from the primary search team, so it's possible that GWT is flagging an issue that isn't an actual issue for Google. Run a search in Google for "site:yourdomain.com/UpdateCart" and see what URLs Google has indexed. If they have that 404ing URL, that's not good. If they have correct URLs, it's possible that this is a Google Webmaster Tools thing.
-
Hi,
Are you using the /UpdateCart URL in goal tracking, or pushing events to analytics using this URL? I have seen GWT pick up 404s when we pushed virtual (non-existent) page views to analytics for goal tracking and similar purposes. Just a thought.
-
First, you can never be sure there are no external links. Open Site Explorer's index (and any other link analysis tool) is not a full picture, and Google doesn't always provide all the inbound links to your site. The junkier the scraper, the less likely you will see the link.
Secondly, could you provide a concrete example of this?
Where is the page (with parameters) linked from/to on your site? How is your site appending those parameters to the URL? Does it send users through a redirect to get to that URL? It might be useful to run your own crawl (w/ Screaming Frog or any other crawling software) of the site and take a look at all the internal links and the response codes.
Also have you set up Google WMT to ignore any parameters?
It's certainly possible that Google's crawlers are stripping parameters on their own.
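If you would rather check a few pages by hand before running a full crawl, here is a minimal sketch of the internal-link check described above. It only parses HTML you already have and makes no network requests. The function name `find_bare_links`, the sample markup, and the `item`/`qty` parameters are made up for illustration; only the `/UpdateCart` path comes from the question.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def find_bare_links(html, page_url, path="/UpdateCart"):
    """Return absolute links to `path` that carry no query string."""
    collector = LinkCollector()
    collector.feed(html)
    bare = []
    for href in collector.hrefs:
        # Resolve relative hrefs against the page they were found on.
        absolute = urljoin(page_url, href)
        parts = urlsplit(absolute)
        if parts.path == path and not parts.query:
            bare.append(absolute)
    return bare


# Hypothetical page source: one good link, one link missing its parameters.
sample = """
<a href="/UpdateCart?item=42&amp;qty=1">Add to cart</a>
<a href="/UpdateCart">Broken link</a>
"""

bare_links = find_bare_links(sample, "http://www.example.com/products")
print(bare_links)
```

An empty result for every internal page would suggest the bare URL is not coming from your own markup, which points back toward parameter handling, analytics, or external links.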
-
We do not dynamically inject canonicals into the page. They are also not old URLs because they have never been valid URLs.
They are all linked from internal pages, but when I look at those pages, the URL with the query string is the only URL that is being pointed to, not the partial URL. There are no external links.
Thanks,
Kelli
-
In WMT click on the URL that is 404'd and then select "linked to from". It will show you where Google is picking up the 404 error.
Are these 404 pages being linked to from an external site? Sometimes the 404s that appear in WMT are from links pointing to your domain from an external site, often one that has scraped your site.
-
Does your website dynamically inject canonical links into the page? Some content management systems will automatically generate canonicals that strip parameters from the URL. If that's the case then that might be why you wouldn't see it in your ordinary site structure.
It's also possible that it's an old URL that Google indexed which is no longer on your site or something that is linked externally somewhere, so the crawlers are finding it somewhere off site.
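To test the canonical theory on a single page, something like the rough sketch below could help. It looks for a `rel="canonical"` link in the page source and reports whether it points at the same path with the query string removed. The function name `canonical_drops_query`, the sample markup, and the `item`/`qty` parameters are assumptions for illustration, not anything from the original site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attrs.get("href")


def canonical_drops_query(html, page_url):
    """True if the page has a query string but its canonical URL does not."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.canonical:
        return False
    canonical = urlsplit(urljoin(page_url, finder.canonical))
    page = urlsplit(page_url)
    return (
        bool(page.query)
        and canonical.netloc == page.netloc
        and canonical.path == page.path
        and not canonical.query
    )


# Hypothetical page source and URL for illustration only.
page_html = '<head><link rel="canonical" href="http://www.example.com/UpdateCart"></head>'
page_url = "http://www.example.com/UpdateCart?item=42&qty=1"

stripped = canonical_drops_query(page_html, page_url)
print(stripped)
```

Run it against the rendered source of a cart page (view source in the browser, not the template), since a CMS or plugin may add the canonical tag at render time.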