404s in GWT - Not sure how they are being found

Colbys

We have been getting multiple 404 errors in GWT that look like this: http://www.example.com/UpdateCart.

The problem is that this is not a complete URL in our site structure; it is only a piece of one. The actual URL has a query string on the end, and if you take the query string off, the page does not work.

I can't figure out how Google is finding these pages. Could it be removing the query string?

Thanks.

KristinaKledzik

Kelli - the first thing I thought was what garfield_disliker asks: have you set up Google Webmaster Tools to ignore these parameters that are important for the cart page to load?

That said, Google Webmaster Tools is run by a team that's separate from the primary search team, so it's possible that GWT is flagging an issue that isn't an actual issue for Google. Run a search in Google for "site:yourdomain.com/UpdateCart" and see what URLs Google has indexed. If they have that 404ing URL, that's not good. If they have correct URLs, it's possible that this is a Google Webmaster Tools thing.

LynnPatchett

Hi,

Are you using the /UpdateCart URL in goal tracking, or pushing events to analytics using this URL? I have seen GWT pick up 404s from us pushing virtual (nonexistent) page views to analytics for goal tracking etc. Just a thought.

garfield_disliker

First, you can never be sure there are no external links. Open Site Explorer's index (and any other link analysis tool) is not a full picture, and Google doesn't always provide all the inbound links to your site. The junkier the scraper, the less likely you will see the link.

Secondly, could you provide a concrete example of this?

Where is the page (with parameters) linked from/to on your site? How is your site appending those parameters to the URL? Does it send users through a redirect to get to that URL? It might be useful to run your own crawl (w/ Screaming Frog or any other crawling software) of the site and take a look at all the internal links and the response codes.
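
If you'd rather spot-check a few pages by hand before running a full crawl, something like the rough Python sketch below could help. It only scans HTML you have already saved for plain <a> links that point at the bare path with no query string. The function name bare_links, the default path, and the "item" parameter in the usage note are made up for illustration, and it won't see links that are built by JavaScript or reached through redirects.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def bare_links(html, page_url, path="/UpdateCart"):
    """Return absolute links to `path` that carry no query string.

    `path` defaults to the URL from the question; change it as needed.
    """
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.hrefs:
        # Resolve relative hrefs against the page they were found on.
        parts = urlsplit(urljoin(page_url, href))
        if parts.path == path and not parts.query:
            found.append(parts.geturl())
    return found
```

Feed it the source of each page that links to the cart, e.g. bare_links(page_source, "http://www.example.com/"). An empty list means none of the plain links on that page point at the parameterless URL, which would suggest the bare URL is coming from somewhere else.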

Also, have you set up Google WMT to ignore any parameters?

It's certainly possible that Google's crawlers are stripping parameters on their own.

Colbys

We do not dynamically inject canonicals into the page. They are also not old URLs because they have never been valid URLs.

They are all linked from internal pages, but when I look at those pages, the URL with the query string is the only URL that is being pointed to, not the partial URL. There are no external links.

Thanks,
Kelli

Schwaab

In WMT click on the URL that is 404'd and then select "linked to from". It will show you where Google is picking up the 404 error.

Are these 404 pages being linked to from an external site? Sometimes the 404s that appear in WMT are from links pointing to your domain from an external site, often one that has scraped your site.

garfield_disliker

Does your website dynamically inject canonical links into the page? Some content management systems will automatically generate canonicals that strip parameters from the URL. If that's the case then that might be why you wouldn't see it in your ordinary site structure.
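
One way to check this for yourself is sketched below in Python. It is only a rough illustration: it looks for a rel="canonical" link in a page's saved HTML and reports whether that canonical is the same path as the page URL with the query string removed. The names CanonicalFinder and canonical_strips_query are invented for this example, and the "item" parameter in the usage note is a placeholder, not your real parameter.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attrs.get("href")


def canonical_strips_query(html, page_url):
    """True if the page has a query string but its canonical does not."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.canonical:
        return False  # no canonical tag, so nothing is being stripped
    canon = urlsplit(urljoin(page_url, finder.canonical))
    page = urlsplit(page_url)
    return bool(page.query) and not canon.query and canon.path == page.path
```

Calling canonical_strips_query(page_source, "http://www.example.com/UpdateCart?item=1") on the source of the working cart page returns True only when the canonical points at the parameterless version of the same path, which is the situation described above.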

It's also possible that it's an old URL that Google indexed and that is no longer on your site, or that it is linked from somewhere external, so the crawlers are finding it off site.
