404s in GWT - Not sure how they are being found
-
We have been getting multiple 404 errors in GWT that look like this: http://www.example.com/UpdateCart.
The problem is that this is not a URL that is part of our structure, it is only a piece. The actual URL has a query string on the end, so if you take the query string off, the page does not work.
I can't figure out how Google is finding these pages. Could it be removing the query string?
Thanks.
-
Kelli - the first thing I thought was what garfield_disliker asks: have you set up Google Webmaster Tools to ignore these parameters that are important for the cart page to load?
That said, Google Webmaster Tools is run by a team that's separate from the primary search team, so it's possible that GWT is flagging an issue that isn't an actual issue for Google. Run a search in Google for "site:yourdomain.com/UpdateCart" and see what URLs Google has indexed. If they have that 404ing URL, that's not good. If they have correct URLs, it's possible that this is a Google Webmaster Tools thing.
-
Hi,
Are you using the /updateCart url in goal tracking or pushing events to analytics using this url? I have seen GWT pick up 404's from us pushing virtual (non existing) page views to analytics for goal tracking etc. Just a thought.
-
First, you can never be sure there are no external links. Open Site Explorer's index (and any other link analysis tool) is not a full picture, and Google doesn't always provide all the inbound links to your site. The junkier the scraper, the less likely you will see the link.
Secondly, could you provide a concrete example of this?
Where is the page (with parameters) linked from/to on your site? How is your site appending those parameters to the URL? Does it send users through a redirect to get to that URL? It might be useful to run your own crawl (w/ Screaming Frog or any other crawling software) of the site and take a look at all the internal links and the response codes.
Also have you set up Google WMT to ignore any parameters?
It's certainly possible that Google's crawlers are stripping parameters on their own.
-
We do not dynamically inject canonicals into the page. They are also not old URLs because they have never been valid URLs.
They are all linked from internal pages, but when I look at those pages, the URL with the query string is the only URL that is being pointed to, not the partial URL. There are no external links.
Thanks,
Kelli -
In WMT click on the URL that is 404'd and then select "linked to from". It will show you where Google is picking up the 404 error.
Are these 404 pages being linked to from an external site? Sometimes the 404s that appear in WMT are from links pointing to your domain from an external site, often one that has scraped your site.
-
Does your website dynamically inject canonical links into the page? Some content management systems will automatically generate canonicals that strip parameters from the URL. If that's the case then that might be why you wouldn't see it in your ordinary site structure.
It's also possible that it's an old URL that Google indexed which is no longer on your site or something that is linked externally somewhere, so the crawlers are finding it somewhere off site.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Creating a help hub, not sure the best name to use, " keyword help " or " help hub "?
I've been creating new content for our site, lots of help related content, so I created a help hub section. Now the more I go through it, and look at url structure and breadcrumbs, I can't help but think I should be using a keyword in there, but also don't want to over do it, since the keyword we are shooting for is also a subsection of our site, complete with url keyword and breadcrumb. So I just don't want to have too many over redundant titles like keyword this and keyword that, so I came here to get some advice from the awesome community of folks. Keep help hub so it's: Url: site.com/help-hub/helppage1 Breadcrumb: Home > Help-Hub > Help Page 1 or Url: site.com/keyword/help/helppage1 Breadcrumb: Home > Keyword > Help > Help Page 1
Technical SEO | | Deacyde0 -
404s still showing in GWT
Hi, My client recently undertook a site migration. Since the new site's gone live GWT has highlighted over 2000 not found errors. These were fixed nearly 2 weeks ago and they're still being listed in GWT. Do I have to wait for Google to re-crawl the page before they're removed from the list? Or do I need to go through the list, individually check them and mark them as fixed? Any help would be appreciated. Thanks
Technical SEO | | ChannelDigital0 -
Why would GWT say 0 pages indexed ?
Hi Looking in GWT > Google Index > Index Status says 0 pages indexed Yes if i search manually on google for brand site is listed, and i see organic traffic from Google in analytics I take it this is likely an error in GWT and nothing to worry about ? Cheers Dan
Technical SEO | | Dan-Lawrence0 -
GWT Change of Address Keeps Failing
Followed Google's instructions for using the Change of Address Tool in GWT to move rethinkisrael.org to www.fromthegrapevine.com. I'm getting this message, "We tried to reconfirm ownership of your old site (rethinkisrael.org) and failed. Make sure your verification token is present, and try again."Even though the site is verified, we undid the DNS change, and checked the meta verification tag. The tag is correct. And, since the site is ALREADY verified there was NO way to 'veryify' in GWT again. The message in GWT says "verification successful."We redid the DNS change, tried again to do the address change and get the same error message. Any ideas?
Technical SEO | | Aggie0 -
Help Understanding GWT Message
Brief background: A few months ago, our firm exchanged blog posts with another law firm in Pennsylvania with followed links. Though we did exchange links, the posts weren't spammy. They wrote "A Floridian's Guide To A Car Accident In Pennyslvania" and we wrote one for Pennsylvanians in Florida. (The reason for this is that Personal Injury law varies drastically from state-to-state, and Florida has a ton of people who move back and forth). My question: His firm got a message from google saying our link to him violated googles' guidelines. I went and removed the link, BUT I didn't get any message saying his link to our site was a violation. Shouldn't we both have gotten messages? Perhaps, mine is "in the mail" so to speak, but I would think both would go out at the same time, so I'm wondering if there is another possible reason? Thanks, Ruben
Technical SEO | | KempRugeLawGroup0 -
4xx error - but no broken links founded by Xenu
In my SeoMoz crawl report I get multiple 4XX errors reported and they are all on the same type of links. www.zylom.com/nl/help/contact/9/ and differiate between the number at the end and the language. But I i look in the source code we nice said: <a class="<a class="attribute-value">bigbuttonblue</a>" style="<a class="attribute-value">float:right; margin-left:10px;</a>" href="[/nl/help/contact/9/?sid=9&e=login](view-source:http://www.zylom.com/nl/help/contact/9/?sid=9&e=login)" onfocus="<a class="attribute-value">blur()</a>" title="<a class="attribute-value">contact</a>"> contact a> I already tested the little helpfull tool Xenu, but this also doesn't give any broken links for the url's which I found in the 4xx error report. Could somebody give me a suggestion Why these 4xx errors keep coming? Could it be that the SeoMoz crawlers break the part ?sid=9&e=login' from the URL. Because if you want to enter the link, you first get a pop-up to fill in a login screen. Thanks for you answers already
Technical SEO | | Letty0 -
Top pages give " page not found"
A lot of my top pages point to images in a gallery on my site. When I click on the url under the name of the jpg file I get an error page not found. For instance this link: http://www.fastingfotografie.nl/architectuur-landschap/single-gallery/10162327 Is this a problem? Thanks. Thomas. JkLej.png
Technical SEO | | thomasfasting0 -
Why is there such a big discrepancy between OSE and GWT regarding # backlinks?
Hello, We have been doing some analysis around our backlink profiles for our sites and have been experiencing a massive discrepancy between what is reported as number of C class linking domains in OSE and the information returned in Google Webmaster tools. For a variety of sites OSE is reporting numbers < 10 for C class linking doamins while GWT shows >100 unique domains linking (we confirmed that the majority of these links are in different C classes) Is this simply a matter of the limited index size of OSE or could there be another explanation? It is interesting that the links that do show up in OSE a nearly exclusively sites that we own. /T
Technical SEO | | tomypro0