404s in GWT - Not sure how they are being found
-
We have been getting multiple 404 errors in GWT that look like this: http://www.example.com/UpdateCart.
The problem is that this URL is not part of our site structure; it is only a fragment of one. The actual URL has a query string on the end, and without the query string the page does not work.
I can't figure out how Google is finding these pages. Could it be removing the query string?
Thanks.
-
Kelli - the first thing I thought was what garfield_disliker asks: have you set up Google Webmaster Tools to ignore these parameters that are important for the cart page to load?
That said, Google Webmaster Tools is run by a team that's separate from the primary search team, so it's possible that GWT is flagging an issue that isn't an actual issue for Google. Run a search in Google for "site:yourdomain.com/UpdateCart" and see what URLs Google has indexed. If they have that 404ing URL, that's not good. If they have correct URLs, it's possible that this is a Google Webmaster Tools thing.
-
Hi,
Are you using the /UpdateCart URL in goal tracking, or pushing events to analytics using this URL? I have seen GWT pick up 404s from us pushing virtual (non-existent) page views to analytics for goal tracking etc. Just a thought.
-
First, you can never be sure there are no external links. Open Site Explorer's index (and any other link analysis tool) is not a full picture, and Google doesn't always provide all the inbound links to your site. The junkier the scraper, the less likely you will see the link.
Secondly, could you provide a concrete example of this?
Where is the page (with parameters) linked from/to on your site? How is your site appending those parameters to the URL? Does it send users through a redirect to get to that URL? It might be useful to run your own crawl (w/ Screaming Frog or any other crawling software) of the site and take a look at all the internal links and the response codes.
Also have you set up Google WMT to ignore any parameters?
It's certainly possible that Google's crawlers are stripping parameters on their own.
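
If you want a quick check before running a full crawl, here is a minimal Python sketch of the internal-link audit described above. It assumes you already have the HTML of a page saved or fetched; `LinkCollector`, `bare_links`, and the sample markup are hypothetical names and data made up for illustration, not part of any crawler. It only flags links to /UpdateCart that have no query string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def bare_links(html, base_url, path="/UpdateCart"):
    """Return absolute links to `path` that carry no query string."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        parts = urlsplit(absolute)
        if parts.path == path and not parts.query:
            found.append(absolute)
    return found


# Hypothetical sample markup; replace with the HTML of your own pages.
sample = """
<a href="/UpdateCart?item=42&amp;qty=1">Add to cart</a>
<a href="/UpdateCart">Broken link</a>
<a href="/about">About</a>
"""
print(bare_links(sample, "http://www.example.com/"))
```

If this returns nothing for every internal page, the bare URL is probably coming from somewhere other than your own anchor tags (scripts, forms, external links, or the crawler itself).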
-
We do not dynamically inject canonicals into the page. They are also not old URLs because they have never been valid URLs.
They are all linked from internal pages, but when I look at those pages, the URL with the query string is the only URL that is being pointed to, not the partial URL. There are no external links.
Thanks,
Kelli
-
In WMT click on the URL that is 404'd and then select "linked to from". It will show you where Google is picking up the 404 error.
Are these 404 pages being linked to from an external site? Sometimes the 404s that appear in WMT are from links pointing to your domain from an external site, often one that has scraped your site.
-
Does your website dynamically inject canonical links into the page? Some content management systems will automatically generate canonicals that strip parameters from the URL. If that's the case then that might be why you wouldn't see it in your ordinary site structure.
It's also possible that it's an old URL that Google indexed which is no longer on your site or something that is linked externally somewhere, so the crawlers are finding it somewhere off site.
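
To test the canonical theory on a single page, a rough Python sketch like the one below could help. It assumes you pass in the page's URL and its HTML yourself; `CanonicalFinder` and `canonical_strips_query` are hypothetical helper names, and the check is deliberately narrow: it only reports the case where the canonical points at the same path with the query string removed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonical = attr["href"]


def canonical_strips_query(page_url, html):
    """True if the canonical is the same path as page_url minus its query."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return False  # no canonical tag on the page
    page = urlsplit(page_url)
    canon = urlsplit(urljoin(page_url, finder.canonical))
    return bool(page.query) and not canon.query and canon.path == page.path


# Hypothetical example: a cart page whose canonical drops the parameters.
html = '<head><link rel="canonical" href="http://www.example.com/UpdateCart"></head>'
print(canonical_strips_query("http://www.example.com/UpdateCart?item=42", html))
```

A `True` result on the real cart pages would suggest the canonical tag is what is handing the parameter-free URL to the crawler.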