404s in GWT - Not sure how they are being found
-
We have been getting multiple 404 errors in GWT that look like this: http://www.example.com/UpdateCart.
The problem is that this is not a URL that exists in our site structure; it is only a fragment. The actual URL has a query string on the end, and if you strip the query string off, the page does not work.
I can't figure out how Google is finding these pages. Could it be stripping the query string?
Thanks.
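For reference, this is what stripping the query string from a URL amounts to; if Google (or a scraper) is doing this, the bare path is what gets requested and 404s. A minimal Python sketch, using only the standard library (the cart parameter names here are made up, since the real ones weren't given):

```python
from urllib.parse import urlsplit, urlunsplit

def strip_query(url):
    """Return the URL with its query string (and fragment) removed."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

# Hypothetical cart URL -- the actual parameter names are unknown
full_url = "http://www.example.com/UpdateCart?itemId=123&qty=1"
print(strip_query(full_url))  # http://www.example.com/UpdateCart
```

The stripped result is exactly the kind of partial URL showing up in the GWT crawl errors report.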
-
Kelli - the first thing I thought was what garfield_disliker asks: have you set up Google Webmaster Tools to ignore these parameters that are important for the cart page to load?
That said, Google Webmaster Tools is run by a team that's separate from the primary search team, so it's possible that GWT is flagging an issue that isn't an actual issue for Google. Run a search in Google for "site:yourdomain.com/UpdateCart" and see what URLs Google has indexed. If they have that 404ing URL, that's not good. If they have correct URLs, it's possible that this is a Google Webmaster Tools thing.
-
Hi,
Are you using the /UpdateCart URL in goal tracking, or pushing events to analytics with this URL? I have seen GWT pick up 404s caused by virtual (non-existent) pageviews pushed to analytics for goal tracking and the like. Just a thought.
-
First, you can never be sure there are no external links. Open Site Explorer's index (like any other link analysis tool's) is not a complete picture, and Google doesn't always report every inbound link to your site. The junkier the scraper, the less likely you are to see the link.
Secondly, could you provide a concrete example of this?
Where is the page (with parameters) linked from on your site? How does your site append those parameters to the URL? Does it send users through a redirect to reach that URL? It might be useful to run your own crawl of the site (with Screaming Frog or any other crawling software) and look at all the internal links and their response codes.
Also, have you set up Google WMT to ignore any parameters?
It's certainly possible that Google's crawlers are stripping parameters on their own.
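A crawler like Screaming Frog does essentially this at scale: extract every internal href from each page and request it to record the response code. A bare-bones sketch using only Python's standard library (the sample HTML below is invented for illustration):

```python
from html.parser import HTMLParser
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags as the parser walks the HTML."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links

def check_status(url):
    """Return the HTTP status code for a URL (requires network access)."""
    try:
        return urlopen(url).getcode()
    except Exception as exc:
        return getattr(exc, "code", None)  # HTTPError carries .code (e.g. 404)

sample = '<a href="/UpdateCart?itemId=123">Add</a> <a href="/about">About</a>'
print(extract_links(sample))  # ['/UpdateCart?itemId=123', '/about']
```

Feeding each crawled page through `extract_links` and each discovered URL through `check_status` would surface any internal link that actually points at the bare /UpdateCart path.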
-
We do not dynamically inject canonicals into the page. They are also not old URLs, because they have never been valid URLs.
They are all linked from internal pages, but when I look at those pages, the only URL being pointed to is the one with the query string, not the partial URL. There are no external links.
Thanks,
Kelli
-
In WMT, click on the URL that is returning a 404 and then select "linked to from". It will show you where Google is picking up the 404 error.
Are these 404 pages being linked to from an external site? Sometimes the 404s that appear in WMT are from links pointing to your domain from an external site, often one that has scraped your site.
-
Does your website dynamically inject canonical links into the page? Some content management systems will automatically generate canonicals that strip parameters from the URL. If that's the case then that might be why you wouldn't see it in your ordinary site structure.
It's also possible that it's an old URL that Google indexed which is no longer on your site, or something that is linked externally somewhere, so the crawlers are finding it off-site.
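One way to rule out parameter-stripping canonicals is to compare each page's canonical tag against the URL it was requested at. A quick sketch for pulling the canonical out of a page's HTML with Python's standard library (the example page is hypothetical):

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Capture the href of a <link rel="canonical"> tag, if present."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag == "link":
            attr_map = dict(attrs)
            if attr_map.get("rel") == "canonical":
                self.canonical = attr_map.get("href")

def find_canonical(html):
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

# Hypothetical page whose CMS strips parameters in the canonical
page = '<head><link rel="canonical" href="http://www.example.com/UpdateCart"></head>'
print(find_canonical(page))  # http://www.example.com/UpdateCart
```

If `find_canonical` returns the URL minus its query string on the cart pages, that would explain where Google is getting the partial URL.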