Mystery 404's
-
I have a large number of 404s that all have a similar structure: www.kempruge.com/example/kemprugelaw. "kemprugelaw" keeps getting stuck on the end of URLs. While I created www.kempruge.com/example/, I never created the www.kempruge.com/example/kemprugelaw page or edited permalinks to have kemprugelaw at the end of the URL. Any idea how this happens, and what I can do to make it stop?
Thanks,
Ruben
-
One by one is fine with me. I'd much prefer that to screwing up the site.
Thanks again,
Ruben
-
Hi Ruben
I'm glad that has helped you.
There is one way you could do multiple updates, BUT I would not recommend it, as doing it wrong could screw up your site. You could do it via the control panel in your site's hosting by querying your MySQL database in phpMyAdmin: do a bulk search and update for all references to www.kempruge.com that don't have http:// in front, replacing www.kempruge.com with http://www.kempruge.com.
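To make the bulk replacement concrete, here is a minimal Python sketch of the matching logic only, meant to be tried on an exported copy of the page content rather than on the live database. The function name `add_scheme` and the sample HTML are assumptions made for illustration. The pattern only touches `href` values that start with www.kempruge.com, so links that already have http:// are left alone.

```python
import re

# Matches href="www.kempruge.com or href='www.kempruge.com,
# i.e. a link whose value starts with the host and has no scheme.
MISSING_SCHEME = re.compile(r"""href=(["'])www\.kempruge\.com""", re.IGNORECASE)


def add_scheme(content):
    """Return (fixed_content, number_of_links_changed). Hypothetical helper."""
    return MISSING_SCHEME.subn(r"href=\g<1>http://www.kempruge.com", content)


# Sample content (made up): one broken link, one link that is already correct.
before = (
    '<a href="www.kempruge.com/example/">Example</a> '
    '<a href="http://www.kempruge.com/about/">About</a>'
)
after, changed = add_scheme(before)

print(changed)
print(after)
```

Checking the count of changed links against the number of 404s you expect is a cheap sanity check before touching anything real.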
Although it is a pain I know, the best way is to fix the errors one by one in the pages themselves and leave the redirects running until you are sure that Google, Bing and Yahoo have updated their indexes, then you can remove them.
If you copy http:// onto your Mac/PC clipboard, then it will make it quicker to open the link dialog and paste at the start of the URL.
Peter
-
Peter,
You're a genius! I'm almost certain that's it, because I can't remember adding "http://". Is there a way to get rid of those pages? I just 301 redirected them to where they are supposed to go, but I have a lot of redirects. When I say a lot, I mean a lot relative to how many pages I have: we have 500-something indexed pages and probably 200-something redirects. I know that many redirects slow our site down. I'd like to know if there's any better option than the 301s, if I can't just delete them.
Thanks,
Ruben
-
Hi Ruben
You mentioned: In GWT, the 404s are slightly different. They are www.kempruge.com/example/www.kempruge.com
I have seen this type of thing before, or something similar, when an absolute link has been entered into some anchor text or by itself without adding http:// before the link.
So the link has been entered as www.mydomain.com - which causes the error - but it should be entered as http://www.mydomain.com
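A small Python sketch shows why the missing http:// produces exactly the URLs reported in GWT. It uses the standard library's `urljoin`, which follows the same relative-URL resolution rules that browsers and crawlers apply. The page URL is the example from this thread, and the variable names are only illustrative.

```python
from urllib.parse import urljoin

# Page on which the link appears (the example URL from this thread).
page_url = "http://www.kempruge.com/example/"

# href entered without a scheme: it is treated as a relative path.
broken_href = "www.kempruge.com"
# href entered with a scheme: it is treated as an absolute URL.
fixed_href = "http://www.kempruge.com"

resolved_broken = urljoin(page_url, broken_href)
resolved_fixed = urljoin(page_url, fixed_href)

print(resolved_broken)  # http://www.kempruge.com/example/www.kempruge.com
print(resolved_fixed)   # http://www.kempruge.com
```

The first result has the same shape as the 404s above: the scheme-less "link" is simply appended to the path of the page it sits on.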
Your issue may be something completely different, but I thought I would post this as a possible solution.
Peter
-
In GWT, the 404s are slightly different. They are www.kempruge.com/example/www.kempruge.com
In BWT, it's the www.kempruge.com/example/kemprugelaw
In GWT, they say the 404's are coming from my site, but I couldn't find out where it says that for BWT.
Any thoughts? And thanks for helping out. This has been bothering me for a while.
Ruben
-
It says it in Webmaster Tools, does that matter? I'm going to check on where from now. Also, I know my sitemap 404's, but I can't figure out what happened. If you go here: http://www.kempruge.com/category/news/feed/ that's my sitemap. How it got changed to that, I have no idea. Plus, I can't find that page in the backend of WP to change the url back to the old one.
I tried redirecting the proper sitemap name to the one that works, but that didn't seem to work.
-
I crawled your site and didn't see the 404 errors.
I did notice that the sitemap listed in your robots.txt returns a 404, so you may want to take a look at that.
-
Are you seeing these 404s in Webmaster Tools or when crawling the site?
If it's WMT, where does it say the 404 is linked from? Click on the URL with the 404 error in WMT and select the "Linked from" tab.
Crawl the site with Screaming Frog and your user agent set to Googlebot. See if the same 404 errors are being picked up and if so, you can click on them and select the "In Links" tab to see what page the 404 is being picked up on.
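If you want to spot-check a single page's source by hand, the sketch below lists every `href` that would be resolved relative to the current page's path, which is the kind of link that can produce 404s like these. It is a standard-library illustration, not how Screaming Frog works. The class name, function name and sample HTML are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class RelativeLinkFinder(HTMLParser):
    """Collects hrefs that have no scheme and are not root-relative."""

    def __init__(self):
        super().__init__()
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name != "href" or not value:
                continue
            href = value.strip()
            # Skip absolute URLs (http:, mailto:, ...), root-relative and
            # protocol-relative paths, fragments and query-only links.
            if urlsplit(href).scheme or href.startswith(("/", "#", "?")):
                continue
            self.suspects.append(href)


def find_relative_links(html):
    finder = RelativeLinkFinder()
    finder.feed(html)
    return finder.suspects


# Made-up sample markup with two suspicious links and two healthy ones.
sample = """
<p><a href="www.kempruge.com">Home</a>
<a href="http://www.kempruge.com/example/">Example</a>
<a href="kemprugelaw">Firm</a>
<a href="/contact/">Contact</a></p>
"""
suspects = find_relative_links(sample)
print(suspects)
```

Any href this reports is worth opening in the editor to check whether http:// was left off.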
I checked the source code of some of the pages on www.kempruge.com and didn't see any relative links, which are what usually create problems like this. My bet is on a scraper copying your site and creating 404 errors when it links back to you.