Non existant URLs being generated in index
-
Hi all,
I have a pretty big problem with my site at the moment which I'm worried will have an impact on my rankings.
I've just had a crawl test done and for some reason I get a load of urls returned that don't actually exist...
For example I am getting urls like this in my crawl test and xml sitemap:
All the urls seem to start off with www.applicablejobs.com/jobs/ and there is an entry for every conceivable combination of slugs.
I can only assume that if the crawl test and an xml sitemap generator is indexing these urls then Google and other search engines probably are too.
Does anyone have any idea what might be causing this issue and what can I do to remove them from Googles index if they are?
Thanks
-
Could they be archived links from years ago?
I have the same problem. Products we used to sell but either no longer sell or are out of stock (they are made inactive in the CMS and do not appear on site) show up in some google searches and in the crawl test.
Any ideas?
Cheers
Will
-
If you search for this in Goggle: site:www.applicablejobs.com
You see 43 URLs and none of the bad ones.
-
Okay. Well in that case I cannot speak to why they are happening in the first place. To keep them out of the index you could have exclude the entire /jobs/ directory using the robots.txt. If the /jobs/ directory is needed then you'll have to track down the source of the URL generation. Sorry I can be of more help.
-
Hi Stephan,
applicablejobs.com is my url yes.
-
Is your domain "www.applicablejobs.com"? If not, it sounds like you may have been hacked and someone added some code snippet to your website. I host some personal sites on Network Solutions and one day I found some strange code snippet on just about every page of the sites I run. After removing the code I had to upload every page again but only after changing all my passwords.
As for removing them? Google has a tool to remove them. However if this is not your domain - you may want to email Google and inform them of the malicious happenings.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Resolving 301 Redirect Chains from Different URL Versions (http, https, www, non-www)
Hi all, Our website has undergone both a redesign (with new URLs) and a migration to HTTPS in recent years. I'm having difficulties ensuring all URLs redirect to the correct version all the while preventing redirect chains. Right now everything is redirecting to the correct version but it usually takes up to two redirects to make this happen. See below for an example. How do I go about addressing this, or is this not even something I should concern myself with? Redirects (2) <colgroup><col width="123"><col width="302"></colgroup>
Technical SEO | | theyoungfirm
| Redirect Type | URL |
| | http://www.theyoungfirm.com/blog/2009/index.html 301 | https://theyoungfirm.com/blog/2009/index.html 301 | https://theyoungfirm.com/blog/ | This code below was what we added to our htaccess file. Prior to adding this, the various subdomain versions (www, non-www, http, etc.) were not redirecting properly. But ever since we added it, it's now created these additional URLs (see bolded URL above) as a middle step before resolving to the correct URL. RewriteEngine on RewriteCond %{HTTP_HOST} ^www.(.*)$ [NC] RewriteRule ^(.*)$ https://%1/$1 [R=301,L] RewriteCond %{HTTPS} !on RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L] Your feedback is much appreciated. Thanks in advance for your help. Sincerely, Bethany0 -
Home Page Being Indexed / Referral URLs /
I have a few questions related to home page URLs being indexed, canonicalization, and GA reporting... 1. I can view the home page by typing in domain.com , domain.com/ and domain.com/index.htm There are no redirects and it's canonicalized to point to domain.com/index.htm -- how important is it to have redirects? I don't want unnecessary redirects or canonical tags, but I noticed the trailing slash can sometimes be typed in manually on other pages, sometimes not. 2. When I do a site search (site:domain.com), sometimes the HP shows up as "domain.com/", never "domain.com/index.htm" or "domain.com", and sometimes the HP doesn't show up period. This seems to change several times a day, sometimes within 15 minutes. I have no idea what is causing it and I don't know if it has anything to do with #1. In a perfect world, I would ask for the /index.htm to be dropped and redirected to .com/, and the canonical to point to .com/ 3. I've noticed in GA I see / , /index.htm, and a weird Google referral URL (/index.htm?referrer=https://www.google.com/) all showing up as top pages. I think the / and /index.htm is because I haven't setup a default URL in GA, but I'm not sure what would cause the referrer. I tracked back when the referrer URL started to show up in the top pages, and it was right around the time they moved over to https://, so I'm not sure what the best option is to remove that. I know this is a lot - I appreciate any insight anyone can provide.
Technical SEO | | DigMS0 -
Www to non www on a .com/blog url
hi guys, I have had to reset my site from www to non-www. via htacces and this worked out just fine.However, the /blog WordPress section will not redirect to the non-www. I have changed the config.php to non-www. However, the /blog WordPress section will not redirect to the non-www. I have changed the config.php to non-www. Does anyone have an idea as to what I need to do to force the non-www in a folder installed blog http://5starweddingdirectory.com/ http://www.5starweddingdirectory.com/blog/ Regards T
Technical SEO | | Taiger0 -
50 Duplicate URLS, but not the same
Hi According to my latest site crawl, many of my pages are showing up to 50 duplicate urls. However this isn't the case in real life. http://www.fortusgroup.com.au/browse-products/rubber-tracks/excavator-rubber-tracks/hitachi/ex-33mu.html is showing 31 duplicate URL. Examples include: http://www.fortusgroup.com.au/browse-products/rubber-tracks/excavator-rubber-tracks/parts/x430.html
Technical SEO | | JDadd
http://www.fortusgroup.com.au/browse-products/rubber-tracks/excavator-rubber-tracks/case/cx-75sr.html Obviously these URL's are very similar and I know that Moz judges URLs by 90% of their similarity, but is this affecting my actual raking on google? If so, what can I do? This pages are also very similar in code and content, so they are also showing as duplicate content etc as well. Worried that this is having an affect on my SERP rankings, as this pages arent ranking particularly well. Thanks, Ellie0 -
Carwling and indexing problems
hi, i have noticed since my site was upgraded that google is taking a long time to publish my articles. before the upgrade google would publish the article straight away, but now it takes an average of around 4 days. the article i am talking about at the moment is here http://www.in2town.co.uk/celebrities-in-the-news/stuart-hall-has-his-prison-sentence-for-sex-crimes-doubled-to-30-months now i have a blog here on blogger and the article was picked up within six mins http://showbizgossipandnews.blogspot.co.uk/2013/07/stuart-hall-has-his-prison-sentence-for.html so i am just wondering what the problem is and what i need to solve this my problem is, my site is mostly a news site so it is no good to me if google is publishing new stories every four days, any help would be great.
Technical SEO | | ClaireH-1848860 -
Temporary Redirect - on nonexistant URL
I'm getting a Temporary Redirect issue on | http://www.luckygemstones.com/botswana-legends.htm http://www.luckygemstones.com/botswana-legends.htm | http://www.luckygemstones.com/page-not-found.htm | 1 | 0 | 302 | YET! There is no such page on my site. I believe I had one once, but has been corrected for a while now. WHY is SEOMOZ picking this up as an error and how can I fix? Kathleen http://www.luckygemstones.com
Technical SEO | | spkcp1110 -
How to keep a URL social equity during a URL structure/name change?
We are in the process of making significant URL name/structure change to one of our property and we want to keep the social equity (likes, share, +1, tweets) from the old to the new URL. We have been trying many different option without success. We are running our social "button" in an iframe. Thanks
Technical SEO | | OlivierChateau0 -
We changed the URL structure 10 weeks ago and Google hasn't indexed it yet...
We recently modified the whole URL structure on our website, which resulted in huge amount of 404 pages changing them to nice human readable urls. We did this in the middle of March - about 10 weeks ago... We used to have around 5000 404 pages in the beginning, but this number is decreasing slowly. (We have around 3000 now). On some parts of the website we have also set up a 301 redirect from the old URLs to the new ones, to avoid showing a 404 page thus making the “indexing transmission”, but it doesn’t seem to have made any difference. We've lost a significant amount of traffic, because of the URL changes, as Google removed the old URLs, but hasn’t indexed our new URLs yet. Is there anything else we can do to get our website indexed with the new URL structure quicker? It might also be useful to know that we are a page rank 4 and have over 30,000 unique users a month so I am sure Google often comes to the site quite often and pages we have made since then that only have the new url structure are indexed within hours sometimes they appear in search the next day!
Technical SEO | | jack860