Non-existent URLs being generated in index

Benji87

Hi all,

I have a pretty big problem with my site at the moment which I'm worried will have an impact on my rankings.

I've just had a crawl test done, and for some reason it returned a load of URLs that don't actually exist.

For example, I am getting URLs like these in my crawl test and XML sitemap:

www.applicablejobs.com/jobs/add/android-designer/android-designer/android-designer/android-developer/android-developer/

www.applicablejobs.com/jobs/add/android-designer/android-designer/android-designer/android-developer/iphone-designer/

All the URLs seem to start with www.applicablejobs.com/jobs/, and there is an entry for every conceivable combination of slugs.

I can only assume that if the crawl test and an XML sitemap generator are picking up these URLs, then Google and other search engines probably are too.

Does anyone have any idea what might be causing this issue, and what I can do to remove the URLs from Google's index if they are in it?

Thanks

WillBlackburn

Could they be archived links from years ago?

I have the same problem. Products we used to sell but either no longer sell or are out of stock (they are made inactive in the CMS and do not appear on the site) show up in some Google searches and in the crawl test.

Any ideas?

Cheers

Will

STPseo

If you search for this in Google: site:www.applicablejobs.com

You see 43 URLs and none of the bad ones.

STPseo

Okay. Well, in that case I cannot speak to why they are being generated in the first place. To keep them out of the index, you could exclude the entire /jobs/ directory using robots.txt. If the /jobs/ directory is needed, then you'll have to track down the source of the URL generation. Sorry I can't be of more help.
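
For anyone wanting to try the robots.txt suggestion above, here is a minimal sketch in Python that checks what a "Disallow: /jobs/" rule would block before you deploy it. The robots.txt content and the helper name is_crawlable are assumptions for illustration, not anything taken from the actual site. Keep in mind that robots.txt stops compliant crawlers from fetching the URLs; it does not by itself remove anything that is already indexed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that blocks the whole /jobs/ directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /jobs/
"""


def is_crawlable(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules in ROBOTS_TXT allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# One of the generated URLs from the original post, plus the home page.
bad_url = (
    "http://www.applicablejobs.com/jobs/add/android-designer/"
    "android-designer/android-designer/android-developer/android-developer/"
)
home_url = "http://www.applicablejobs.com/"

print("generated URL crawlable:", is_crawlable(bad_url))
print("home page crawlable:", is_crawlable(home_url))
```

With this rule the generated URL is reported as blocked and the home page as allowed. Note that every real page under /jobs/ would be blocked too, which is why this only works if that directory is not needed in search results.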

Benji87

Hi Stephan,

Yes, applicablejobs.com is my URL.

STPseo

Is your domain "www.applicablejobs.com"? If not, it sounds like you may have been hacked and someone added a code snippet to your website. I host some personal sites on Network Solutions, and one day I found a strange code snippet on just about every page of the sites I run. After removing the code, I had to upload every page again, but only after changing all my passwords.

As for removing them, Google has a tool for that. However, if this is not your domain, you may want to email Google and inform them of the malicious activity.
