Non-existent URLs being generated in index
-
Hi all,
I have a pretty big problem with my site at the moment which I'm worried will have an impact on my rankings.
I've just had a crawl test done and for some reason I get a load of urls returned that don't actually exist...
For example I am getting urls like this in my crawl test and xml sitemap:
All the urls seem to start off with www.applicablejobs.com/jobs/ and there is an entry for every conceivable combination of slugs.
I can only assume that if the crawl test and an XML sitemap generator are picking up these URLs, then Google and other search engines probably are too.
Does anyone have any idea what might be causing this issue, and what I can do to remove them from Google's index if they are there?
Thanks
-
Could they be archived links from years ago?
I have the same problem. Products we used to sell but either no longer sell or are out of stock (they are made inactive in the CMS and do not appear on site) show up in some Google searches and in the crawl test.
Any ideas?
Cheers
Will
-
If you search for this in Google: site:www.applicablejobs.com
You see 43 URLs and none of the bad ones.
-
Okay. Well, in that case I cannot speak to why they are being generated in the first place. To keep them out of the index you could exclude the entire /jobs/ directory using robots.txt. If the /jobs/ directory is needed, then you'll have to track down the source of the URL generation. Sorry I can't be of more help.
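For what it's worth, here is a rough sketch of what that robots.txt rule could look like, plus a quick way to check it with Python's standard library before uploading it. The rule itself is only an assumption about what you would want, and the helper name `is_blocked` and the `/about` path are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: block every crawler from anything under /jobs/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /jobs/
"""


def is_blocked(url, robots_txt=ROBOTS_TXT, user_agent="*"):
    """Return True if the given robots.txt text disallows `url` for `user_agent`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A URL under /jobs/ should be blocked; a hypothetical page elsewhere should not.
    print(is_blocked("http://www.applicablejobs.com/jobs/some-slug/another-slug"))
    print(is_blocked("http://www.applicablejobs.com/about"))
```

Bear in mind that robots.txt only asks compliant crawlers not to fetch those paths. It does not by itself pull out URLs that are already indexed, so that part would still need a separate removal request.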
-
Hi Stephan,
Yes, applicablejobs.com is my URL.
-
Is your domain "www.applicablejobs.com"? If not, it sounds like you may have been hacked and someone added some code snippet to your website. I host some personal sites on Network Solutions and one day I found some strange code snippet on just about every page of the sites I run. After removing the code I had to upload every page again but only after changing all my passwords.
As for removing them, Google has a tool for that. However, if this is not your domain, you may want to email Google and inform them of the malicious activity.