Non-existent URLs being generated in index
-
Hi all,
I have a pretty big problem with my site at the moment, which I'm worried will have an impact on my rankings.
I've just had a crawl test done, and for some reason I get a load of URLs returned that don't actually exist.
For example, I am getting URLs like this in my crawl test and XML sitemap:
All the URLs seem to start with www.applicablejobs.com/jobs/, and there is an entry for every conceivable combination of slugs.
I can only assume that if the crawl test and an XML sitemap generator are picking up these URLs, then Google and other search engines probably are too.
Does anyone have any idea what might be causing this issue, and what can I do to remove them from Google's index if they are there?
Thanks
-
Could they be archived links from years ago?
I have the same problem. Products we used to sell but either no longer sell or have out of stock (they are made inactive in the CMS and do not appear on the site) show up in some Google searches and in the crawl test.
Any ideas?
Cheers
Will
-
If you search for this in Google: site:www.applicablejobs.com
you see 43 URLs and none of the bad ones.
-
Okay. Well, in that case I cannot speak to why they are being generated in the first place. To keep them out of the index, you could exclude the entire /jobs/ directory using robots.txt. If the /jobs/ directory is needed, then you'll have to track down the source of the URL generation. Sorry I can't be of more help.
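As a rough illustration of the robots.txt suggestion, here is a minimal sketch in Python. It assumes a hypothetical robots.txt with a single rule disallowing /jobs/ for all crawlers, and uses the standard library's `urllib.robotparser` to check which URLs that rule would block. The rule text, the helper name `is_crawlable`, and the sample URLs are illustrative assumptions, not the site's actual configuration. Note that a Disallow rule stops compliant crawlers from fetching those pages; it is not by itself a guarantee that already-known URLs disappear from an index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (an assumption for illustration):
# block every crawler from anything under /jobs/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /jobs/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Sample URLs are made up for the check; only the /jobs/ prefix comes from the thread.
    for url in (
        "http://www.applicablejobs.com/",
        "http://www.applicablejobs.com/jobs/some-slug/another-slug",
    ):
        print(url, "->", "allowed" if is_crawlable(url) else "blocked")
```

If the /jobs/ pages are actually needed in search results, a blanket rule like this would hide the real ones too, which is why tracking down the source of the generated URLs is the better long-term fix.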
-
Hi Stephan,
applicablejobs.com is my URL, yes.
-
Is your domain "www.applicablejobs.com"? If not, it sounds like you may have been hacked and someone added a code snippet to your website. I host some personal sites on Network Solutions, and one day I found a strange code snippet on just about every page of the sites I run. After removing the code, I had to upload every page again, but only after changing all my passwords.
As for removing them, Google has a tool for that. However, if this is not your domain, you may want to email Google and inform them of the malicious activity.