Moz Q&A is closed.
After more than 13 years and tens of thousands of questions, Moz Q&A closed on 12th December 2024. We are not removing the content completely, and many posts can still be viewed, but new posts and new replies are locked.
Google tries to index non-existent language URLs. Why?
Hi,
I am working for a SaaS client. The client runs two language versions on two different subdomains:
de.domain.com/company for German and en.domain.com for English. Many thousands of URLs have been indexed correctly. But Google Search Console shows Google trying to index URLs that have never existed and still do not exist:
de.domain.com/en/company
en.domain.com/de/company
...and a thousand more with /en/ or /de/ in between. We have never used this variant, and requesting these URLs correctly shows a 404 page (but with the wrong response code; we're fixing that). Still, Google tries to index these kinds of URLs again and again, and I couldn't find any source for them. No website uses them as an outgoing link, etc.
We do see in our log files that a Screaming Frog installation and moz.com with Open Site Explorer tried to access these URLs earlier. My question: how does Google come up with that? Where did they get these URLs that (to our knowledge) never existed?
Any ideas? Thanks

Hi Hecksler,
Did you ever resolve this?
A quick idea from me: double-check ALL versions of your website within Google Search Console. You can now register the entire domain as a property using DNS verification: https://searchengineland.com/how-to-set-up-google-search-console-domain-verification-for-site-wide-reporting-data-313256
I found that Google was trying to crawl a very old HTTP sitemap from about five years ago for one of my sites, and thus I was able to delete it.
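If an old sitemap is the suspect, a small script can list any entries in it that follow the unwanted /en/ or /de/ pattern. This is only a rough sketch in Python: the `PHANTOM` pattern, the function names and the sample sitemap are my own assumptions for illustration, not something taken from your setup.

```python
import re
import xml.etree.ElementTree as ET

# Assumed pattern: a language folder right after the host of a language subdomain,
# e.g. de.domain.com/en/... or en.domain.com/de/...
PHANTOM = re.compile(r"^https?://(de|en)\.[^/]+/(en|de)/", re.IGNORECASE)


def sitemap_urls(xml_text):
    """Return every <loc> value in a sitemap or sitemap index, ignoring XML namespaces."""
    root = ET.fromstring(xml_text)
    return [
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text
    ]


def phantom_urls(xml_text):
    """Return only the sitemap entries that match the unwanted language-folder pattern."""
    return [url for url in sitemap_urls(xml_text) if PHANTOM.match(url)]


# Made-up sitemap content, only to show the call.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://de.domain.com/company</loc></url>
  <url><loc>https://de.domain.com/en/company</loc></url>
</urlset>"""

print(phantom_urls(SAMPLE_SITEMAP))
```

Running it against every sitemap Google still knows about (including old HTTP ones) would show whether any of them is feeding these URLs to the crawler.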
There are mixed opinions within the search community about whether Googlebot really "guesses" URLs, so it is more likely that they are getting the links from somewhere: https://stackoverflow.com/questions/20855082/googlebot-guesses-urls-how-to-avoid-handle-this-crawling
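Since the log files already show other crawlers requesting these paths, it may help to count who asks for them and with which referrer. The sketch below assumes the server writes the common "combined" access log format and that each subdomain has its own log file; the regular expression, the function name and the sample lines (including the simplified user agent strings and the 200 status) are illustrative assumptions, not real data.

```python
import re
from collections import Counter

# Assumed "combined" log format: ip, identity, user, [time], "request", status, bytes, "referrer", "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def phantom_hits(lines, prefix):
    """Count (user agent, referrer, status) for requests whose path starts with prefix."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path").startswith(prefix):
            key = (match.group("agent"), match.group("referer"), match.group("status"))
            hits[key] += 1
    return hits


# Made-up lines from a hypothetical de.domain.com log, where /en/ should never appear.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Oct/2019:13:55:36 +0000] "GET /en/company HTTP/1.1" 200 5120 "-" "Googlebot/2.1"',
    '203.0.113.7 - - [10/Oct/2019:13:56:02 +0000] "GET /en/company HTTP/1.1" 200 5120 "-" "Screaming Frog SEO Spider/12.0"',
    '203.0.113.7 - - [10/Oct/2019:13:56:05 +0000] "GET /company HTTP/1.1" 200 8841 "-" "Screaming Frog SEO Spider/12.0"',
]

for (agent, referer, status), count in phantom_hits(SAMPLE_LOG, "/en/").items():
    print(count, status, agent, referer)
```

A referrer other than "-" on any of these hits would point to a page that links to the bad URLs, and the status column shows whether the error page really answers with a 404.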
Look forward to hearing from you,
Nick