URL indexed but not submitted in sitemap, although the URL is in the sitemap
-
Dear Community, I have the following problem, and it would be super helpful if you could help. Cheers
-
Symptoms:
-
In Search Console, Google says that some of our old URLs are indexed but not submitted in sitemap.
-
However, those URLs are in the sitemap.
-
Also, the sitemap has been submitted successfully, with no error message.
-
Potential explanations:
-
We have an automatic cache-clearing process that runs once a day, and the sitemap uses the cache-clearing time as the last modification date. Suppose www.example.com/hello was last modified in 2017. Because the cache is cleared daily, the sitemap will show last modified: yesterday, even though the content of the page has not changed since 2017.
-
We have a Z after the sitemap time. Could it be that the bot does not understand the time format?
-
The sitemap contains only HTTP URLs; our HTTPS URLs are not in the sitemap.
What do you think?
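For anyone who wants to check the second and third explanations on their own sitemap, here is a minimal Python sketch. It assumes a standard sitemaps.org `urlset` file; the sample sitemap, its timestamp, and the helper names `parse_lastmod` and `inspect_sitemap` are made up for illustration. A trailing Z is the UTC designator in the W3C datetime format that the sitemap protocol uses for `lastmod`, so a value that parses here is at least well formed. The script also lists the scheme of every `loc`, which makes an HTTP-only sitemap easy to spot.

```python
from datetime import datetime
from urllib.parse import urlsplit
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap fragment; the URL and the date are placeholders.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/hello</loc>
    <lastmod>2019-03-20T04:00:00Z</lastmod>
  </url>
</urlset>"""


def parse_lastmod(value):
    """Parse a W3C datetime string; a trailing 'Z' means UTC."""
    value = value.strip()
    # Older Python versions reject 'Z' in fromisoformat, so spell out the offset.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def inspect_sitemap(xml_text):
    """Return one dict per <url> entry with its loc, scheme and parsed lastmod."""
    root = ET.fromstring(xml_text)
    rows = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS)
        rows.append({
            "loc": loc,
            "scheme": urlsplit(loc).scheme,
            "lastmod": parse_lastmod(lastmod) if lastmod.strip() else None,
        })
    return rows


if __name__ == "__main__":
    for row in inspect_sitemap(SAMPLE_SITEMAP):
        print(row["scheme"], row["loc"], row["lastmod"])
```

If every row shows the `http` scheme while the indexed pages are served over HTTPS, the sitemap entries and the indexed URLs are simply different URLs, which would fit the symptom described above. This only checks the file itself; it says nothing about how Google weighs the `lastmod` values.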
-
-
Hi there,
I can't answer all of your questions, but Google has announced that old sitemaps can now be deleted in the new Search Console: https://www.searchenginejournal.com/google-updates-the-sitemaps-report-in-search-console-adds-ability-to-delete-sitemaps/299495/
With this feature available, there are more opportunities to test a few more sitemap submissions and to verify that all URLs have been crawled.
If you could cross-reference this with your server logs you would definitely be on to a winner, although to be fair, Googlebot crawling a URL doesn't automatically mean the URL gets indexed!
Good luck,
Nick
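The server-log cross-reference suggested above could be sketched roughly as follows in Python. This assumes access logs in the common "combined" format and matches Googlebot by user-agent string only, which can be spoofed, so treat it as an approximation. The sample log lines and the function names `googlebot_paths` and `uncrawled` are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Combined log format: the request is the first quoted field,
# the user agent is the last quoted field on the line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)

# Hypothetical log lines for illustration only.
SAMPLE_LOG = [
    '66.249.66.1 - - [20/Mar/2019:04:12:01 +0000] "GET /hello HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [20/Mar/2019:04:13:10 +0000] "GET /other HTTP/1.1" 200 2048 '
    '"-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]


def googlebot_paths(log_lines):
    """Map each path requested by a Googlebot user agent to the status codes seen."""
    hits = {}
    for line in log_lines:
        match = LOG_LINE.search(line.strip())
        if match and "Googlebot" in match.group("agent"):
            # Drop the query string so paths compare cleanly with sitemap URLs.
            path = match.group("path").split("?", 1)[0]
            hits.setdefault(path, set()).add(int(match.group("status")))
    return hits


def uncrawled(sitemap_urls, hits):
    """Return the sitemap URLs whose path never appears in the Googlebot hits."""
    return [u for u in sitemap_urls if (urlsplit(u).path or "/") not in hits]


if __name__ == "__main__":
    hits = googlebot_paths(SAMPLE_LOG)
    urls = ["http://www.example.com/hello", "http://www.example.com/other"]
    print("Crawled paths:", hits)
    print("Never requested by Googlebot:", uncrawled(urls, hits))
```

As noted in the answer, a crawl hit in the logs does not prove indexation; the output only shows which sitemap URLs Googlebot has requested at all.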
Related Questions
-
Specific page does not index
Hi, First question: I'm working on getting all pages indexed for a specific client, and there's one page that refuses to index. Google Search Console says there's a robots.txt file, but I can't find any trace of one in the backend or in the code itself. Could someone tell me why this is happening? The page: https://www.brody.be/nl/assistentiewoningen/ Second question: Google is showing a different meta description from the one our client entered in the Yoast Premium snippet. Could another plugin be overwriting this description? Or do we have to wait a certain period of time for it to change? Hope you guys can help
Intermediate & Advanced SEO | | conversal0 -
Old product URLs still indexed and maybe causing problems?
Hi all, Need some expertise here: We recently (3 months ago) launched a newly updated site with the same domain. We also added an SSL and dropped the www (with proper redirects). We went from http://www.mysite.com to https://mysite.com. I joined the company about a week after launch of the new site. All pages I want indexed are indexed, on the sitemap and submitted (submitted in July but processes regularly). When I check site:mysite.com everything is there, but so are pages from the old site that are not on the sitemap. These do have 301 redirects. I am finding our non-product pages are ranking with no problem (including category pages) but our product pages are not, unless I type in the title almost exactly. We 301 redirected all old urls to new comparable product, or if the product is not available anymore to the home page. For better or worse, as it turns out and prior to my arrival, in building the new site the team copied much of the content (descriptions, reviews, etc) from the old site to create the new product pages. After some frustration and research I am finding the old pages are still indexed and possibly causing a duplicate content issue. Now, I gather there is supposedly no "penalty", per se, for duplicate content but a page or site will simply not show in the SERPs. Understandable and this seems to be the case. We also sell a lot of product wholesale and it turns out many dealers are using the same descriptions we have (and have had) on our site. Some are much larger than us so I'd expect to be pushed down a bit but we don't even show in the top 10 pages...for our own product. How long will it take for Google to drop the old and rank the new as unique? I have re-written some pages but much is technical specifications and tough to paraphrase or re-write. I know I could do this in Search Console but I don't have access to the old site any longer. Should I remove the 301s a few at a time and see if the old get dropped faster? 
Maybe just re-write ALL the content? Wait? As a side note, I'm also on a Drupal CMS with a Shopify ecommerce module, so maybe shop.mysite.com vs mysite.com is throwing it off with the products(?) - (again, the Drupal non-product AND category pages rank fine). Thoughts on this would be much appreciated. Thx so much!
Intermediate & Advanced SEO | | mcampanaro0 -
¿Disallow duplicate URL?
Hi community, thanks for answering my question. I have a problem with a website. My website is: http://example.examples.com/brand/brand1 (good URL), but I have 2 filters that show something, and these generate 2 more URLs: http://example.examples.com/brand/brand1?show=true (if we apply one filter) http://example.examples.com/brand/brand1?show=false (if we apply the other filter) My question is: should I add a disallow for these filters in robots.txt, like this: Disallow: /*?show=*
Intermediate & Advanced SEO | | thekiller990 -
Capitals in URLs
Hello Mozzers. I've just been looking at a site with capitals in the URL - capitals are used in the product descriptions, so you'll have a URL structure like this: www.company.com/directory1/Double-Beds-Luxury (such URLs do not work if I lower the case of the capitals). There are 50,000 such products on the site. Clearly one drawback is potential customers might type in, or link to, the lower case of the URL and get a "not found" result (though the urls are relatively long so not that likely I'm thinking). Are there any additional drawbacks with the use of capitals outlined here?
Intermediate & Advanced SEO | | McTaggart0 -
Canonical url question
I just ran the SEOmoz tool and it reports duplicate content for www.mysite.com and www.mysite.com/index.php. Should I use a canonical URL for this? If yes, is this right?
Intermediate & Advanced SEO | | constructionhelpline0 -
Should I 301 Poorly Worded URL's which are indexed and driving traffic
Hi, I'm working on our site's structure and SEO at present and wondering when the benefit I may get from a well-written URL, i.e. ourDomain / keyword or keyphrase .html, would outweigh the downturn in traffic I may see from 301 redirecting an existing, not as well structured, but indexed URL. We have a number of odd-looking URLs, i.e. ourDomain / ourDomain_keyword_92.html, alongside some others that have a keyword followed by 20 underscores in a long line... My concern is that although I would like to have a keyword or key phrase sitting on its own in a well-targeted URL string, I don't want to mess too much with pages that are driving say 2% or 3% of our traffic just because my OCD has kicked in.... Some further advice on strategies I could use would be great. My current thinking is that if a page is performing well then I should leave the URL alone. Then, if I'm not 100% happy with the keyword or phrase it is targeting, I could build another page to handle the new keyword / phrase, with the aim of that moving up the rankings and eventually taking over from where the other page left off. Any advice is much appreciated, Guy
Intermediate & Advanced SEO | | guycampbell0 -
Is it OK to have a site that has some URLs with hyphens and other, older, legacy URLs that use underscores?
I'm working with a VERY large site that has recently been redesigned/recategorized. They kept only about 20% of the URLs from the legacy site, the URLs that had revenue tied to them, and these URLs use underscores, whereas the new URLs created for the site use hyphens. I don't think that this would be an issue for Google, as long as the pages are of quality, but I wanted to get everyone's opinion on this. Will it hurt me to have two different sets of URLs, those using hyphens and those using underscores?
Intermediate & Advanced SEO | | Business.com0 -
Removing pages from index
Hello, I run an e-commerce website. I just realized that Google has "pagination" pages in the index which should not be there. In fact, I have no idea how they got there. For example, www.mydomain.com/category-name.asp?page=3434532 There are hundreds of these pages in the index. There are no links to these pages on the website, so I am assuming someone is trying to ruin my rankings by linking to pages that do not exist. The page content displays category information with no products. I realize that it's a flaw in design, and I am working on fixing it (301 non-existent pages). Meanwhile, I am not sure if I should request removal of these pages. If so, what is the best way to request bulk removal? Also, should I 301, 404 or 410 these pages? Any help would be appreciated. Thanks, Alex
Intermediate & Advanced SEO | | AlexGop0