Sudden increase in number of indexed URLs. How can I know which URLs these are?
-
We saw a spike in the total number of indexed URLs (from 17,000 to 165,000). What would be the most efficient way to find out which URLs were newly indexed?
-
You can also try searching for your URL in Google using exact parameters:
inurl:http://www.yourdomain.com
or, if you don't use www:
inurl:http://yourdomain.com
-
The rest have most likely been put into the supplemental index (i.e., they are duplicates). I'd review Google Webmaster Tools (GWT) and analyze the crawl stats, crawl errors, and HTML improvements reports.
Do you have 165k pages on your site? If not, it's probably some sort of error.
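If you can get a list of URLs out of a crawl or a GWT export, one rough way to test the duplicates theory is to strip the query strings and see which paths show up many times. This is only a sketch: the function names and the sample URLs are made up for illustration, and it assumes the duplication comes from URL parameters, which may not be the case on your site.

```python
from collections import Counter
from urllib.parse import urlsplit


def group_by_path(urls):
    """Count URLs that share a host and path once query string and fragment are dropped."""
    counts = Counter()
    for url in urls:
        parts = urlsplit(url.strip())
        # Treat "/shoes" and "/shoes/" as the same page (an assumption, not a rule).
        path = parts.path.rstrip("/") or "/"
        counts[(parts.netloc.lower(), path)] += 1
    return counts


def likely_duplicates(urls, threshold=2):
    """Return (host + path, count) pairs seen at least `threshold` times, most frequent first."""
    grouped = group_by_path(urls)
    return [(host + path, n) for (host, path), n in grouped.most_common() if n >= threshold]


# Hypothetical sample data, standing in for a real crawl export.
sample_urls = [
    "http://www.yourdomain.com/shoes?sort=price",
    "http://www.yourdomain.com/shoes?sort=name",
    "http://www.yourdomain.com/shoes/",
    "http://www.yourdomain.com/about",
]
duplicates = likely_duplicates(sample_urls)
```

Paths with a high count are the first place I'd look for parameters (sorting, session IDs, filters) that are generating extra indexed URLs.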
-
Hi Oleg,
I just tried that, but it only shows 300 URLs for the past week and 600 for the past month.
-
You can use SEOmoz's Open Site Explorer (OSE), your Google Webmaster Tools account, or Majestic. Going from 17,000 to 165,000 indexed pages is a huge jump.
-
Search for site:yourdomain.com, click on "Search Tools" in the left column, and choose "Last week" (or whatever period covers the increase in indexed pages).
Or you can choose "Past Year" and then "Sort by Date", which lists the most recently indexed URLs first.
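If you want to repeat those searches for several periods, the same filters can be put straight into the search URL. A minimal sketch follows, with one big assumption: the `tbs` values used here (`qdr:w`, `qdr:m`, `qdr:y`, and `sbd:1` for sort by date) are what the Search Tools menu appears to add to the URL, but they are not documented by Google and may change. The function name is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: undocumented "tbs" values matching the Search Tools time filters.
PERIODS = {"week": "qdr:w", "month": "qdr:m", "year": "qdr:y"}


def site_search_url(domain, period="week", sort_by_date=False):
    """Build a Google search URL for a site: query limited to a recent period."""
    tbs = PERIODS[period]
    if sort_by_date:
        # Assumption: "sbd:1" corresponds to the "Sort by Date" option.
        tbs += ",sbd:1"
    query = urlencode({"q": f"site:{domain}", "tbs": tbs})
    return "https://www.google.com/search?" + query


last_week = site_search_url("yourdomain.com")
past_year_by_date = site_search_url("yourdomain.com", period="year", sort_by_date=True)
```

Open the resulting URLs in a browser. This only saves some clicking; the results are the same ones you would get from the Search Tools menu.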