Sudden Increase In Number of Pages Indexed By Google Webmaster When No New Pages Added
-
Greetings MOZ Community:
On June 14th Google Webmaster Tools indicated an increase in the number of indexed pages, going from 676 to 851 pages. No new pages had been added to the domain in the previous month. The number of pages blocked by robots increased at that time from 332 (June 1st) to 551 (June 22nd), yet the number of indexed pages still increased to 851.
The following changes occurred between June 5th and June 15th:
-A newly redesigned version of the site was launched on June 4th, with some links to social media and the blog removed on some pages, but with no new URLs added. The design platform was and is WordPress.
-Google GTM code was added to the site.
-An exception was made by our hosting company to ModSecurity on our server (for i-frames) to allow GTM to function.
In the last ten days my web traffic has declined about 15%; however, the quality of traffic has declined enormously, and the number of new inquiries we get is off by around 65%. Pages per visit have declined from about 2.55 to about 2.
Obviously this is not a good situation.
My SEO provider, a reputable firm endorsed by MOZ, believes the extra 175 pages indexed by Google, pages that do not offer much content, may be causing the ranking decline.
My developer is examining the issue. They think there may be some tie-in with the installation of GTM. They are noticing an additional issue: the site's Contact Us form will not work if the GTM script is enabled. They find it curious that both issues occurred around the same time.
Our domain is www.nyc-officespace-leader. Does anyone have any idea why these extra pages are appearing and how they can be removed? Anyone have experience with GTM causing issues with this?
Thanks everyone!!!
Alan
-
Yes, and I appreciate it!
Alan
-
I did what I asked you to do.
- in my first post and repeated frequently.
-
Hi Egol:
How did you locate this duplicate or re-published content?
Obviously what you have pointed out is a major source of concern, so I ran a Copyscape search this afternoon for duplicate content and did not locate any of the URLs you mention in the "this", "this" links above. It appears you entered the URL of the blog post in Google's search bar. Would that work? This method would be pretty slow going with 600 URLs.
Thanks,
Alan
-
Those are the 448 URLs from your website that have been filtered.
You should find garbage in them like the results shown below.
Have you done what I have suggested three times above? Do that if you want to identify the problem pages.
-
www.nyc-officespace-leader.com/wp-content/plugins/...
A description for this result is not available because of this site's robots.txt – learn more.
-
www.nyc-officespace-leader.com/wp-content/plugins/...
A description for this result is not available because of this site's robots.txt – learn more.
-
www.nyc-officespace-leader.com/wp-content/plugins/...
A description for this result is not available because of this site's robots.txt – learn more.
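The "description not available because of robots.txt" entries above are what a URL looks like when it is blocked from crawling but still listed in the index: robots.txt stops the crawler from fetching a page, not from listing its URL. A minimal Python sketch for checking which URLs a set of rules blocks is shown here. The robots.txt rules, the `http://` scheme, and the `example/file.php` paths are assumptions for illustration; the thread shows neither the site's real robots.txt nor the full plugin paths.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; the site's real robots.txt is not shown in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-content/plugins/
"""

def blocked_urls(robots_txt, urls, agent="Googlebot"):
    """Return the URLs that the given robots.txt rules forbid the agent to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]

# Hypothetical URLs: the plugin path is invented, since the real ones are truncated.
urls = [
    "http://www.nyc-officespace-leader.com/wp-content/plugins/example/file.php",
    "http://www.nyc-officespace-leader.com/blog/",
]
blocked = blocked_urls(ROBOTS_TXT, urls)
print(blocked)
```

If blocked URLs like these should leave the index, a robots.txt rule alone will probably not do it, because the crawler cannot see a noindex directive on a page it is not allowed to fetch.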
-
-
Hi Egol:
Thanks for the suggestion.
When I click on "repeat the search with the omitted results included" I get 448 results, not the entire 859 results. That seems very strange. Some of these URLs have light content, but I don't believe they are duplicates. I don't see any content outside our website when I click this.
Am I doing something wrong? I would think the total of 859 would appear, not 447 URLs.
Thanks!!
Alan
-
I don't know. You should ask someone who knows a lot about canonicalization.
Did you drill down through all of those indexed pages to see if you can identify all of them?
I've suggested it twice.
-
Hi Egol:
In the context of launching an upgraded site, could the canonicalization have been implemented incorrectly? That could account for the sudden appearance of 175 new pages, as the thin content has been there for some time.
I am particularly suspicious of canonicalization, as there was an issue involving multi-page URLs of property listings when the site was migrated from Drupal to WordPress last summer.
Thoughts?
Thanks, Alan
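One way to test the canonicalization theory is to pull the rel=canonical tag out of each page's HTML and see where it points. The following is a minimal Python sketch using only the standard library. The sample HTML and the `/listings/` URL are hypothetical, and fetching the real pages is left out.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])

def find_canonicals(html):
    """Return all canonical URLs declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals

# Hypothetical page source for a paginated listing page.
SAMPLE = (
    '<html><head>'
    '<link rel="canonical" href="http://www.nyc-officespace-leader.com/listings/">'
    '</head><body></body></html>'
)
print(find_canonicals(SAMPLE))
```

A page that returns an empty list, or more than one canonical, may be worth a closer look after a redesign.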
-
Apparently infitter24.rssing.com/chan-13023009/all is poaching my content, taking my original content and adding it to their site. I am not quite sure what to do about that.
You can have an attorney demand that they stop, or you can file DMCA complaints. Be careful.
**However it does not explain the sudden appearance of the 175 pages on Google's index**
-
Do this query: site:www.nyc-officespace-leader.com
-
Start drilling down the SERPs. One page at a time. Look for content that you didn't make. Look for duplicates.
-
Get a spreadsheet that has all of your URLs. Drill down through the SERPs, checking every one of them. Can you account for your pagination? You have a lot of it, and that type of page is usually rubbish in the index. Combine, canonicalize, or get rid of them.
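The spreadsheet check can be partly automated once the indexed URLs have been copied out of the site: results. The following is a rough Python sketch, assuming a CSV export with a single `url` column. The column name, the `http://` scheme, and the sample URLs are all hypothetical.

```python
import csv
import io

# Hypothetical export of the URLs you know you published (one "url" column).
KNOWN_CSV = """url
http://www.nyc-officespace-leader.com/
http://www.nyc-officespace-leader.com/blog/sample-post/
"""

# Hypothetical URLs copied by hand from the site: query results.
INDEXED = [
    "http://www.nyc-officespace-leader.com/",
    "http://www.nyc-officespace-leader.com/blog/sample-post/",
    "http://www.nyc-officespace-leader.com/tag/sample/page/2/",
]

def normalize(url):
    """Trim whitespace and the trailing slash so near-identical URLs compare equal."""
    return url.strip().rstrip("/")

def unaccounted(known_csv, indexed_urls):
    """Return indexed URLs that do not appear in the known-URL spreadsheet."""
    reader = csv.DictReader(io.StringIO(known_csv))
    known = {normalize(row["url"]) for row in reader}
    return sorted(u for u in map(normalize, indexed_urls) if u not in known)

extra = unaccounted(KNOWN_CSV, INDEXED)
print(extra)
```

Whatever ends up in `extra` is a page the index holds that the spreadsheet cannot account for, which is where the unexplained 175 would be expected to show up.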
-
-
Hi Egol:
Thanks so much for taking the time for your thorough response!!
Apparently infitter24.rssing.com/chan-13023009/all is poaching my content, taking my original content and adding it to their site. I am not quite sure what to do about that.
You have pointed out something very useful, and I appreciate it and will act upon it. However, it does not explain the sudden appearance of the 175 pages in Google's index that did not appear at the end of May and somehow coincided with the upload of the new version of our website in early June. Any ideas?
Thanks,
Alan
-
Do this query: site:www.nyc-officespace-leader.com
-
Start drilling down the SERPs. One page at a time. Look for content that you didn't make. Look for duplicates.
-
When you drill down about 44 pages you will find this...
In order to show you the most relevant results, we have omitted some entries very similar to the 440 already displayed.
If you like, you can repeat the search with the omitted results included.
The bad stuff is usually behind that link. Google doesn't want to show that stuff to people. It could be thin, it could be duplicate, it could be spammy, they just might not like it.
- Find out what is in there.
Possible problems that I see....
I see dupe content like this and this. Either your guys are grabbin' somebody else's content or they are grabbin' yours. That can get you in trouble with Panda. You need original and unique. Anything that is not original and unique should be deleted, noindexed or rewritten.
A lot of these pages are really skimpy. Thin content can get you into trouble with Panda. Anything that is skimpy should be deleted, noindexed or beefed up.
I see multiple links to tags on lots of these posts. That can cause duplicate content problems.
The tag pages are paginated with just a few pages on each. These can generate extra pages that are low value, suck up your link juice or compound duplicate content problems.
You have archive pages, and category pages and more pagination problems.
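To see how much of the index each of these page types accounts for, the indexed URLs can be bucketed by path pattern. The following is a minimal Python sketch. The patterns assume default WordPress-style permalinks (`/tag/`, `/category/`, `/page/N/`, date archives), which may not match the site's real structure, and the sample URLs are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumed WordPress-style path patterns; adjust to the site's real permalinks.
# Order matters: a paginated tag page is counted as pagination.
PATTERNS = [
    ("pagination", re.compile(r"/page/\d+/?$")),
    ("tag", re.compile(r"^/tag/")),
    ("category", re.compile(r"^/category/")),
    ("date archive", re.compile(r"^/\d{4}/(\d{2}/)?$")),
    ("plugin file", re.compile(r"^/wp-content/plugins/")),
]

def classify(url):
    """Return the first matching page-type label for a URL, or 'other'."""
    path = urlparse(url).path
    for label, pattern in PATTERNS:
        if pattern.search(path):
            return label
    return "other"

# Hypothetical sample of indexed URLs.
sample = [
    "http://www.nyc-officespace-leader.com/tag/midtown/",
    "http://www.nyc-officespace-leader.com/tag/midtown/page/2/",
    "http://www.nyc-officespace-leader.com/category/news/",
    "http://www.nyc-officespace-leader.com/2014/06/",
    "http://www.nyc-officespace-leader.com/listings/some-office/",
]
counts = Counter(classify(u) for u in sample)
print(counts)
```

Buckets with large counts and little unique content are the likely candidates to combine, canonicalize, or noindex.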
-