Sudden Increase In Number of Pages Indexed By Google Webmaster When No New Pages Added
-
Greetings MOZ Community:
On June 14th Google Webmaster tools indicated an increase in the number of indexed pages, going from 676 to 851 pages. New pages had been added to the domain in the previous month. The number of pages blocked by robots increased at that time from 332 (June 1st) to 551 June 22nd), yet the number of indexed pages still increased to 851.
The following changes occurred between June 5th and June 15th:
-A new redesigned version of the site was launched on June 4th, with some links to social media and blog removed on some pages, but with no new URLs added. The design platform was and is Wordpress.
-Google GTM code was added to the site.
-An exception was made by our hosting company to ModSecurity on our server (for i-frames) to allow GTM to function.
In the last ten days my web traffic has decline about 15%, however the quality of traffic has declined enormously and the number of new inquiries we get is off by around 65%. Click through rates have declined from about 2.55 pages to about 2 pages.
Obviously this is not a good situation.
My SEO provider, a reputable firm endorsed by MOZ, believes the extra 175 pages indexed by Google, pages that do not offer much content, may be causing the ranking decline.
My developer is examining the issue. They think there may be some tie in with the installation of GTM. They are noticing an additional issue, the sites Contact Us form will not work if the GTM script is enabled. They find it curious that both issues occurred around the same time.
Our domain is www.nyc-officespace-leader. Does anyone have any idea why these extra pages are appearing and how they can be removed? Anyone have experience with GTM causing issues with this?
Thanks everyone!!!
Alan -
Yes, and I appreciate it!
Alan -
I did what I asked you to do.
-
-
-
- in my first post and repeated frequently.
-
-
-
-
Hi Egol:
How did you locate this duplicate or re-published content?
Obviously what you have pointed out is a major source of concern so I ran Copyscape search this afternoon for duplicate content and did not locate any the URLs you mention in the "this", "this" link above. It appears you entered the URL of the blog post in Google's search bar. Would that work? This method would be pretty slow going with 600 URLs.
Thanks,
Alan -
Those are the 448 URLs from your website that have been filtered.
You should find garbage in them like shown below.
Have you done what I have suggested three times above? Do that if you want to identify the problem pages.
-
www.nyc-officespace-leader.com/wp-content/plugins/...
A description for this result is not available because of this site's robots.txt – learn more.
-
www.nyc-officespace-leader.com/wp-content/plugins/...
A description for this result is not available because of this site's robots.txt – learn more.
-
www.nyc-officespace-leader.com/wp-content/plugins/...
A description for this result is not available because of this site's robots.txt – learn more.
-
-
Hi Egol:
Thanks for the suggestion.
When I click on _ repeat the search with the omitted results included _I get 448 results not the entire 859 results. Seems very strange. Some of these URLS have light content but I don't believe they are dups. I don't see any content outside our website when I click this.
Am I doing something wrong? I would think the total of 859 would appear not 447 URLs.
Thanks!!
Alan -
I don't know. You should ask someone who knows a lot about canonicalization.
Did you drill down through all of those indexed pages to see if you can identify all of them?
I've suggested it twice.
-
Hi Egol:
In the content of launching an upgraded site, could the canonicalization have implemented incorrectly? That could account for 175 pages sudden new content as the thin content has been there for some time.
I am particularly suspicious regarding canonicalization as there was an issue involving multi page URLs of property listings when the site was migrated from Drupal to Wordpress last Summer.
Thoughts?
Thanks, Alan
-
Apparently infitter24.rssing.com/chan-13023009/all is poaching my content, taking my original content and adding it to there site. I am not quiet sure what to do about that.
You can have an attorney demand that they stop, you can file DMCA complaints. Be careful
**However it does not explain the sudden appearance of the 175 pages on Googles index **
-
Do this query: site:www.nyc-officespace-leader.com
-
Start drilling down the SERPs. One page at a time. Look for content that you didn't make. Look for duplicates.
-
Get a spreadsheet that has all of your URLs. Drill down through the SERPs checking every one of them. Can you account for your pagination. You have a lot of it and that type of page is usually rubbish in the index. Combine, canonicalize, or get rid of them.
-
-
Hi Egol:
Thanks so much for taking the time for your thorough response!!
Apparently infitter24.rssing.com/chan-13023009/all is poaching my content, taking my original content and adding it to there site. I am not quiet sure what to do about that.
You have pointed out something very useful and I appreciate it and will act upon it. However it does not explain the sudden appearance of the 175 pages on Googles index that did not appear at the end of May and somehow coincided with uploading of the new version of our website in early June. Any ideas???
Thanks,
Alan -
-
Do this query: site:www.nyc-officespace-leader.com
-
Start drilling down the SERPs. One page at a time. Look for content that you didn't make. Look for duplicates.
-
When you drill down about 44 pages you will find this...
In order to show you the most relevant results, we have omitted some entries very similar to the 440 already displayed.
If you like, you can repeat the search with the omitted results included.The bad stuff is usually behind that link. Google doesn't want to show that stuff to people. It could be thin, it could be duplicate, it could be spammy, they just might not like it.
- Find out what is in there.
Possible problems that I see....
I see dupe content like this and this. Either your guys are grabbin' somebodyelse's content or they are grabbin' yours. Can get you in trouble with Panda. You need original and unique. Anything that is not original and unique should be deleted, noindexed or rewritten.
A lot of these pages are really skimpy. Think content can get you into trouble with Panda. Anything that is skimpy should be deleted, noindexed or beefed up.
I see multiple links to tags on lots of these posts. That can cause duplicate content problems.
The tag pages are paginated with just a few pages on each. These can generate extra pages that are low value, suck up your linkjuice or compound duplicate content problems.
You have archive pages, and category pages and more pagination problems.
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
New Site Worries
To cut a long story short, our old web developers who built us a bespoke site decided that they could no longer offer us support so we decided to move our back end to the latest Magento 2 software and move over to https with a new company. The new setup has been live for 3 weeks, I have checked in webmaster tools and it says we have 4 pages indexed, if I type in site:https://www.mydomain.com/ we have 6560 pages indexed, our robots.txt file looks like this:Sitemap: https://www.mydomain.com/sitemap.xml Sitemap: https://www.mydomain.com/sitemaps/sitemap_default.xml I use Website Auditor and Screaming Frog, Website Auditor returns a 302 for my domain and Screaming Frog returns a 403 which means I cannot scan any of these. If I check my domain using an https checking tool some sites return an error but some return a 200.
Reporting & Analytics | | Palmbourne
I have spoken to my new developer and he says everything is fine, in Webmaster tools I can see some redirects from his domain to mine when the site was in testing mode. I am concerned that something is not right as I always check my pages on a regular basis. Can anyone shed any light on this, is it right or am I right to be concerned. Thank you in advance0 -
When Deployment done for my site in google analytic page views little bit affected
Hello Expert, For my ecommerce site every minute I have 500 visitors now when my development team do deployment like minor JS update where site restart not require then what I found is on website (front end) not able to find anything happen that means visitor not event realize page even refresh but in analytic real time it shows minor pageviews down. So can anyone guess here what happen actually in this case as per google? Thanks!
Reporting & Analytics | | micey1230 -
What determines the page order of site:domain?
Whenever I use site:domain.com to check what's index, it's pretty much always in the same order. I gather from this, the order is not random. I'm also reasonably certainly it isn't related to any page strength signals or ranking results. So, does anyone know why the pages are displayed in the order they are? What information does the order of the pages tell me? Thanks, Ruben
Reporting & Analytics | | KempRugeLawGroup1 -
Google Ad referral
I was wondering if someone could decode the jumble of a referral - this is supposedly the referal that led to a click through to my site via a product listing ad. I am trying to figure out how www.nextag.com comes in to the picture as we do not have refurbexperts even listed there? Thanks to anyone who tries/does work it out. http://www.googleadservices.com/pagead/aclk?sa=L&ai=CGXud6DmDU_qeL5THygHpuICwCaTZwMYD_Nvvv0bEwMS50wEIBhAEIOn5-gEoBVCl7P7f-v____8BYMnu8omYpPQSoAHAhIv9A8gBB8gDG6oEJ0_QwcNc5zNun_d7S5KNcMT6uPjjH_mMDkKFFgBCQ6aKICRPJVVa7MAFBYgGAaAGJoAHqPv0ApAHAeASupqdo-ypit0m&ohost=www.google.com&cid=5GhZEzUCSC6x9n2wxOdz3-mrAfSUkvHKPN3wD5yLInnlNil_&sig=AOD64_1D1z1JPYbFP0UnUglJVOfvd25RfA&adurl=http://refurbexperts.com/product/527/HP-LaserJet-P2015-Laser-Printer-RECONDITIONED%3Futm_source%3Dproductlistingads%26utm_medium%3Dadwords%26utm_campaign%3Dadwords&ctype=5&nb=0&res_url=http%3A%2F%2Fwww.nextag.com%2Fhp-p2015-laserjet%2Fproducts-html%3Fnxtg%3D116d0a1c0504-9FFEB16DE52A7E2A&rurl=http%3A%2F%2Fwww.nextag.com%2Fgoto.jsp%3Fp%3D3652%26search%3Dhp%2520p2015%2520laserjet%26t%3Dag%253D1384181795%26crid%3D48271786%26gg_aid%3D20169721025%26gg_site%3D%26gclid%3DCjgKEAjwzIucBRDzjIz9qMOB3TASJABBIwL1LHK7GcAPS6yHGpd9Kq3wsZrcPORAWD8QCWivr4W75PD_BwE&nm=11&nx=43&ny=12&is=700x181&clkt=187
Reporting & Analytics | | henya0 -
Does Google encryption of keyword data impact SEO revenue reporting in Google analytics?
Hi there, I know Google has been encrypting SEO keyword data which they rolled out in September 2013. My question is - will this impact SEO revenue figures reported in Google analytics? I have been monitoring SEO revenue figures for a client and they are significantly down even though rankings have not lowered. Is this because of Google's encryption? Could there be another reason? Many thanks!
Reporting & Analytics | | CayenneRed890 -
Domain Sub folder tracking from Google webmaster tool
I have created separate google webmaster access for this sub folder http://lineonebus.com/content/ Original domain http://lineonebus.com/ Search results in webmaster shows no search queries though this subfolder created in wordpress is ranking for some keywords.
Reporting & Analytics | | csfarnsworth1 -
Google analytics reality check?
Looking back over a 9 month period tracking analytics with getclicky my site showed a 29% bounce rate, with only about 1/4 of visitors spending 1 minute or less on my site. I've recently implemented GA (removed old clicky code) and although traffic is strong, my site now shows a bounce rate of about 82%. Engagement stats also show that 82% of visitors spend between 0-10 seconds on my site. My site is built on Wordpress and the GA tracking code wasn't placed directly in the footer, my developer built a field in the admin area to insert the UA number which automatically adds the code to all pages. I've checked the code and the tracking seems to appear on all pages. I took a look at AW Stats. It corroborates GA and says that 80% of visitors are spending 0-30 seconds on the site. Potential issues/clues: browser tests show small loading problems in Internet Explorer 7,8,9 (the phone number at the top of the header loads on the wrong side of the page) and major issues in Internet Explorer 6 (site doesn't load at all in IE 6). The thing is no one who uses IE 6 is coming to the site. Second, the site gets a grade of C in YSlow, it's not lightning fast at the moment. GA is showing average page load of 2.4 seconds, but don't think either of these issues should cause an 82% 0-10 seconds engagement number. My site is content rich/focused with very minimal advertising. Content is accessible well above the fold. My question: Does the fact that AW Stats and GA agree mean that those numbers are accurate, or is there a bug I should be looking for? How to explain the clicky numbers?
Reporting & Analytics | | JSOC0 -
Why do I have few different index URL addresses?
Yes I know, sorry guys but I also have a problem with duplicate pages. It shows that almost every page of my site has a duplicate content issue and looking at my folders in the server, I don't see all these pages... This is a static Website with no shopping cart or anything fancy. The first on the list is my [index] page and this is giving me a hint about some sort of bad settings on my end with the SEOMOZ crawler??? Please advice and thank you! index-variations.jpg
Reporting & Analytics | | cssyes0