Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Getting Google impressions for a site not in the index...
-
Hi all
Wondering if I could pick the brains of those wiser than myself... My client has an HTTPS website with tons of pages indexed and all ranking well. However, they somehow also set their server up so that the non-HTTPS versions of the pages were getting indexed, so we had the same page indexed twice in the engine on slightly different URLs (it uses a CMS, so all the internal links are relative too). The non-HTTPS version is mainly used as a dev testing environment.
Upon seeing this we did a Google removal request in WMT and added noindex in the robots, and that saw the indexed pages drop overnight. See image 1. However, the site still appears to be getting returned for a couple of hundred searches a day! The main site gets about 25,000 impressions, so it's way down, but I'm puzzled as to how a site which has been blocked can appear for that many searches, and whether we are still liable for duplicate content issues.
Any thoughts are most welcome. Sorry, I'm unable to share the site name, I'm afraid. The client is very strict on this.
Thanks,
Carl
-
Hi Chris
Thanks for the reply.
I think I confused myself with terms. I meant that we added a noindex to the header of the pages in the relevant tags. We removed the URLs in WMT, which usually drops them all from the engine in a matter of hours, but I have read that the removal can sometimes expire, so we put the noindex tag in place in case the WMT removal did expire and the pages started to get indexed again.
Regards
Carl
-
Carl,
I'm wondering what you mean by "added noindex in the robots".
If you mean you disallowed those pages in the robots.txt file, that won't be enough to remove or keep them removed from the index. Typically, the [robots meta tag](https://support.google.com/webmasters/answer/93710?hl=en) is used to keep the pages out of the index. And if you use the robots meta tag on those pages, do not use the robots.txt file to disallow bots from those pages, as that will prevent bots from viewing their meta data.
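To make the interaction between the two mechanisms concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not how Google itself evaluates pages: the function names (`has_noindex`, `is_crawlable`, `noindex_is_visible`), the sample markup, and the `example.com` URL are all hypothetical. It checks whether a page carries a robots meta `noindex`, and whether robots.txt still lets a crawler fetch the page to see that tag.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            content = attr_map.get("content", "")
            self.directives.extend(d.strip().lower() for d in content.split(","))


def has_noindex(html):
    """True if the HTML contains a robots meta tag with a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """True if the given robots.txt text allows user_agent to fetch url."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


def noindex_is_visible(html, robots_txt, url):
    """True only if the page has noindex AND bots are allowed to fetch it."""
    return has_noindex(html) and is_crawlable(robots_txt, url)


# Hypothetical sample data for illustration only.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
blocking_robots = "User-agent: *\nDisallow: /"
open_robots = "User-agent: *\nDisallow:"
url = "http://example.com/some-page"

print(noindex_is_visible(page, open_robots, url))
print(noindex_is_visible(page, blocking_robots, url))
```

With the blocking robots.txt the check fails even though the tag is present, which is the situation described above: a crawler that is not allowed to fetch the page never gets to read its noindex.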
-
Sorry, it appeared I could only upload 1 image in the first post so here is the second image.