Site: Query Question
-
Hi All,
I have a question about the site: query you can run on Google. I know it has lots of inaccuracies, but I like to keep a high-level view of it over time.
I was using it to also try and get a high level view of how many product pages were indexed vs. the total number of pages.
What is interesting is when I do a site: query for say www.newark.com I get ~748,000 results returned.
When I do a query for www.newark.com "/dp/" I get ~845,000 results returned.
Either I am doing something stupid or these numbers are completely backwards?
Any thoughts?
Thanks,
Ben
-
Barry Schwartz posted some great information about this in November of 2010, quoting a couple of different Google sources. In short, more specific queries can cause Google to dig deeper and give more accurate estimates.
-
Yup. Get rid of parameter-laden URLs and it's easy enough. If they hang around the index for a few months before disappearing, that's no big deal; as long as you have done the right thing, it will work out fine.
Also, you're not interested in the chaff, just the bits you want to make sure are indexed. So make sure those are in sensibly titled sitemaps and it's fine. (I've used this on sites with 50 million and 100 million product pages. It gets a bit more complex at that number, but the underlying principle is the same.)
-
But then on a big site (talking 4m+ products) it's usually the case that you have URLs indexed that wouldn't be generated in a sitemap because they include additional parameters.
Ideally, of course, you rid the index of parameter-filled URLs, but it's pretty tough to do that.
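To make the parameter problem concrete, here is a minimal Python sketch that separates clean URLs from parameter-filled ones and strips the parameters to get the version a sitemap would normally list. The example.com URLs and the function names are made up for illustration; where the list of indexed URLs comes from (a crawl export, server logs) is left open.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_parameters(url: str) -> str:
    """Return the URL without its query string or fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def split_by_parameters(urls):
    """Separate clean URLs from those that carry query parameters."""
    clean, parameterized = [], []
    for url in urls:
        if urlsplit(url).query:
            parameterized.append(url)
        else:
            clean.append(url)
    return clean, parameterized


# Hypothetical URLs, for illustration only.
urls = [
    "https://www.example.com/widget/dp/12345",
    "https://www.example.com/widget/dp/12345?sort=price&page=2",
]

clean, parameterized = split_by_parameters(urls)
# De-duplicate after stripping: both URLs collapse to one sitemap entry.
canonical = sorted({strip_parameters(u) for u in urls})
```

This only shows which URLs would fall outside a generated sitemap; actually getting them out of the index is a separate job.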
-
Best bet is to make sure all your URLs are in your sitemap and then you get an exact count.
I've found it handy to use multiple sitemaps, one for each subfolder (e.g. /news/ or /profiles/), to be able to quickly see exactly what % of URLs are indexed from each section of my site. This is super helpful in finding errors in a specific section or when you are working on indexing of a certain type of page.
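A rough Python sketch of the per-subfolder idea, assuming you already have a flat list of URLs: group them by first path segment so each group can go into its own sitemap, then work out the indexed percentage from the submitted and indexed counts reported for that sitemap. The URLs, the counts, and the "root" bucket name are invented for the example.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_section(urls):
    """Group URLs by their first path segment, e.g. /news/ or /profiles/."""
    sections = defaultdict(list)
    for url in urls:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        # The home page and single-segment pages go into a "root" bucket.
        key = segments[0] if len(segments) > 1 else "root"
        sections[key].append(url)
    return dict(sections)


def indexed_percentage(submitted: int, indexed: int) -> float:
    """Percentage of submitted URLs reported as indexed."""
    return round(100.0 * indexed / submitted, 1) if submitted else 0.0


# Hypothetical URLs, for illustration only.
urls = [
    "https://www.example.com/",
    "https://www.example.com/news/story-a",
    "https://www.example.com/news/story-b",
    "https://www.example.com/profiles/jane",
]

sections = group_by_section(urls)
# Made-up counts: 200 URLs submitted in the /news/ sitemap, 150 indexed.
news_pct = indexed_percentage(200, 150)
```

Each key in `sections` would become one sitemap file, so a low percentage points straight at the section that has the problem.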
S
-
What I've found is that the reason for this comes down to how the Google system works. Case in point: a client site I have with 25,000 actual pages. They have mass duplicate content issues. When I do a generic site: with the domain, Google shows 50,000-60,000 pages. If I do an inurl: with a specific URL param, I get either 500,000 or over a million.
Though that's not your exact situation, it can help explain what's happening.
Essentially, if you do a normal site: Google will try its best to provide the content within the site that it shows the world based on "most relevant" content. When you do a refined check, it's naturally going to look for the content that really is most relevant - closest match to that actual parameter.
So if you're seeing more results with the refined process, it means that on any given day, at any given time, when someone does a general search, the Google system will filter out a lot of content that isn't seen as highly valuable for that particular search. So all those extra pages that come up in your refined check - many of them are most likely then evaluated as less than highly valuable / high quality or relevant to most searches.
Even if many are great pages, their system has multiple algorithms that have to be run to assign value. What you are seeing is those processes struggling to sort it all out.
-
about 839,000 results.
-
Different data center perhaps - what about if you add in the "dp" query to the string?
-
I actually see 'about 897,000 results' for the search 'site:www.newark.com'.
-
Thanks Adrian,
I understand those areas of inaccuracy, but I didn't expect to see a refined search produce more results than the original search. That just seems a little bizarre to me, which is why I was wondering if there was a clear explanation or if I was executing my query incorrectly.
Ben
-
This is an expected 'oddity' of the site: operator. Here is a video of Matt Cutts explaining the imprecise nature of the site: operator.