Site: Query Question
-
Hi All,
My question is about the site: query you can run on Google. I know it has lots of inaccuracies, but I like to keep a high-level view of it over time.
I was also using it to try to get a high-level view of how many product pages were indexed vs. the total number of pages.
What is interesting is when I do a site: query for say www.newark.com I get ~748,000 results returned.
When I do a query for www.newark.com "/dp/" I get ~845,000 results returned.
Either I am doing something stupid or these numbers are completely backwards?
Any thoughts?
Thanks,
Ben
-
Barry Schwartz posted some great information about this in November of 2010, quoting a couple of different Google sources. In short, more specific queries can cause Google to dig deeper and give more accurate estimates.
-
Yup. Get rid of parameter-laden URLs and it's easy enough. If they hang around the index for a few months before disappearing, that's no big deal; as long as you have done the right thing, it will work out fine.
Also, you're not interested in the chaff, just the bits you want to make sure are indexed. So make sure those are in sensibly titled sitemaps and it's fine. (I have used this on sites with 50 million and 100 million product pages. It gets a bit more complex at that number, but the underlying principle is the same.)
-
But then on a big site (talking 4m+ products) it's usually the case that you have URLs indexed that wouldn't be generated in a sitemap because they include additional parameters.
Ideally, of course, you rid the index of parameter-filled URLs, but it's pretty tough to do that.
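As a rough illustration of spotting those parameter-filled URLs, here is a minimal Python sketch that separates URLs carrying a query string from clean ones. The function name `split_by_parameters` and the example.com URLs are made up for illustration; in practice the list would come from your own crawl or log data.

```python
from urllib.parse import urlsplit


def split_by_parameters(urls):
    """Separate URLs that carry a query string from those that do not."""
    clean, parameterized = [], []
    for url in urls:
        # urlsplit(...).query is an empty string when there is no "?..." part.
        if urlsplit(url).query:
            parameterized.append(url)
        else:
            clean.append(url)
    return clean, parameterized


# Hypothetical example URLs, not taken from a real crawl.
indexed_urls = [
    "http://www.example.com/widget/dp/12345",
    "http://www.example.com/widget/dp/12345?sort=price&page=2",
    "http://www.example.com/gadget/dp/67890",
]

clean, parameterized = split_by_parameters(indexed_urls)
print(len(clean), "clean URLs,", len(parameterized), "with parameters")
```

The clean list is roughly what a sitemap generator would produce; the parameterized list is the part that is hard to get out of the index.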
-
Best bet is to make sure all your URLs are in your sitemap and then you get an exact count.
I've found it handy to use multiple sitemaps, one for each subfolder (e.g. /news/ or /profiles/), to be able to quickly see exactly what % of URLs are indexed from each section of my site. This is super helpful in finding errors in a specific section or when you are working on indexing of a certain type of page.
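For anyone who wants to try the per-subfolder approach, a minimal Python sketch of the idea might look like this. It groups URLs by their first path segment (one group per sitemap) and computes a percentage from submitted and indexed counts. The function names, the example.com URLs, and the counts are all hypothetical; the real numbers would come from whatever your webmaster tools report for each sitemap.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_section(urls):
    """Group URLs by first path segment, e.g. /news/... goes under 'news'."""
    sections = defaultdict(list)
    for url in urls:
        parts = [p for p in urlsplit(url).path.split("/") if p]
        key = parts[0] if parts else "(root)"
        sections[key].append(url)
    return dict(sections)


def indexed_percentage(submitted, indexed):
    """Percentage of submitted URLs that are reported as indexed."""
    if submitted == 0:
        return 0.0
    return 100.0 * indexed / submitted


# Hypothetical URLs; each group would become its own sitemap file.
urls = [
    "http://www.example.com/news/story-1",
    "http://www.example.com/news/story-2",
    "http://www.example.com/profiles/alice",
]
groups = group_by_section(urls)

# Hypothetical counts for one sitemap: 200 submitted, 150 indexed.
news_pct = indexed_percentage(200, 150)
print(sorted(groups), news_pct)
```

A section whose percentage sits well below the others is the one to look at first.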
S
-
I've found that the reason for this comes down to how the Google system works. Case in point: a client site I have with 25,000 actual pages. They have mass duplicate content issues. When I do a generic site: with the domain, Google shows 50,000-60,000 pages. If I do an inurl: with a specific URL param, I get either 500,000 or over a million.
Though that's not your exact situation, it can help explain what's happening.
Essentially, if you do a normal site: Google will try its best to provide the content within the site that it shows the world based on "most relevant" content. When you do a refined check, it's naturally going to look for the content that really is most relevant - closest match to that actual parameter.
So if you're seeing more results with the refined process, it means that on any given day, at any given time, when someone does a general search, the Google system will filter out a lot of content that isn't seen as highly valuable for that particular search. So all those extra pages that come up in your refined check - many of them are most likely then evaluated as less than highly valuable / high quality or relevant to most searches.
Even if many are great pages, their system has multiple algorithms that have to be run to assign value. What you are seeing is those processes struggling to sort it all out.
-
about 839,000 results.
-
Different data center perhaps - what about if you add in the "dp" query to the string?
-
I actually see 'about 897,000 results' for the search 'site:www.newark.com'.
-
Thanks Adrian,
I understand those areas of inaccuracy, but I didn't expect to see a refined search produce more results than the original search. That just seems a little bizarre to me, which is why I was wondering if there was a clear explanation or if I was executing my query incorrectly.
Ben
-
This is an expected 'oddity' of the site: operator. Here is a video of Matt Cutts explaining the imprecise nature of the site: operator.