Site: Query Question
-
Hi All,
I have a question about the site: query you can run on Google. I know it has lots of inaccuracies, but I like to keep a high-level view of it over time.
I was also using it to try to get a rough sense of how many product pages are indexed vs. the total number of pages.
What is interesting is that when I do a site: query for, say, www.newark.com, I get ~748,000 results.
When I do a query for site:www.newark.com "/dp/" I get ~845,000 results.
Am I doing something stupid, or are these numbers completely backwards?
Any thoughts?
Thanks,
Ben
-
Barry Schwartz posted some great information about this in November of 2010, quoting a couple of different Google sources. In short, more specific queries can cause Google to dig deeper and give more accurate estimates.
-
Yup. Get rid of parameter-laden URLs and it's easy enough. If they hang around the index for a few months before disappearing, that's no big deal; as long as you have done the right thing, it will work out fine.
Also, you're not interested in the chaff, just the bits you want to make sure are indexed. So make sure those are in sensibly titled sitemaps and it's fine (I've used this on sites with 50 million and 100 million product pages; it gets a bit more complex at that scale, but the underlying principle is the same).
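To make "sensibly titled sitemaps" concrete, here is a minimal Python sketch (mine, not from the original reply) that splits URLs by section into separate sitemap files plus a sitemap index. The domain, section names, and file layout are placeholders only:

```python
import math
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # the sitemap protocol's per-file limit

def write_sitemap(path, urls):
    """Write a single sitemap file containing the given URLs."""
    with open(path, "w", encoding="utf-8") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for url in urls:
            f.write(f"  <url><loc>{escape(url)}</loc></url>\n")
        f.write("</urlset>\n")

def write_section_sitemaps(urls_by_section, base_url):
    """One (or more) sitemap files per site section, plus an index referencing them."""
    sitemap_urls = []
    for section, urls in urls_by_section.items():
        # Split oversized sections into numbered chunks: sitemap-products-1.xml, -2.xml, ...
        chunks = math.ceil(len(urls) / MAX_URLS_PER_SITEMAP) or 1
        for i in range(chunks):
            chunk = urls[i * MAX_URLS_PER_SITEMAP:(i + 1) * MAX_URLS_PER_SITEMAP]
            name = f"sitemap-{section}-{i + 1}.xml"
            write_sitemap(name, chunk)
            sitemap_urls.append(f"{base_url}/{name}")
    # The index is the single file you submit / reference in robots.txt.
    with open("sitemap-index.xml", "w", encoding="utf-8") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        f.write('<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for url in sitemap_urls:
            f.write(f"  <sitemap><loc>{escape(url)}</loc></sitemap>\n")
        f.write("</sitemapindex>\n")

write_section_sitemaps(
    {"products": ["https://www.example.com/dp/123", "https://www.example.com/dp/456"],
     "news": ["https://www.example.com/news/launch"]},
    "https://www.example.com",
)
```

You submit only the index; each per-section file then reports its own indexed count in Webmaster Tools.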
-
But then on a big site (talking 4m+ products) it's usually the case that you have URLs indexed that wouldn't be generated in a sitemap, because they include additional parameters.
Ideally, of course, you'd rid the index of parameter-filled URLs, but it's pretty tough to do that.
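One way to chip away at it (my sketch, not part of the original reply) is to compute a clean canonical target for each parameter-laden URL and point a rel=canonical tag at it. A minimal sketch, assuming the listed parameters never change the page content; the parameter names here are purely illustrative:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of parameters that only track or re-sort, never change content.
# A real site would derive this from its own URL scheme.
NON_CANONICAL_PARAMS = {"sort", "sessionid", "utm_source", "utm_medium", "utm_campaign"}

def canonical_url(url):
    """Drop non-canonical query parameters, keeping any that change the content."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in NON_CANONICAL_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

print(canonical_url("https://www.example.com/dp/123?sort=price&utm_source=feed"))
# -> https://www.example.com/dp/123
```

Emit that value in each page's rel=canonical link element and the parameter variants should consolidate over time.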
-
Your best bet is to make sure all of your URLs are in your sitemap; then you get an exact count.
I've found it handy to use multiple sitemaps, one per subfolder (e.g. /news/ or /profiles/), so I can quickly see exactly what percentage of URLs is indexed from each section of my site. This is super helpful for finding errors in a specific section or when you are working on indexing a certain type of page.
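For a rough sketch of that percentage check (my illustration, not from the original post): count the URLs in each per-section sitemap file, then divide the indexed figures, copied by hand from the per-sitemap report in Webmaster Tools (the numbers below are made up), by those totals:

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def count_sitemap_urls(path):
    """Count the <loc> entries in one sitemap file."""
    return len(ET.parse(path).getroot().findall("sm:url/sm:loc", NS))

# Indexed counts copied by hand from the per-sitemap report (illustrative numbers).
indexed = {"sitemap-news.xml": 880, "sitemap-profiles.xml": 4300}

for path, count in indexed.items():
    total = count_sitemap_urls(path)
    print(f"{path}: {count}/{total} indexed ({count / total:.1%})")
```

A section whose percentage suddenly drops is usually the one with the crawl or quality problem.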
S
-
What I've found is that the reason comes down to how Google's system works. Case in point: a client site of mine with 25,000 actual pages and massive duplicate-content issues. When I do a generic site: query with the domain, Google shows 50-60,000 pages. If I do an inurl: query with a specific URL parameter, I get either 500,000 or over a million.
Though that's not your exact situation, it can help explain what's happening.
Essentially, if you do a normal site: query, Google tries its best to show the content within the site that it presents to the world, based on what it judges "most relevant". When you do a refined check, it naturally looks for the content that is the closest match to that actual parameter.
So if you're seeing more results with the refined query, it means that on any given day, at any given time, when someone does a general search, Google's system filters out a lot of content that isn't seen as highly valuable for that particular search. Many of the extra pages that come up in your refined check have most likely been evaluated as less valuable, lower quality, or less relevant to most searches.
Even if many of them are great pages, Google runs multiple algorithms to assign value, and what you are seeing is those processes struggling to sort it all out.
-
About 839,000 results.
-
A different data center, perhaps. What happens if you add the "/dp/" part to the query?
-
I actually see 'about 897,000 results' for the search 'site:www.newark.com'.
-
Thanks, Adrian.
I understand those areas of inaccuracy, but I didn't expect a refined search to produce more results than the original search. That just seems a little bizarre to me, which is why I was wondering whether there is a clear explanation or whether I was executing my query incorrectly.
Ben
-
This is an expected 'oddity' of the site: operator. Here is a video of Matt Cutts explaining its imprecise nature.