Site: Query Question
-
Hi All,
I have a question about the site: query you can run on Google. I know it has lots of inaccuracies, but I like to keep a high-level view of it over time.
I was also using it to try to get a high-level view of how many product pages were indexed vs. the total number of pages.
What is interesting is when I do a site: query for say www.newark.com I get ~748,000 results returned.
When I do a query for www.newark.com "/dp/" I get ~845,000 results returned.
Either I am doing something stupid or these numbers are completely backwards?
Any thoughts?
Thanks,
Ben
-
Barry Schwartz posted some great information about this in November of 2010, quoting a couple of different Google sources. In short, more specific queries can cause Google to dig deeper and give more accurate estimates.
-
Yup. Get rid of parameter-laden URLs and it's easy enough. If they hang around the index for a few months before disappearing, that's no big deal; as long as you have done the right thing, it will work out fine.
Also, you're not interested in the chaff, just the bits you want to make sure are indexed. So make sure those are in sensibly titled sitemaps and it's fine. (I've used this on sites with 50 million and 100 million product pages. It gets a bit more complex at that number, but the underlying principle is the same.)
-
But then on a big site (talking 4m+ products) it's usually the case that you have URLs indexed that wouldn't be generated in a sitemap because they include additional parameters.
Ideally, of course, you rid the index of parameter-filled URLs, but it's pretty tough to do that.
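For anyone auditing their own URL list, here is a minimal Python sketch of separating parameter-filled URLs from clean ones. The function names and the example.com URLs are hypothetical, and it assumes that "parameter-filled" simply means the URL has a query string; it says nothing about how Google itself treats these URLs.

```python
from urllib.parse import urlsplit, urlunsplit


def has_parameters(url):
    """Return True if the URL carries a query string (e.g. ?sort=price)."""
    return bool(urlsplit(url).query)


def strip_parameters(url):
    """Return the URL with its query string and fragment removed."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def split_by_parameters(urls):
    """Partition URLs into (clean, parameterised) lists, preserving order."""
    clean, parameterised = [], []
    for url in urls:
        (parameterised if has_parameters(url) else clean).append(url)
    return clean, parameterised


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    sample = [
        "https://www.example.com/dp/123",
        "https://www.example.com/dp/123?sort=price&page=2",
    ]
    clean, parameterised = split_by_parameters(sample)
    print(len(clean), "clean,", len(parameterised), "with parameters")
    for url in parameterised:
        print(url, "->", strip_parameters(url))
```

The clean list is roughly what a generated sitemap would contain; the parameterised list is the set you would want to see drop out of the index over time.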
-
Best bet is to make sure all your URLs are in your sitemap and then you get an exact count of how many are indexed.
I've found it handy to use multiple sitemaps, one for each subfolder (e.g. /news/ or /profiles/), to be able to quickly see exactly what % of URLs are indexed from each section of my site. This is super helpful in finding errors in a specific section or when you are working on indexing of a certain type of page.
S
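The per-subfolder idea above could be sketched roughly like this in Python. It is only an illustration: the function names and example.com URLs are made up, the section is assumed to be the first path segment, and the submitted/indexed counts are assumed to be numbers you read off your own sitemap reports.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_section(urls):
    """Group URLs by first path segment (e.g. 'news' for /news/story-1).

    URLs with no path segment (the home page) go into a 'root' bucket.
    Each bucket would become its own sitemap file.
    """
    sections = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        section = segments[0] if segments else "root"
        sections[section].append(url)
    return dict(sections)


def indexed_percentage(indexed, submitted):
    """Percentage of submitted URLs reported as indexed (0.0 if none submitted)."""
    if submitted == 0:
        return 0.0
    return 100.0 * indexed / submitted


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    urls = [
        "https://example.com/news/story-1",
        "https://example.com/news/story-2",
        "https://example.com/profiles/jane",
        "https://example.com/",
    ]
    for section, members in group_by_section(urls).items():
        print(section, len(members))
    # Counts below are made-up numbers standing in for a sitemap report.
    print(indexed_percentage(indexed=150, submitted=200))
```

A section whose percentage is far below the others is the one to investigate first.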
-
What I've found is that the reason for this comes down to how the Google system works. Case in point: a client site I have with 25,000 actual pages. They have mass duplicate content issues. When I do a generic site: with the domain, Google shows 50,000-60,000 pages. If I do an inurl: with a specific URL param, I get either 500,000 or over a million.
Though that's not your exact situation, it can help explain what's happening.
Essentially, if you do a normal site: query, Google will try its best to show the content within the site that it presents to the world as "most relevant". When you do a refined check, it's naturally going to look for the content that really is most relevant to that check: the closest match to that actual parameter.
So if you're seeing more results with the refined process, it means that on any given day, at any given time, when someone does a general search, the Google system will filter out a lot of content that isn't seen as highly valuable for that particular search. So all those extra pages that come up in your refined check - many of them are most likely then evaluated as less than highly valuable / high quality or relevant to most searches.
Even if many are great pages, their system has multiple algorithms that have to be run to assign value. What you are seeing is those processes struggling to sort it all out.
-
about 839,000 results.
-
Different data center perhaps - what about if you add in the "dp" query to the string?
-
I actually see 'about 897,000 results' for the search 'site:www.newark.com'.
-
Thanks Adrian,
I understand those areas of inaccuracy, but I didn't expect to see a refined search produce more results than the original search. That just seems a little bizarre to me, which is why I was wondering if there was a clear explanation or if I was executing my query incorrectly.
Ben
-
This is an expected 'oddity' of the site: operator. Here is a video of Matt Cutts explaining the imprecise nature of the site: operator.