Site: Query Question
-
Hi All,
I have a question about the site: query you can run on Google. I know it has lots of inaccuracies, but I like to keep a high-level eye on it over time.
I was also using it to try to get a high-level view of how many product pages were indexed vs. the total number of pages.
What is interesting is that when I do a site: query for, say, www.newark.com, I get ~748,000 results returned.
When I do a query for www.newark.com "/dp/", I get ~845,000 results returned.
Either I am doing something stupid or these numbers are completely backwards?
Any thoughts?
Thanks,
Ben
-
Barry Schwartz posted some great information about this in November of 2010, quoting a couple of different Google sources. In short, more specific queries can cause Google to dig deeper and give more accurate estimates.
-
Yup. Get rid of parameter-laden URLs and it's easy enough. If they hang around the index for a few months before disappearing, that's no big deal; as long as you have done the right thing, it will work out fine.
Also, you're not interested in the chaff, just the bits you want to make sure are indexed. So make sure those are in sensibly titled sitemaps and it's fine. (I've used this on sites with 50 million and 100 million product pages. It gets a bit more complex at that scale, but the underlying principle is the same.)
-
But then on a big site (talking 4m+ products) it's usually the case that you have URLs indexed that wouldn't be generated in a sitemap because they include additional parameters.
Ideally, of course, you rid the index of parameter-filled URLs, but it's pretty tough to do that.
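For anyone who wants to see how many indexed URLs are just parameter variants of a sitemap URL, here is a minimal Python sketch. It assumes you already have two plain lists of URLs (one of indexed URLs, one of sitemap URLs) from whatever export you use; the function names and the example.com URLs are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_params(url):
    """Return the URL with its query string and fragment removed."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def find_parameter_variants(indexed_urls, sitemap_urls):
    """Map each sitemap URL to the indexed URLs that differ from it
    only by extra query parameters or a fragment."""
    sitemap = set(sitemap_urls)
    variants = {}
    for url in indexed_urls:
        clean = strip_params(url)
        if url != clean and clean in sitemap:
            variants.setdefault(clean, []).append(url)
    return variants


# Hypothetical data, for illustration only.
sitemap_urls = ["https://www.example.com/dp/123"]
indexed_urls = [
    "https://www.example.com/dp/123",
    "https://www.example.com/dp/123?ref=abc",
    "https://www.example.com/dp/123?sort=price#top",
]
variants = find_parameter_variants(indexed_urls, sitemap_urls)
for clean_url, dupes in variants.items():
    print(clean_url, "has", len(dupes), "parameter variants indexed")
```

This only strips everything after the "?"; if some parameters actually change the page content, you would need to keep those and drop only the tracking or sorting ones.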
-
Best bet is to make sure all your URLs are in your sitemap, and then you get an exact count.
I've found it handy to use multiple sitemaps, one for each subfolder (e.g. /news/ or /profiles/), to be able to quickly see exactly what % of URLs are indexed from each section of my site. This is super helpful for finding errors in a specific section or when you are working on indexing a certain type of page.
S
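
A rough Python sketch of the per-subfolder idea, in case it helps. It groups URLs by their first path segment (one group per sitemap) and then works out the indexed percentage per section. The submitted and indexed counts are hypothetical numbers typed in by hand; in practice you would copy them from whatever sitemap report you use. All function names are my own.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def section_of(url):
    """Return the first path segment (e.g. 'news'), or 'root' for
    the home page and other top-level pages."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return segments[0] if len(segments) > 1 else "root"


def group_by_section(urls):
    """Group URLs by section so each group can become its own sitemap."""
    groups = defaultdict(list)
    for url in urls:
        groups[section_of(url)].append(url)
    return dict(groups)


def indexed_percent(submitted, indexed):
    """Percentage of submitted URLs reported as indexed, per section."""
    return {
        section: round(100 * indexed.get(section, 0) / count, 1)
        for section, count in submitted.items()
        if count
    }


# Hypothetical URLs and counts, for illustration only.
groups = group_by_section([
    "https://www.example.com/",
    "https://www.example.com/news/story-1",
    "https://www.example.com/news/story-2",
    "https://www.example.com/profiles/jane",
])
percentages = indexed_percent(
    {"news": 200, "profiles": 50},
    {"news": 150, "profiles": 50},
)
for section, pct in sorted(percentages.items()):
    print(f"{section}: {pct}% indexed")
```

A section with a much lower percentage than the others is usually the first place to look for crawl or duplicate content problems.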
-
What I've found is that the reason for this comes down to how the Google system works. Case in point: a client site I have with 25,000 actual pages. They have mass duplicate content issues. When I do a generic site: query with the domain, Google shows 50,000-60,000 pages. If I do an inurl: query with a specific URL param, I get either 500,000 or over a million.
Though that's not your exact situation, it can help explain what's happening.
Essentially, if you do a normal site: query, Google will try its best to show the content within the site that it presents to the world as "most relevant." When you do a refined check, it's naturally going to look for the content that is the closest match to that actual parameter.
So if you're seeing more results with the refined process, it means that on any given day, at any given time, when someone does a general search, the Google system will filter out a lot of content that isn't seen as highly valuable for that particular search. So all those extra pages that come up in your refined check - many of them are most likely then evaluated as less than highly valuable / high quality or relevant to most searches.
Even if many are great pages, their system has multiple algorithms that have to be run to assign value. What you are seeing is those processes struggling to sort it all out.
-
about 839,000 results.
-
Different data center perhaps - what about if you add in the "dp" query to the string?
-
I actually see 'about 897,000 results' for the search 'site:www.newark.com'.
-
Thanks Adrian,
I understand those areas of inaccuracy, but I didn't expect to see a refined search produce more results than the original search. That just seems a little bizarre to me, which is why I was wondering if there was a clear explanation or if I was executing my query incorrectly.
Ben
-
This is an expected 'oddity' of the site: operator. Here is a video of Matt Cutts explaining the imprecise nature of the site: operator.