How long does it take for customized Google Site Search to show results from pdf files?
-
The site in question is http://www.ejmh.eu
I am pretty unsatisfied with the results I am getting from the Site Search provided by Google.
We have over 160 pdf files in this subfolder: http://www.ejmh.eu/mellekletek
The files are the digital versions of articles. When I search for content in those PDF files, Google does not show results. It does show results from older pages dating back 1-2 years, but it shows nothing from the PDF files that I put up 3 weeks ago.
My questions:
If I place a Google Search on a site, does it not automatically display results from ALL the content in the root domain?
Is there any correlation between how the Site Search is indexing the files and how Google is indexing the urls in general?
Should I just wait and see whether site search performance improves, or should I switch to other search software such as Zoom Search?
It is vital to have a proper, high-quality search functioning on that site in the very near future.
What are your experiences? Any tips are greatly appreciated.
-
Hi, everyone: problem solved.
Here is what I did: I created a separate sitemap XML file and listed all the new PDFs in it.
I updated the general sitemap.xml and linked to the new sitemap as well.
I (re)submitted both sitemaps via Webmaster Tools.
Within a few hours, most of the PDFs got indexed and the overall quality of the search improved dramatically. Thanks for all your help.
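For anyone who wants to script the same fix, here is a minimal sketch in Python of generating a PDF-only sitemap plus a sitemap index that points at both files. The PDF file names and the sitemap file names are hypothetical placeholders, not the real ones from the site. Using a sitemap index is also an assumption: the sitemaps.org protocol references one sitemap from another that way, but the post does not say exactly how the general sitemap was linked to the new one.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>\n'


def build_pdf_sitemap(pdf_urls):
    """Return a sitemap XML string with one <url><loc> entry per PDF URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in pdf_urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return XML_DECLARATION + ET.tostring(urlset, encoding="unicode")


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index XML string that references each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return XML_DECLARATION + ET.tostring(index, encoding="unicode")


# Hypothetical PDF names; replace with the real files in /mellekletek.
example_pdf_urls = [
    "http://www.ejmh.eu/mellekletek/example-article-1.pdf",
    "http://www.ejmh.eu/mellekletek/example-article-2.pdf",
]

# Hypothetical sitemap file names.
example_sitemaps = [
    "http://www.ejmh.eu/sitemap.xml",
    "http://www.ejmh.eu/sitemap-pdf.xml",
]

pdf_sitemap_xml = build_pdf_sitemap(example_pdf_urls)
sitemap_index_xml = build_sitemap_index(example_sitemaps)
```

Write each string to its own file in the site root, then submit them via Webmaster Tools as described above.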
-
It may be a good idea to include all the pdf files on the sitemap, even if it is a troublesome process.
Otherwise it just takes too long for Google to index them.
What still surprises me is that even for a site search, you need to win the 'indexing battle'. I thought that Google indexes everything within the map for the sake of the site search and displays the results when a visitor searches within the site. Less fancy software actually does that job. I thought Google Site Search would provide something even better.
-
Last crawl - thanks, great info.
Yes, all the new PDFs are linked from the HTML files.
This is the summary page of one article: http://www.ejmh.eu/5archives_ppr_jaggle_061.html
In the middle of the page, you see 'download full text' - this is from where the individual papers (pdf) are linked.
-
Do you have the new PDFs linked from pages, like the old ones?
Try creating a page that lists all the new PDFs. Google may simply need time to recrawl your site and add them (by the way, the last copy saved in Google Cache is from Feb 11).
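A listing page like the one suggested here can be generated with a few lines of Python. This is only a sketch: the article titles, PDF file names, and page title are hypothetical placeholders.

```python
from html import escape


def build_pdf_listing(pdf_links):
    """Return a minimal HTML page with one link per (title, url) pair."""
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for title, url in pdf_links
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head><title>Full-text articles (PDF)</title></head>\n"
        "<body>\n  <ul>\n" + items + "\n  </ul>\n</body>\n</html>\n"
    )


# Hypothetical titles and file names; replace with the real articles.
example_links = [
    ("Example article 1", "http://www.ejmh.eu/mellekletek/example-article-1.pdf"),
    ("Example article 2", "http://www.ejmh.eu/mellekletek/example-article-2.pdf"),
]

listing_html = build_pdf_listing(example_links)
```

Save the result as an HTML page and link to it from a page Google already crawls, so the crawler has a path to every new PDF.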
-
You are great, thanks for your time. Yeah, I did check things out with this Google command: there are PDFs listed, but these are all old PDFs that I put up a long time ago. None of the PDFs I have put up recently are among those indexed.
Do you think that only URLs indexed by Google come up through a customized site search? Does Google not crawl the site and make a list of URLs purely for the sake of the site search? (Zoom Search does this, for example.) In theory, there could be two different types of 'crawls': one for the site search and one for the larger world, searching in the browser.
As for the settings... can you please help me further: what exactly would you change?
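The exact Google command used for the check is not shown in this thread. Assuming it was a query combining the standard `site:` and `filetype:` operators, a small Python sketch can build the search URL for it; the function name is made up for illustration.

```python
from urllib.parse import urlencode


def indexed_pdf_query_url(site_path):
    """Build a Google search URL listing the indexed PDFs under site_path."""
    query = f"site:{site_path} filetype:pdf"
    return "https://www.google.com/search?" + urlencode({"q": query})


# The PDF subfolder mentioned in the question.
check_url = indexed_pdf_query_url("www.ejmh.eu/mellekletek")
```

Open the resulting URL in a browser and compare the listed files with the PDFs uploaded recently.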
-
If you check here, all the PDFs are indexed in Google,
so I will check the settings on the CSE.
Reference here: http://www.google.com/cse/docs/resultsxml.html#wsQueryTerms
-
Thanks for the tip, it's a good one. But they are all 100% text.
-
If a search engine cannot read the text because it is a graphic rather than text, it won't be able to fully index the words in the document.
So make sure all your PDFs are 100% text that was converted to PDF, not a scan (image) of the original document saved as a PDF.