How long does it take for customized Google Site Search to show results from pdf files?
-
The site in question is http://www.ejmh.eu
I am pretty unsatisfied with the results I am getting from the Site Search provided by Google.
We have over 160 pdf files in this subfolder: http://www.ejmh.eu/mellekletek
The files are the digital versions of articles. When I search for content in those PDF files, Google does not show results. It does show results from older pages dating back 1-2 years, but it is not showing anything from the PDF files that I put up 3 weeks ago.
My questions:
If I place a Google Search on a site, does it not automatically display results from ALL the content in the root domain?
Is there any correlation between how the Site Search is indexing the files and how Google is indexing the urls in general?
Should I just wait and see whether site search performance improves or should I switch to another Search software like Zoom Search?
It is vital to have a proper, high-quality search functioning on that site in the very near future.
What are your experiences? Any tips are greatly appreciated.
-
Hi, everyone: problem solved.
Here is what I did: I created a separate sitemap XML file and linked to all the new PDFs in it.
I updated the general sitemap.xml and linked to the new sitemap as well.
I (re)submitted both sitemaps via Webmaster Tools.
Within a few hours, most of the PDFs got indexed and the overall quality of search improved dramatically. Thanks for all your help.
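For anyone wanting to reproduce the steps above, here is a minimal sketch in Python. It assumes the two sitemaps are tied together with a sitemap index file, which is one way the sitemaps.org protocol allows; the post does not say exactly how the general sitemap.xml was linked to the new one. The PDF file names and the sitemap file names (`sitemap-pdf.xml`, `sitemap-pages.xml`) are hypothetical placeholders, not the real files on the site.

```python
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(urls):
    """Return a sitemap (<urlset>) listing each URL in a <url><loc> entry."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(root, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index (<sitemapindex>) pointing at several sitemaps."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical PDF names; replace with the real files in the subfolder.
    pdf_urls = [
        "http://www.ejmh.eu/mellekletek/article_001.pdf",
        "http://www.ejmh.eu/mellekletek/article_002.pdf",
    ]
    print(build_urlset(pdf_urls))

    # Hypothetical sitemap file names, assumed to live in the site root.
    print(build_sitemap_index([
        "http://www.ejmh.eu/sitemap-pages.xml",
        "http://www.ejmh.eu/sitemap-pdf.xml",
    ]))
```

The generated files still have to be uploaded to the site and (re)submitted in Webmaster Tools, as described above.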
-
It may be a good idea to include all the PDF files in the sitemap, even if it is a troublesome process.
Otherwise it just takes too long for Google to index them.
What still surprises me is that even for a site search, you need to win the 'indexing battle'. I thought that Google indexes everything within the sitemap for the sake of the site search and displays the results when a visitor searches within the site. Less fancy software actually does this. I thought a Google Site Search would provide something even better.
-
Last crawl - thanks, great info.
Yes, all the new PDFs are linked from the HTML files.
This is the summary page of one article: http://www.ejmh.eu/5archives_ppr_jaggle_061.html
In the middle of the page, you see 'download full text' - this is from where the individual papers (pdf) are linked.
-
Are the new PDFs linked from pages in the same way as the old ones?
Try creating a page that lists all the new PDFs. Google may simply need time to recrawl your site and add these new PDFs (by the way, the last copy saved in Google Cache is from Feb 11).
-
You are great, thanks for your time. Yeah, I did check things out with this Google command: there are PDFs listed, but these are all old PDFs I put up a long time ago. None of the PDFs I have put up recently are among those indexed.
Do you think that a customized site search only returns URLs that are already indexed by Google? Does Google not crawl the site and make a list of URLs purely for the sake of the site search? (Zoom Search does it, for example.) In theory, there could be two different types of 'crawl': one for the site search and one for the larger world, searching in the browser.
As for the settings... can you please help me further: what exactly would you change?
-
If you check here, all the PDFs are indexed in Google,
so I would check the settings on the CSE.
Reference: http://www.google.com/cse/docs/resultsxml.html#wsQueryTerms
-
Thanks for the tip, it's a good one. But they are all 100% text.
-
If a search engine cannot read the text because it is a graphic rather than text, it won't be able to fully index the words in the document.
So make sure all your PDFs are 100% text that was converted to PDF, and not a scan (image) of the original document that was saved as a PDF.