How to tell if PDF content is being indexed?
-
I've searched extensively for this, but could not find a definitive answer.
We recently updated our website, and it contains links to about 30 PDF data sheets. I want to determine whether the text from these PDFs is being indexed by search engines.
When I do this search http://bit.ly/rRYJPe (Google: site:www.gamma-sci.com filetype:pdf), I can see that the PDF URLs are getting indexed, but does that mean that their content is getting indexed?
I have read in other posts/places that if you can copy text from a PDF and paste it, that means Google can index the content. When I try this with PDFs from our site I cannot copy text, but I was told that these PDFs were all created from Word docs, so they should be indexable, correct?
Since WordPress has you upload PDFs as if they were images, could this be causing the problem?
Would it make sense to take the time to extract all of the PDF content to HTML?
Thanks for any assistance, this has been driving me crazy.
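For anyone who wants to automate the copy-and-paste test described above, here is a rough sketch that looks for a text layer in a PDF using only the Python standard library. It is a heuristic, not a definitive check: it assumes the page content streams are either uncompressed or Flate-compressed, it will not work on encrypted files, and the function name `has_text_layer` is made up for this example. It also says nothing about what Google actually does with the file.

```python
import re
import zlib

# Raw bytes between "stream" and "endstream" for every stream object in the file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# Text-showing operators in a content stream: (string) Tj, <hex> Tj, or [array] TJ.
TEXT_OP_RE = re.compile(rb"[)>]\s*Tj|\]\s*TJ")


def has_text_layer(pdf_bytes: bytes) -> bool:
    """Return True if any stream appears to contain text-drawing operators."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes left after the data.
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not Flate-compressed (or another filter); inspect the raw bytes.
            data = raw
        if TEXT_OP_RE.search(data):
            return True
    return False
```

To try it on a downloaded data sheet, read the file as bytes (for example with `pathlib.Path("datasheet.pdf").read_bytes()`, where the file name is a placeholder) and pass the result to `has_text_layer`. A `False` result suggests the PDF is image-only, which would match not being able to copy text from it.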
-
Kyle,
Thanks for the quick response. The data from the PDFs is being displayed in the title and meta description fields. I also searched for specific terms, restricted to our site with the site: and filetype:pdf operators, and the results show that the content is being indexed. They also show that the PDF titles and meta descriptions are not optimized, so I have some work to do there.
Thanks,
Anthony
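The kind of search described above (specific terms, limited to one site and to PDFs) can be built as a URL so it is easy to repeat for each data sheet. This is only a sketch: the function name `pdf_content_query` and the sample phrase are placeholders, and the phrase should be replaced with text copied from one of the actual PDFs.

```python
from urllib.parse import urlencode


def pdf_content_query(domain: str, phrase: str) -> str:
    """Build a Google search URL for an exact phrase within PDFs on one site."""
    # Quoting the phrase asks for an exact match; site: and filetype: narrow the scope.
    query = f'site:{domain} filetype:pdf "{phrase}"'
    return "https://www.google.com/search?" + urlencode({"q": query})


# "spectral range" is a made-up sample phrase, not taken from the data sheets.
print(pdf_content_query("www.gamma-sci.com", "spectral range"))
```

If the PDF shows up for a phrase that appears only inside the document body, that suggests the content itself is indexed, not just the URL.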
-
Is the text displayed in the title and meta description in the SERP taken from the content of the PDF?
If so, then yes, the PDFs are being crawled and their content indexed.
Regards,
Kyle