Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
No Index PDFs
-
Our products have about 4 PDFs a piece, which really inflates our indexed pages. I was wondering if I could add REL=No Index to the PDF's URL? All of the files are on a file server, so they are embedded with links on our product pages. I know I could add a No Follow attribute, but I was wondering if any one knew if the No Index would work the same or if that is even possible. Thanks!
-
The files aren't duplicate. I am familiar with using the XRobots tag. I was really just curious if my theory would work.
Thanks for all your input.
-
Hi Monica,
I presume you already check all the options before posting this question. I have concluded this by seeing your others posts/reply in this community.

Now here is my answer
To prevent your PDF file (or any non HTML file) from being listed in search results, the only way is to use the HTTP X-Robots-Tag response header, e.g.:
X-Robots-Tag: noindex
robots.txt does not prevent your page from being listed in search results.
What it does is stop the bot from crawling your page, but if a third party links to your PDF file from their website, your page will still be listed.
If you stop the bot from crawling your page using robots.txt, it will not have the chance to see the X-Robots-Tag: noindex response tag. Therefore, never ever ever disallow a page in robots.txt if you employ the X-Robots-Tag header.
I hope it helps but not very sure.

Thanks
-
-
If you want to deindex all PDF files, I recommend using the x-robots-tag in .htaccess - https://yoast.com/x-robots-tag-play/
-
If the PDFs are pdf versions of existing pages, I would set canonicals to point to the URL you do want indexed (#2 on http://moz.com/blog/htaccess-file-snippets-for-seos )
-
-
If the pdf's are in a separate folder on your site - you could mark that folder as noindex in robots.txt
As far as I know, it's not possible to add a noindex to a link.
rgds
Dirk
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Keywords are indexed on the home page
Hello everyone, For one of our websites, we have optimized for many keywords. However, it seems that every keyword is indexed on the home page, and thus not ranked properly. This occurs only on one of our many websites. I am wondering if anyone knows the cause of this issue, and how to solve it. Thank you.
Technical SEO | | Ginovdw1 -
Adding Schema and No index tags via GTM
If we were to deploy schema and noindex tags to our website via Google tag manager, would these tags be viewed and respected by other search engines?
Technical SEO | | GregLB0 -
Why images are not getting indexed and showing in Google webmaster
Hi, I would like to ask why our website images not indexing in Google. I have shared the following screenshot of the search console. https://www.screencast.com/t/yKoCBT6Q8Upw Last week (Friday 14 Sept 2018) it was showing 23.5K out 31K were submitted and indexed by Google. But now, it is showing only 1K 😞 Can you please let me know why might this happen, why images are not getting indexed and showing in Google webmaster.
Technical SEO | | 21centuryweb0 -
Google not Indexing images on CDN.
My URL is: https://bit.ly/2hWAApQ We have set up a CDN on our own domain: https://bit.ly/2KspW3C We have a main xml sitemap: https://bit.ly/2rd2jEb and https://bit.ly/2JMu7GB is one the sub sitemaps with images listed within. The image sitemap uses the CDN URLs. We verified the CDN subdomain in GWT. The robots.txt does not restrict any of the photos: https://bit.ly/2FAWJjk. Yet, GWT still reports none of our images on the CDN are indexed. I ve followed all the steps and still none of the images are being indexed. My problem seems similar to this ticket https://bit.ly/2FzUnBl but however different because we don't have a separate image sitemap but instead have listed image urls within the sitemaps itself. Can anyone help please? I will promptly respond to any queries. Thanks
Technical SEO | | TNZ
Deepinder0 -
My video sitemap is not being index by Google
Dear friends, I have a videos portal. I created a video sitemap.xml and submit in to GWT but after 20 days it has not been indexed. I have verified in bing webmaster as well. All videos are dynamically being fetched from server. My all static pages have been indexed but not videos. Please help me where am I doing the mistake. There are no separate pages for single videos. All the content is dynamically coming from server. Please help me. your answers will be more appreciated................. Thanks
Technical SEO | | docbeans0 -
Question on noscript tags and indexing
If I have a <noscript>tag on every page of my website with the same sentence over and over saying something to the effect of "Sorry our site uses Javascript, please enable javascript for the full site experience.", Webmaster Tools will tell me that one of the most common words on my site is "Javascript".</p> <p>Is this something to be concerned about from an SEO perspective? My site is obviously not about Javascript and I don't want to dilute my page's topic or authority by repeating words that are not relevant to the topic of my site.</p> <p>Thanks!</p></noscript>
Technical SEO | | IrvCo_Interactive0 -
Duplicate content problem from an index.php file
Hi One of my sites is flagging a duplicate content problem which is affecting the search rankings. The duplicate problem is caused by http://www.mydomain.com/index.php which has a page rank of 26 How can I sort the duplicate content problem, as the main page should just be http://www.mydomain.com which has a page rank of 42 and is the stronger page with stronger links etc Many Thanks
Technical SEO | | ocelot0 -
Block a sub-domain from being indexed
This is a pretty quick and simple (i'm hoping) question. What is the best way to completely block a sub domain from getting indexed from all search engines? One item i cannot use is the meta "no follow" tag. Thanks! - Kyle
Technical SEO | | kchandler0