Ensuring Assets (PDFs, PowerPoint Files, Word Docs, etc.) are Indexable on Site
-
Hi there - I'm working on an educational site in which users will be able to search our repository of PDF articles, PowerPoint files, and so on through an on-site search engine. What is the best way to ensure each of these documents/assets are indexable by Google since they technically don't reside on an HTML page....they are just pulled up if the user searches for them? The site itself is just a few pages, but the files, articles, and videos in the repository are in the hundreds. Should I just name and tag them properly and make sure they're all included in an XML site map? Anything else suggested?
Thanks very much!
-
The more links a sitemap the it harder it is for people to follow but should be ok for search spiders.
-
Thanks for your response Chris! Good suggestion on the HTML sitemap. Any concerns if there are a couple of hundred links on this HTML site map page?
-
I would build 2 sitemaps for these files, 1 XML sitemap and 1 HTML sitemap, separate from the main sitemap and add these to Google WMT. The HTML Sitemap could also be used as a directory for visitors too.
Where possible link to the documents from the site too, this will increase the chances that the assets are indexed by Google.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My brand name has 2 words but Google only indexing as 1 word. Is there a fix?
Hi all...I'm at a loss. I've never had this happen. Google only shows pages of my site when I search the brand name as one word. When I Google the site as one word BrandBrand- it only shows my blog page and about us page plus Twitter and Facebook on page 1. The homepage does not show up at all. When I Google the site as two words Brand Brand - My Facebook page is on page 1 but nothing else. The homepage isn't showing up at all. When I search both words on Bing and Yahoo both are indexing it as two words and shows on page 1. Any ideas?
Technical SEO | | TexasBlogger0 -
Why is my site not being indexed?
Hi, I have performed a site:www.menshealthanswers.co.uk search on Google and none of the pages are being indexed. I do not have a "noindex" value on my robot tag This is what is in place: Any ideas? Jason
Technical SEO | | Jason_Marsh1230 -
Removing a staging area/dev area thats been indexed via GWT (since wasnt hidden) from the index
Hi, If you set up a brand new GWT account for a subdomain, where the dev area is located (separate from the main GWT account for the main live site) and remove all pages via the remove tool (by leaving the page field blank) will this definately not risk hurting/removing the main site (since the new subdomain specific gwt account doesn't apply to the main site in any way) ?? I have a new client who's dev area has been indexed, dev team has now prevented crawling of this subdomain but the 'the stable door was shut after the horse had already bolted' and the subdomains pages are on G's index so we need to remove the entire subdomain development area asap. So we are going to do this via the remove tool in a subdomain specific new gwt account, but I just want to triple check this wont accidentally get main site removed too ?? Cheers Dan
Technical SEO | | Dan-Lawrence0 -
Why did Google stop indexing my site?
Google used to crawl my site every few minutes. Suddenly it stopped and the last week it indexed 3 pages out of thousands. https://www.google.co.il/#q=site:www.yetzira.com&source=lnt&tbs=qdr:w&sa=X&ei=I9aTUfTTCaKN0wX5moCgAw&ved=0CBgQpwUoAw&bav=on.2,or.r_cp.r_qf.&fp=cfac44f10e55f418&biw=1829&bih=938 What could cause this to happen and how can I solve this problem? Thanks!
Technical SEO | | JillB20130 -
How Often is Site Crawled
Good morning- I saw some errors in my first crawl and immediately removed the pages from my website. I then re-created my XML sitemap and uploaded to Google. The question I have is will the site be crawled to recognize the changes in the next day or so? The pages were just placed on the site as test pages and never removed. The initial crawl that notified me it was done found the errors and were removed. Thanks for your help. Peter
Technical SEO | | VT_Pete0 -
Problem with indexing
Hello, we've changed our CMS recently, everything seems to work well, but for some reason google, and other crawlers can't see or index other pages than main. There is no restriction in robots, nor any other visible issue. Please help if you can. Website: http://www.design-glassware.com/
Technical SEO | | divan0 -
Https indexed - though a no index no follow tag has been added
Hi, The https-pages of our booking section are being indexed by Google. We added But the pages are still being indexed. What can I do to exclude these URL's from the Google index? Thank you very much in advance! Kind regards, Dennis Overbeek ACSI Publishing | dennis@acsi.eu
Technical SEO | | SEO_ACSI0 -
Google has not indexed my site in over 4 weeks, what's the problem?
We recently put in permanent redirects to our new url, but Google seems to not want to index the new url. There was no problems with the old url and the new url is brand new so should have no 'black marks' against it. We have done everything we can think off in terms of submitting site maps, telling google our url has changed in webmaster tools, mentioning the new url on social sites etc...but still nothing. It has been over 4 weeks now since we set up the redirects to the url, any ideas why Google seems to be choosing not to index it? Thanks
Technical SEO | | cewe0