Content from Another Site
-
Hi there -
I have a client that says they'll be "serving content by retrieving it from another URL using loadHTMLFile, performing some manipulations on it, and then pushing the result to the page using saveHTML()." Just wondering what the SEO implications of this will be. Will search engines be able to crawl the retrieved content? Is there a downside (I'm assuming we'll have some duplicate content issues)?
Thanks for the help!!
-
Hi,
Are you referring to PHP functions there? If so, content will be rendered server side and thus Google will have no problems crawling it, unlike some websites with JavaScript dependencies (not all).
Regarding duplicate content issues, Donald Silvernail is absolute correct in that using a cross domain canonical is undoubtedly best practice:
Rand has done an excellent White Board Friday on it, which explains it here: https://moz.com/blog/cross-domain-rel-canonical-seo-value-cross-posted-content
Hope this helps!
Nick
-
You would definitely have to set the canonical link to the original content.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Content from long Site Title
Hello! I have a number of "Duplicate Title Errors" as my website has a long Site Title: Planit NZ: New Zealand Tours, Bus Passes & Travel Planning. Am I better off with a short title that is simply my website/business name: Planit NZ My thought was adding some keywords might help with my rankings. Thanks Matt
Technical SEO | | mkyhnn0 -
New SEO manager needs help! Currently only about 15% of our live sitemap (~4 million url e-commerce site) is actually indexed in Google. What are best practices sitemaps for big sites with a lot of changing content?
In Google Search console 4,218,017 URLs submitted 402,035 URLs indexed what is the best way to troubleshoot? What is best guidance for sitemap indexation of large sites with a lot of changing content? view?usp=sharing
Technical SEO | | Hamish_TM1 -
Duplicate Content Reports
Hi Dupe content reports for a new client are sjhowing very high numbers (8000+) main of them seem to be for sign in, register, & login type pages, is this a scenario where best course of action to resolve is likely to be via the parameter handling tool in GWT ? Cheers Dan
Technical SEO | | Dan-Lawrence0 -
Removing links from another site
Hello, Some site that I have never been able to access as it is always down has over 3,000 links to my website. They disappeared the other week and our search queries dramatically improved but now they are back again in Google Webmaster and we have dropped again.I have contacted the site owner and got no response and I have also put in a removal form (though I am not sure this fits for that) and asked Google to remove as they have been duplicating our content also. It was in my pending section but has now disappeared.This links are really damaging our search and the site isnt even there. Do I have to list all 3,000 links in the link removal to Google or is there another way I can go about telling them the issue.Appreciate any help on this
Technical SEO | | luwhosjack0 -
What would you do if a site's entire content is on a subdomain?
Scenario: There is a website called mydomain.com and it is a new domain with about 300 inbound links (some going to the product pages and categories), but they have some high trust links The website has categories a, b, c etc but they are all on a subdomain so instead of being mydomain.com/categoryA/productname the entire site's structure looks like subdomain.mydomain.com/categoryA/productname Would you go to the effort of 301ing the subdomain urls to the correct url structure of mydomain.com/category/product name, or would you leave it as it is? Just interested as to the extent of the issues this could cause in the future and if this is something worth resolving sooner than later.
Technical SEO | | Kerry220 -
The Bible and Duplicate Content
We have our complete set of scriptures online, including the Bible at http://lds.org/scriptures. Users can browse to any of the volumes of scriptures. We've improved the user experience by allowing users to link to specific verses in context which will scroll to and highlight the linked verse. However, this creates a significant amount of duplicate content. For example, these links: http://lds.org/scriptures/nt/james/1.5 http://lds.org/scriptures/nt/james/1.5-10 http://lds.org/scriptures/nt/james/1 All of those will link to the same chapter in the book of James, yet the first two will highlight the verse 5 and verses 5-10 respectively. This is a good user experience because in other sections of our site and on blogs throughout the world webmasters link to specific verses so the reader can see the verse in context of the rest of the chapter. Another bible site has separate html pages for each verse individually and tends to outrank us because of this (and possibly some other reasons) for long tail chapter/verse queries. However, our tests indicated that the current version is preferred by users. We have a sitemap ready to publish which includes a URL for every chapter/verse. We hope this will improve indexing of some of the more popular verses. However, Googlebot is going to see some duplicate content as it crawls that sitemap! So the question is: is the sitemap a good idea realizing that we can't revert back to including each chapter/verse on its own unique page? We are also going to recommend that we create unique titles for each of the verses and pass a portion of the text from the verse into the meta description. Will this perhaps be enough to satisfy Googlebot that the pages are in fact unique? They certainly are from a user perspective. Thanks all for taking the time!
Technical SEO | | LDS-SEO0 -
Google caching meta tags from another site?
We have several sites on the same server. On the weekend we relocated some servers, changing IP address. A client has since noticed something freaky with the meta tags. 1. They search for their companyname, and another site from the same server appears in position 1. It is completely unrelated, has never happened before, and the company name is not used in any incoming text links. Eg search for company1 on Google. Company1.com.au appears at position 2, but at position1 is school1.com.au. The words company1 don't appear anywhere on the site. I've analysed all incoming links with a gazillion tools, and can't find any link text of company1, linking to school1. 2. Even more freaky, searching for company1.com.au at Google. The results at Google in position 1 for the last three days has been: Meta Title for school1 (but hovering/clicking actual goes to URL for company1)
Technical SEO | | ozgeekmum
Meta Description for school1
URL for company1.com.au Clicking on the cached copy of result1, it shows a cached version of school1 taken on March 18. Today is 29 March. Logically we are trying to get Google to spider both sites again quickly. We've asked the clients to update their home pages. Resubmitted xml sitemaps. Checked the HTTP status codes - both are happily returning 200s. Different cookies. I found another instance on a forum: http://webmasters.stackexchange.com/questions/10578/incorrect-meta-information-in-google Any ideas?0