Content from Another Site
-
Hi there -
I have a client that says they'll be "serving content by retrieving it from another URL using loadHTMLFile, performing some manipulations on it, and then pushing the result to the page using saveHTML()." Just wondering what the SEO implications of this will be. Will search engines be able to crawl the retrieved content? Is there a downside (I'm assuming we'll have some duplicate content issues)?
Thanks for the help!!
-
Hi,
Are you referring to the PHP functions there (DOMDocument's loadHTMLFile and saveHTML)? If so, the content will be rendered server-side, so Google will have no problem crawling it, unlike on some (not all) websites that depend on JavaScript to render their content.
Regarding duplicate content issues, Donald Silvernail is absolutely correct that a cross-domain canonical is the best practice here.
Rand did an excellent Whiteboard Friday that explains it: https://moz.com/blog/cross-domain-rel-canonical-seo-value-cross-posted-content
Hope this helps!
Nick
-
You would definitely have to set the canonical link on your client's page to point to the URL of the original content.
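To make the canonical suggestion concrete, here is a minimal sketch of the "manipulate, then output" step, written in Python rather than PHP purely for illustration. The function name `add_cross_domain_canonical` and the example URLs are hypothetical, and the regex-based approach is an assumption chosen for brevity; in the client's PHP setup the same tag would more likely be added to the parsed document before calling saveHTML().

```python
import re
from html import escape

# Matches an existing <link rel="canonical" ...> tag (assumes rel appears inside the tag).
CANONICAL_RE = re.compile(
    r'<link\b[^>]*\brel\s*=\s*["\']?canonical["\']?[^>]*>\s*',
    re.IGNORECASE,
)
HEAD_CLOSE_RE = re.compile(r'</head\s*>', re.IGNORECASE)


def add_cross_domain_canonical(page_html: str, original_url: str) -> str:
    """Return page_html with a single canonical tag pointing at original_url."""
    tag = f'<link rel="canonical" href="{escape(original_url, quote=True)}">'

    # Drop any canonical tag that was copied over with the fetched markup.
    cleaned = CANONICAL_RE.sub("", page_html)

    if not HEAD_CLOSE_RE.search(cleaned):
        raise ValueError("no </head> found; cannot place the canonical tag")

    # Insert the new tag just before the first closing </head>.
    return HEAD_CLOSE_RE.sub(lambda m: tag + "\n" + m.group(0), cleaned, count=1)


if __name__ == "__main__":
    fetched = "<html><head><title>Article</title></head><body><p>Text</p></body></html>"
    print(add_cross_domain_canonical(fetched, "https://example.com/original-article"))
```

The tag belongs in the page's head, and there should be only one of them, which is why the sketch removes any canonical tag that came along with the retrieved HTML before adding its own.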