Help finding website content scraping
Hi,
I need a tool to help me review sites that are plagiarising / directly copying content from my site. But the tools I'm aware of, such as Copyscape, appear to work with individual URLs rather than a root domain. That's great if you have a particular post or page you want to check. But in this case, some sites are scraping 1000s of product pages, so I need to submit the root domain rather than an individual URL.
In some cases, other sites are being listed in SERPs above, or even instead of, our site for product search terms. But so far I have only stumbled across these cases, rather than proactively researching offending sites.
So I want to enter my root domain and have the tool review all my internal site pages, then report the other domains where an individual page has a certain amount of duplicated copy. It would work the same way Moz crawls a site for internal duplicate pages, but externally. I need a list of duplicate content by domain & URL, so that I can contact the offending sites to request they remove the content, and send the list to Google as evidence if they don't.
Any help would be greatly appreciated.
Terry
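The check described above (compare each of your own pages against suspected copies and list the matches by domain and URL once they pass a duplication threshold) could be sketched roughly like this. It is only an illustration under stated assumptions, not how Copyscape or Moz work: the function names `shingles`, `overlap` and `find_copies`, the 5-word shingle size and the 0.5 threshold are all made up for the example, and it assumes you have already fetched the visible text of your own pages and of the candidate pages by some other means.

```python
from collections import defaultdict
from urllib.parse import urlparse
import re


def shingles(text, k=5):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Very short pages become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def overlap(own, other):
    """Fraction of our page's shingles that also appear on the other page."""
    if not own:
        return 0.0
    return len(own & other) / len(own)


def find_copies(own_pages, candidate_pages, threshold=0.5, k=5):
    """Group suspected copies by external domain.

    own_pages / candidate_pages: dicts mapping URL -> page text (assumed
    to be fetched already). threshold is a hypothetical cut-off for how
    much of our page must reappear before it is reported.
    """
    own_sets = {url: shingles(text, k) for url, text in own_pages.items()}
    report = defaultdict(list)
    for other_url, other_text in candidate_pages.items():
        other_set = shingles(other_text, k)
        domain = urlparse(other_url).netloc
        for own_url, own_set in own_sets.items():
            score = overlap(own_set, other_set)
            if score >= threshold:
                report[domain].append((own_url, other_url, round(score, 2)))
    return dict(report)
```

Calling `find_copies(own_pages, candidate_pages)` returns a dict keyed by external domain, each holding `(your URL, their URL, score)` tuples, which is roughly the "duplicate content by domain & URL" list asked for. Finding the candidate pages in the first place (for example by searching for distinctive phrases from your product copy) is the harder part and is not covered here.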