Ranking for our members' company names without giving them all away!
-
Hi,
We have a directory of 25,000-odd companies that use our site.
We have a strong PR site and want to rank a page for each company name. Initial testing on one or two company names brings us to #2, just after the company's own website, with pages in the format "Company Name Reviews and Feedback" - so it works well.
We want to do this for all 25,000 of our members; however, we do not wish to make it easy for our competitors to scrape our member database!
e.g. using: www.ourdomain.com/randomstring/company-name-(profile).php
Unfortunately, with the above, performing a Google search for site:domain.com/()/()(profile).php would still bring up all records.
Are there any tried and tested ways of achieving what we're after here?
Many Thanks.
-
Bottom line, you cannot make data available online without offering a means for a user to grab that data.
You said you "don't wish to make it easy", so I will share some ideas:
-
EGOL's suggestion is good and not that hard to implement. I am not sure whether your site requires registration, but you can set it up so guests can view a maximum of around 20 member pages, or whatever amount you deem to be a reasonable number.
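To sketch how that guest cap could work (this is purely illustrative, not from the thread - the 20-page limit, route, and template names are all assumptions), a session-based counter is usually enough to stop casual bulk viewing:

```python
from flask import Flask, session, render_template

app = Flask(__name__)
app.secret_key = "change-me"  # required for signed session cookies

GUEST_VIEW_LIMIT = 20  # hypothetical "reasonable number" of free profile views

@app.route("/company/<slug>")
def member_profile(slug):
    # Only anonymous guests are counted; logged-in members are not capped.
    if not session.get("user_id"):
        views = session.get("guest_profile_views", 0) + 1
        session["guest_profile_views"] = views
        if views > GUEST_VIEW_LIMIT:
            # Prompt the visitor to register instead of serving the profile.
            return render_template("register_prompt.html"), 403
    return render_template("company_profile.html", slug=slug)
```

A determined scraper can simply discard the session cookie, so treat this as raising the cost of scraping rather than preventing it. Search engine crawlers typically don't persist cookies, so a cookie-based cap shouldn't affect crawling.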
-
There are more involved methods as well, such as a script that will block any IP or user who pulls too many pages too quickly.
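Purely as an illustration of that kind of script (the thread doesn't prescribe one - the window size, request ceiling, and in-process dictionary are assumptions; a real deployment would use shared storage such as Redis):

```python
import time
from collections import defaultdict, deque
from flask import Flask, request, abort

app = Flask(__name__)

WINDOW_SECONDS = 60            # hypothetical sliding window
MAX_REQUESTS_PER_WINDOW = 30   # hypothetical per-IP ceiling

_hits = defaultdict(deque)     # IP address -> timestamps of recent requests

@app.before_request
def throttle_by_ip():
    ip = request.remote_addr
    now = time.time()
    recent = _hits[ip]
    # Discard timestamps that have fallen outside the window.
    while recent and now - recent[0] > WINDOW_SECONDS:
        recent.popleft()
    recent.append(now)
    if len(recent) > MAX_REQUESTS_PER_WINDOW:
        abort(429)  # Too Many Requests
```

Without an exemption for verified search engine crawlers, an aggressive limit like this can also slow Googlebot down, so tune the threshold carefully.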
-
The real challenge is your sitemap. If all a scraper needs is the company's name, then your sitemap alone gives them everything. In that case, there is simply nothing I can think of that you can do.
-
If the sitemap isn't the problem, another idea is to present the data in a format that is not easy to read automatically. You can leave the description information in HTML but present the company name in Flash, for example.
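Flash has since been discontinued, but the same obfuscation idea can be approximated by rendering the name as an image on the server. This is only a hypothetical sketch (the Pillow dependency, route, and stand-in lookup table are assumptions), and it cuts both ways: text hidden from scrapers is also hidden from search engines.

```python
from io import BytesIO
from PIL import Image, ImageDraw, ImageFont
from flask import Flask, send_file

app = Flask(__name__)

# Stand-in for the real member database lookup.
COMPANY_NAMES = {"acme-widgets": "Acme Widgets Ltd"}

@app.route("/company-name/<slug>.png")
def company_name_image(slug):
    name = COMPANY_NAMES.get(slug, "Unknown company")
    img = Image.new("RGB", (400, 40), "white")
    ImageDraw.Draw(img).text((10, 10), name, fill="black", font=ImageFont.load_default())
    buf = BytesIO()
    img.save(buf, format="PNG")
    buf.seek(0)
    return send_file(buf, mimetype="image/png")
```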
Bottom line, if you want to rank well, the site has to be easy to crawl. If the crawl data offers enough information for others to steal, there is simply no reasonable method that can be used to prevent automated tools from grabbing it.
-
-
If you have links into all of these 25,000 pages, then people and robots will be able to find them.
If you want to keep robot scrapers out, you can use a whitelist of robots that lets search engine crawlers and other approved automated visitors in but instructs all others to keep out. There is no guarantee that they will not find a way in with this, but it might help. Human scraping will still get through.
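As a rough example of such a whitelist expressed in robots.txt (the bot names and blanket rule are illustrative, and robots.txt is purely advisory - badly behaved scrapers can simply ignore it, so pairing it with a server-side user-agent or IP check is stronger):

```
# Allow the major search engine crawlers everywhere.
User-agent: Googlebot
Disallow:

User-agent: Bingbot
Disallow:

# Tell every other robot to stay out.
User-agent: *
Disallow: /
```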
You could probably also devise a way to throttle the number of pageviews per visitor or per IP, but that would take some creative programming.