Ranking for our members' company names without giving them all away!
-
Hi,
We have a directory of 25,000-odd companies that use our site.
We have a strong PR site and want to rank a page for each company name. Some initial testing on one or two company names brings us to #2, just after the company's own website, with pages in the format "Company Name Reviews and Feedback" - so it works well.
We want to do this for all 25,000 of our members; however, we do not wish to make it easy for our competitors to scrape our member database!
For example, with URLs like www.ourdomain.com/randomstring/company-name-(profile).php, a Google search for site:domain.com/()/()(profile).php would unfortunately bring up all the records.
Are there any tried and tested ways of achieving what we're after here?
Many Thanks.
-
Bottom line, you cannot make data available online without offering a means for a user to grab that data.
You said you "don't wish to make it easy", so I will share some ideas:
-
EGOL's suggestion is good and not that hard to implement. I am not sure whether your site requires registration, but you can set it up so guests can view a maximum of ~20 member pages, or whatever amount you deem to be a reasonable number.
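A minimal sketch of that guest cap might look like this (the 20-page limit, function name, and session shape are illustrative assumptions, not something prescribed in this thread):

```python
# Sketch of a per-guest view cap. `session` stands in for whatever
# per-visitor store the site already has (a signed cookie, a
# server-side session, etc.). Registered members would bypass this.

GUEST_PAGE_LIMIT = 20  # member pages a non-registered visitor may view

def may_view_profile(session):
    """Return True if this guest may view another member page.

    Increments the counter on each allowed view; once the limit is
    reached, the caller should redirect to a registration page instead.
    """
    views = session.get("profile_views", 0)
    if views >= GUEST_PAGE_LIMIT:
        return False
    session["profile_views"] = views + 1
    return True
```

The point is that the counter lives server-side (or in a signed cookie), so a casual scraper burning through anonymous sessions at least has to work for it.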
-
A more involved method is a script that blocks any IP or user that pulls too many pages too quickly.
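One common way to implement that is a sliding-window rate limit per IP. The thresholds below are purely illustrative; in practice this would sit in front of the page handler (or in the web server itself):

```python
import time
from collections import defaultdict, deque

# Sliding-window rate limiter: block an IP once it exceeds
# MAX_REQUESTS within any WINDOW_SECONDS period. Limits are
# illustrative, not recommendations.

MAX_REQUESTS = 30
WINDOW_SECONDS = 60

_hits = defaultdict(deque)  # ip -> timestamps of recent requests

def allow_request(ip, now=None):
    """Return False once an IP exceeds the rate limit."""
    now = time.monotonic() if now is None else now
    window = _hits[ip]
    # Drop timestamps that have aged out of the window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_REQUESTS:
        return False
    window.append(now)
    return True
```

A blocked IP becomes eligible again once its earlier requests age out of the window, so legitimate heavy users recover on their own.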
-
The real challenge is your sitemap. If all a scraper needs is the company names, then your sitemap alone gives them everything, and in that case there is simply nothing I can think of that you can do.
-
If the sitemap isn't a challenge, another idea is to present the data in a format that is not easy to machine-read. You can leave the description information in HTML but present the company name in Flash, for example.
Bottom line, if you want to rank well, the site has to be easy to crawl. If the crawl data offers enough information for others to steal, there is simply no reasonable method that can be used to prevent automated tools from grabbing it.
-
-
If you have links into all of these 25,000 pages, then people and robots will be able to find them.
If you want to keep robot scrapers out, you can use a whitelist of robots that lets the crawlers of search engines and other approved automated visitors in but instructs all others to keep out. No guarantee that they will not find a way in, but it might help. Human scraping will still get through.
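Since scrapers routinely fake the Googlebot user-agent, a whitelist like that is usually enforced with reverse-plus-forward DNS confirmation (the method Google documents for verifying a real Googlebot). A sketch, with the domain list illustrative and the resolver functions injectable so it can be tested without network access:

```python
import socket

# Hostname suffixes of crawlers we allow. Illustrative, not exhaustive.
ALLOWED_BOT_DOMAINS = (".googlebot.com", ".search.msn.com")

def is_verified_bot(ip, reverse_dns=None, forward_dns=None):
    """True only if `ip` reverse-resolves to an allowed crawler domain
    AND that hostname resolves back to the same IP (forward confirm).

    The resolver callables default to the socket module; they are
    parameters mainly so the logic can be exercised with stubs.
    """
    reverse_dns = reverse_dns or (lambda a: socket.gethostbyaddr(a)[0])
    forward_dns = forward_dns or (lambda h: socket.gethostbyname_ex(h)[2])
    try:
        host = reverse_dns(ip)
    except OSError:
        return False
    if not host or not host.endswith(ALLOWED_BOT_DOMAINS):
        return False
    # Forward-confirm: a spoofed PTR record fails this step.
    return ip in forward_dns(host)
```

Anything that fails the check gets treated as an ordinary visitor and falls under whatever page-view throttling you apply to humans.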
You could probably also devise a way to throttle the number of pageviews per visitor or per IP, but that would take some creative programming.