Something I'd never thought about: http://www.searchenginepeople.com/blog/anchor-links.html
So it looks like it has no impact, according to that article.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
An answer from one of the founders (no longer with the company), who I have on Skype. I stripped a couple of things out of the chat (my posts and a couple of irrelevant bits):
[16:08:31] A Founder: that's a difficult question to answer really
[16:08:43] A Founder: is it worth £80 a year ?
[16:08:51] A Founder: I would doubt it to be honest
[16:10:07] A Founder: apparently I cant reply anyway
[16:10:37] A Founder: I aint a pro
[16:11:09] Me: If you want to write it out I can reply for you
[16:12:48] A Founder: well if you want to tell him scoot also offer a free listing and I am happy to talk to him but cant as I aint a pro
[16:13:08] A Founder: and yes I was one of the 3 founders
[16:13:27] A Founder: ya they now bulk list you
[16:13:46] A Founder: 500+ directories, that to me sounds so spammy
[16:14:13] A Founder: I would claim the free listing and say thanks but no thanks to the rest
[16:17:36] A Founder: but feel free to post my thoughts if you want
Hi Aleyda,
The reason we want to merge to one domain is because the higher-ups want to, so that's what is happening.
Anyway, thank you for your in-depth reply. Unfortunately, although you suggested the site merge and HTTPS go live at the same time, development won't be able to achieve that.
I only have one question left, really: if a page doesn't have an equivalent on the other site, what does the rel alternate have to be? Is it left blank, or does it point to the main language URL?
Thanks,
Tom
It'll vary because of the crawlers: each company will have a different number of sites in their database, and a different process for getting their data live in their console. Honestly, I would take each with a grain of salt and just use them as a rough guide.
Answered my own questions:
https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt?csw=1#file-format
"A maximum file size may be enforced per crawler. Content which is after the maximum file size may be ignored. Google currently enforces a size limit of 500kb."
We're looking at potentially creating a robots.txt with 1,450 lines in it. This will remove 100k+ pages from the crawl, all of which are old pages (I know the ideal would be to delete/noindex them, but that's not viable, unfortunately).
Now the issue I'm thinking of is that a large robots.txt will either stop the robots.txt from being followed or slow our crawl rate down.
Does anybody have any experience with a robots.txt of that size?
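For scale, it's easy to sanity-check the file size against Google's documented 500 KB limit. A rough sketch in Python (the helper name and sample directives are hypothetical, just to show the arithmetic):

```python
# Hypothetical check: does a robots.txt file fit within Google's
# documented size limit? (Limit value per Google's robots.txt docs.)
GOOGLE_ROBOTS_LIMIT = 500 * 1024  # bytes

def robots_txt_size_ok(content: str, limit: int = GOOGLE_ROBOTS_LIMIT) -> bool:
    """Return True if the robots.txt content fits within the limit."""
    return len(content.encode("utf-8")) <= limit

# A 1,450-line file of short Disallow directives is nowhere near the limit:
sample = "\n".join("Disallow: /old-page-%04d/" % i for i in range(1450))
print(len(sample.encode("utf-8")), robots_txt_size_ok(sample))
```

At roughly 25 characters per line, 1,450 lines comes to well under 50 KB, so the size limit itself shouldn't be the problem.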
Hello,
I've never had to deal with an international site before, let alone a site merge.
These are two large sites; we've also got a few smaller old sites that currently redirect to the main (UK) site. We are looking at moving all the sites to the .com domain. We are also not currently using SSL on the main pages (we are on the checkout). We also have an m.domain.com site. Are there any good guides on what needs to be done?
My current strategy would be:
What's the best way of handling the domains and languages? We're currently using a .tv site for the UK and .com for the US.
I was thinking, and please correct me if I'm wrong, that we move the US site from domain.com to domain.com/us/ and the domain.tv site to domain.com/en/.
Would I then reference these by the following:
What would we then do with the canonicals? Would they just reference their "local" version?
Any advice or articles to read would really be appreciated.
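For what it's worth, a minimal hreflang sketch for the folder setup described above (the URLs are illustrative, and I'm assuming /us/ targets en-US and /en/ targets en-GB):

```html
<!-- Placed on https://domain.com/us/some-page/ and its /en/ counterpart alike -->
<link rel="alternate" hreflang="en-us" href="https://domain.com/us/some-page/" />
<link rel="alternate" hreflang="en-gb" href="https://domain.com/en/some-page/" />
<!-- Optional fallback for users matching neither locale -->
<link rel="alternate" hreflang="x-default" href="https://domain.com/en/some-page/" />
<!-- Each version's canonical points at itself, i.e. the "local" version -->
<link rel="canonical" href="https://domain.com/us/some-page/" />
```

Note the annotations are reciprocal: both pages carry the same set of alternate tags, and only the canonical differs per version.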
I don't think Martijn's statement is quite correct, as I've had a different experience in an accidental experiment. Crawling is not the same as indexing: Google will put pages it cannot crawl into the index, and they will stay there unless removed somehow. They will probably only show up for specific searches, though.
Completely agree. I have done the same for a website I'm working on; ideally we would noindex with meta robots, but that isn't possible. So instead we added the pages to robots.txt. The number of indexed pages has dropped, yet when you do an exact search it just says the description can't be reached.
So I was happy with the results, as they're now not ranking for the terms they were.
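To illustrate the distinction being discussed (the path is hypothetical): a robots.txt Disallow only blocks crawling, so an already-indexed URL can linger in the results with no snippet, exactly the "description can't be reached" behaviour above:

```
# robots.txt — blocks crawling, does NOT guarantee removal from the index
User-agent: *
Disallow: /old-section/
```

By contrast, a `<meta name="robots" content="noindex">` tag removes the page from the index, but it only works if the page stays crawlable so Google can see the tag, which is why combining both on the same URL is counterproductive.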
I would use Search Console's "Fetch as Google". It can take a few weeks for Google to crawl and update your site in the SERPs; I usually allow 4-6 weeks for any change on a small site, with larger sites normally being crawled more frequently. When you do cache:yourdomain.com in Google Chrome, is the time listed more recent than when you made the changes?
I have had something similar; this is the response I received:
You don’t have canonical tags on the URL and that’s expected.
On pages where BVSEO is implemented, canonical tags must be updated or removed when the product contains more than one page (more than eight) of reviews. BVSEO paginates the product page so all reviews are in the search engines’ index. Canonical tags that point away from a pagination URL will cause search engines to ignore the paginated content.
When any of the BVSEO pagination parameters are present (bvstate, bvrrp, bvqap, bvsyp, bvpage), do one of the following:
• Remove the canonical tag. This is the most common, recommended solution.
• Append the "name=value" pair to the canonical URL.
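As a concrete illustration of the second option (the domain and parameter value are hypothetical, just showing the shape):

```html
<!-- Pagination URL: https://example.com/product?bvstate=pg:2/ct:r -->
<!-- The canonical keeps the BVSEO parameter rather than pointing
     back at the bare product URL, so the paginated reviews stay indexable -->
<link rel="canonical" href="https://example.com/product?bvstate=pg:2/ct:r" />
```

Either way, the point is the same: a canonical pointing away from the pagination URL tells search engines to ignore the paginated reviews.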
We're looking at expanding our robots.txt, as we currently don't have the ability to noindex/nofollow. We're thinking about adding the following:
Then possibly:
What do you include?