Posts made by LoganRay
-
RE: Measuring the size of a competitors website?
I highly recommend buying a license for Screaming Frog. At $100/year, you won't find a more valuable SEO tool for the money, and you won't find a free (and trustworthy) tool that will crawl a site that large.
-
RE: Why can I not add Schema Mark up to my homepage?
Correction to my earlier statement: reviews/ratings only apply to products.
However, you can use Organization markup if you've got that info on the homepage. This schema generator will build the code you'll need: https://webcode.tools/microdata-generator/organization.
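To give a rough idea of what that generator outputs, a minimal Organization block in microdata might look like this (the name, URL, and phone number are placeholders):

<div itemscope itemtype="https://schema.org/Organization">
  <span itemprop="name">Example Company</span>
  <a itemprop="url" href="https://www.example.com">www.example.com</a>
  <span itemprop="telephone">+1-555-0100</span>
</div>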
-
RE: Why can I not add Schema Mark up to my homepage?
Do you have reviews on your homepage? On most sites, reviews are attached to specific products or services rather than the whole company. You would need an aggregate review score displayed on your homepage in order to mark it up with schema. Schema is meant to identify data types that are visible on the page; it's not like metadata, which runs in the background for search engines to see and not people.
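For example, if your homepage did display a visible aggregate score, the markup would wrap that visible text rather than hide it (the numbers here are made up):

<div itemscope itemtype="https://schema.org/Organization">
  <span itemprop="name">Example Company</span>
  <div itemprop="aggregateRating" itemscope itemtype="https://schema.org/AggregateRating">
    Rated <span itemprop="ratingValue">4.6</span>/5 based on <span itemprop="reviewCount">128</span> reviews
  </div>
</div>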
-
RE: Why can I not add Schema Mark up to my homepage?
Hi,
What kind of schema are we talking about here? There are tons of different data types you can mark up, most of which would be better suited for interior pages.
-
RE: What does Disallow: /french-wines/?* actually do - robots.txt
Disallow: /?* is the same thing as Disallow: /?. Since robots.txt rules match URL prefixes and the asterisk is a wildcard, the trailing * adds nothing; both of those disallows prevent any URL that begins with /? from being crawled.
And yes, it is incredibly easy to disallow the wrong thing! The robots.txt tester in Search Console (under the Crawl menu) is very helpful for figuring out what a disallow will catch and what it will let by. I highly recommend testing any new disallows there before releasing them into the wild.
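To make the prefix matching concrete (the URLs are hypothetical):

User-agent: *
Disallow: /?
# Blocked:     https://example.com/?sort=price        (path starts with /?)
# Not blocked: https://example.com/wines/?sort=price  (path starts with /wines/)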
-
RE: What does Disallow: /french-wines/?* actually do - robots.txt
Disallow: /*?
This disallow literally says to crawlers: 'if a URL starts with a slash (all URLs do) and has a parameter, don't crawl it'. The * is a wildcard, so anything between the / and the ? still matches the pattern.
It's very easy to disallow the wrong thing, especially where parameters are concerned, so I always do these two things instead of using robots.txt:
- Set the purpose of each parameter in Search Console - go to Crawl > URL Parameters to configure them for your site
- Self-referring canonicals - most people disallow URLs with parameters in robots.txt to prevent indexing, but a disallow only prevents crawling. A canonical on the parameterized URL pointing to its parameter-free version will prevent indexing of URLs with parameters (see the example below).
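For instance, a hypothetical URL like www.example.com/french-wines/?page=2 would carry this in its <head>:

<link rel="canonical" href="https://www.example.com/french-wines/" />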
Hope that's helpful!
-
RE: What does Disallow: /french-wines/?* actually do - robots.txt
Hi Luke,
You are correct that this was done to block URLs with parameters. However, since there's no wildcard (the asterisk) before the folder name, the URL would have to start with /french-wines/. This disallow is really only preventing crawling on the single URL www.yoursite.com/french-wines/ with any parameters appended.
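In other words (the parameter values are made up):

Disallow: /french-wines/?*
# Blocked:     /french-wines/?vintage=2015
# Not blocked: /wines/french-wines/?vintage=2015   (doesn't start with /french-wines/)
# Not blocked: /french-wines/bordeaux?vintage=2015 (no ? directly after the folder)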
-
RE: Should I use noindex or robots to remove pages from the Google index?
Rhys,
Your web dev team is confused. You cannot de-index pages simply by disallowing them in your robots.txt file. Google will still index anything it finds via a link (that doesn't have a noindex tag); that's why you often see search results with "A description for this result is not available because of this site's robots.txt" as the description.
Here's a quote from Google regarding the subject: "You should not use robots.txt as a means to hide your web pages from Google Search results." - https://support.google.com/webmasters/answer/6062608?hl=en
-
RE: Should I use noindex or robots to remove pages from the Google index?
Hi Tyler,
Yes, remove the robots.txt disallow for that section and add a noindex tag. Noindex is the only sure-fire way to de-index URLs, but the crawlers need to be allowed to crawl those pages to see the tag.
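For reference, the tag itself is just a meta element in the page's <head>, with an HTTP header equivalent for non-HTML files like PDFs:

<meta name="robots" content="noindex">
X-Robots-Tag: noindex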
-
RE: Taxonomy question - best approach for site structure
Honestly, search engines aren't that particular about URL structure. It matters, but not to the degree where one of these two examples is going to make or break your SEO campaign. That being said, I usually set up my URLs with the broadest category in the first folder and get more granular from there. In your first example, the assessment and treatment folders make more sense to me, since additional content could live in each of those respective folders. In your second example, there's less opportunity for future content to live in those folders.
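To illustrate with made-up paths (I don't know your exact URLs), broadest-category-first looks like:

/assessment/anxiety/
/assessment/depression/
/treatment/anxiety/

rather than condition-first paths like /anxiety/assessment/, where each condition folder only ever holds a page or two.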
-
RE: URL has caps, but canonical does not. Now what?
I've had some run-ins with case-sensitive URLs in the past, and they drive me crazy; I don't understand why CMSs still do that! While canonical tags are a perfectly fine way to handle this, there's a better solution: Brian Love wrote a great blog post on how to do server-side URL lower-casing. I've used it on a few sites and it works great.
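As a rough sketch of the idea (this is the common Apache mod_rewrite recipe, not necessarily the exact method from that post; IIS and Nginx have their own equivalents):

# In the server or vhost config - RewriteMap isn't allowed in .htaccess
RewriteEngine On
RewriteMap lowercase int:tolower

# If the requested path contains an uppercase letter, 301 it to the lowercased version
RewriteCond %{REQUEST_URI} [A-Z]
RewriteRule ^(.*)$ ${lowercase:$1} [R=301,L]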
-
Thoughts on RankScience?
I'm sure most of you have heard about this startup, RankScience, which has big ambitions to disrupt the SEO industry with its automated (I know, I know...the words 'automated' and 'SEO' in the same sentence!!!) optimization software. Their claim is that by running thousands of concurrent A/B tests on your site, they can maximize rankings and organic traffic.
Initially my thought was "oh crap, there goes my (and a lot of other people's) career". But then I thought about it a bit more and realized a couple of things. First, software can't replace a face-to-face client meeting; being in the agency world as most of us are, client interactions are vital to a sustained partnership. Second, someone is going to have to understand what this software does, configure it, and monitor it, and I'm okay with that being part of my job if that's how the industry shifts. Third, and most importantly, this software could in theory reverse engineer search algorithms. If they have data from 10,000 websites using their platform and are collecting data on what works and what doesn't, it's only a matter of time before they can pick the algorithm apart piece by piece and figure out exactly how it works. Google is obviously not going to like that very much and will almost certainly put a stop to it.
That's my 2 cents; I'm looking forward to hearing your thoughts on RankScience and the future of our industry.
-
RE: [Very Urgent] More 100 "/search/adult-site-keywords" Crawl errors under Search Console
Oh yeah, I missed that. That's very strange; not sure how to explain that one!