Hi there,
I'm using Firecheckout on a few projects, and it is really easy to use. (M 1.9.3.x)
Oh, sorry. Somehow I didn't get any notification on your reply.
For IIS you could go with the web.config of your website. The code will be something like:
<rule name="Force WWW and SSL" enabled="true" stopProcessing="true">
  <match url="(.*)" />
  <conditions logicalGrouping="MatchAny">
    <add input="{HTTP_HOST}" pattern="^www\." negate="true" />
    <add input="{HTTPS}" pattern="off" />
  </conditions>
  <action type="Redirect" url="https://www.domainname.com/{R:1}" appendQueryString="true" redirectType="Permanent" />
</rule>
Hi Sammy,
If I understand your question correctly, you need help with .htaccess code to force both https and www with the same rule? If so, this might be what you are looking for:
RewriteEngine On
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} !^www\.domainname\.com$ [NC]
RewriteRule ^(.*)$ https://www.domainname.com/$1 [L,R=301]
Hi there,
Will the URL structure remain the same? If so, in the .htaccess file of the subdomain, you should add the following after the RewriteEngine On line:
RewriteCond %{HTTP_HOST} ^shop\.domain\.co\.uk$ [NC]
RewriteRule (.*) https://www.domain.co.uk/$1 [R=301,L]
This should do the trick and redirect https://shop.domain.co.uk/product-category/great-merchandise/?product_order=desc to https://www.domain.co.uk/product-category/great-merchandise/?product_order=desc
I hope this helped
You have my details on my profile. After we resolve it, we should post the solution here without domain-specific information, so it helps others in the future (if you don't mind).
Hi there,
Probably what is happening is that your plugins are not optimized for redirects. You should address it from your .htaccess file (the plugin probably adds the redirects, but they are not optimized). If you can give me access, I can help you out.
Hi James,
So far as I can see you have the following architecture:
Since the listing page pagination is blocked in the robots.txt, only the first 15 job postings are available to crawl via a normal crawl.
I would say you should remove the blocking from the robots.txt and focus on implementing correct pagination. Which method you choose is your decision, but allow the crawler to access all of your job posts. Check https://yoast.com/pagination-seo-best-practices/
Another thing I would change: make the job post title the anchor text for the link to the job posting (currently every single job is linked with "Find out more").
Also, if possible, create a separate sitemap.xml for your job posts and submit it in Search Console; this way you can keep track of any indexation anomalies.
Last but not least, focus on the quality of your content (just as Matt proposed in the first answer).
Good luck!
In my experience, it will help the overall site, but still... do not expect a huge impact from these. URLs are shared, but I don't believe people will start to link to them except in private conversations.
This is a technical question that they need to tackle from the database side. It can be implemented, but it needs a few extra development hours, depending on the complexity of your website architecture, the CMS used, etc. Anyway, you are changing the URLs, so don't forget about the best practices for them. Good luck!
Hi there,
I believe the most logical implementation would be to use "noindex, follow" meta robots on these pages.
I wouldn't use canonical because it does not serve this purpose. Also, make sure these pages are not blocked via robots.txt.
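For reference, the tag in the <head> of those pages would look like this (a minimal sketch):
<meta name="robots" content="noindex, follow">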
Hi Rachel,
As I have mentioned in the previous question, the best case would be to translate them (especially if you are creating a group of redirects now), but if that won't be implemented, then regardless check for the following after implementation:
Btw, a must-read article when we are talking about hreflang: https://moz.com/blog/hreflang-behaviour-insights by Dave Sottimano.
The idea (which we both highlighted) is that blocking your listing page via robots.txt is wrong; for pagination you have several methods to choose from (how you deal with it really depends on the technical possibilities you have on the project).
Regarding James' original question, my feeling is that he is somehow blocking his job posting pages. Cutting off access to these pages makes it really hard for Google, or any other search engine, to index them. But without a URL in front of us, we cannot really answer his question; we can only create theories that he can test.
Hi Rachel,
Regarding the language code in the URL, you can leave it (page-uk.html, page-es.html, etc.), but maybe it would be an idea to have a translated page URL for each language. For example:
This would serve a little bit better than the previous version, where you would have:
Usually this means a wrongly coded page, which gives you a loop when crawling. If you can show the site itself, I will gladly help you find it.
If you cannot disclose the URL, you can do the following: run a crawl with a tool such as Screaming Frog, filter for these URLs, and check their inlinks and anchor texts specifically for this type of URL. When you find a pattern, you will find where the code is broken. Good luck!
Hi Chuck,
I have seen something similar on one of my previous projects, where outside domains had been redirected to the project itself. So spammycrap.tld/UbaOZ had a redirect to ourproject.tld/UbaOZ/, and this way it was appearing in our Search Console error list.
I discovered it by going through all the backlinks from Ahrefs and Majestic. If this is also your case, unfortunately you cannot do much about it, especially if you do not have any control over those domains. What I did was create a list of these URLs and redirect them to a page on our project that returned a 410 status code.
Hopefully, this approach helps you.
István
Hi James,
First of all, you need to categorize these 404 pages: some may come from website sections that were deleted in the past and haven't been addressed. Other 404 URLs could appear through domain redirects to your website (unfortunately, these are harder to find, process and resolve).
For the first category (when sections have been deleted, moved, etc.) you will have to ask yourself which is the correct way to resolve the issue. A 301 redirect to the most relevant URLs? Or just a 410, letting search engines know that your pages are deleted and the URLs should be removed from the index? Please don't start redirecting every single 404 URL to your homepage or some other irrelevant page, or you will be creating soft 404s.
Regarding the /insertgibberishurlhere type of URLs, you should check what kind of domains are redirected to your domain (I have seen domains that got this kind of 404 not found error via massive domain redirects to a project). If this is the case, you first need to ask yourself what you are redirecting to your website. If the domains are not in your hands, you could also redirect all of these to a URL on your website that returns a 410 status code.
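If it helps, here is a minimal .htaccess sketch of both options, assuming Apache with mod_alias (the paths and domain are placeholders):
# 301: a deleted page that has a close, relevant equivalent
Redirect 301 /old-section/old-page.html https://www.yourdomain.com/new-section/new-page/
# 410: a page that is gone for good and should drop out of the index
Redirect 410 /deleted-section/gone-page.html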
Oh, and the obvious one: crawl your website with tools such as Screaming Frog, and make sure you are not creating the 404 URLs yourself (I almost forgot about this one).
Let me know, if you have further questions.
Sorry Richard, but using noindex together with a canonical link is not really good practice.
It's an old entry, but still true: https://www.seroundtable.com/noindex-canonical-google-18274.html
Hi James,
Regarding the robots.txt syntax:
Disallow: /jobs/? basically blocks every single URL that starts with /jobs/?
For example: domain.com/jobs/?sort-by=... will be blocked.
If you want to disallow query parameters in URLs, the correct implementation would be Disallow: /jobs/*?, or you can even specify which query parameter you want to block, for example Disallow: /jobs/*?page=
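To illustrate the syntax (a minimal sketch with hypothetical paths):
User-agent: *
# Blocks only paginated variants, e.g. /jobs/?page=2 or /jobs/london/?page=3
Disallow: /jobs/*?page=
# Or, to block any URL under /jobs/ that carries a query string:
# Disallow: /jobs/*?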
My question to you: are these jobs linked from any other page and/or the sitemap? Or only from the listing page, which has its pagination, sorting, etc. blocked by robots.txt? If they are not linked, it could be a simple case of orphan pages, where the crawler basically cannot access the job posting pages because there is no actual link to them. I know it is an old rule, but it is still true: Crawl > Index > Rank.
BTW, I don't know why you would block your pagination. There are other, more optimal implementations.
And there is always the scenario that was already described by Matt. But I believe in that case you would have at least some of the pages indexed, even if they are not going to rank well.
Also, make sure other technical implementations are not stopping your job posting pages from being indexed.
I'll check in a little bit later, currently, I am getting a DNS error when trying to access it.
Hey,
Try Fetch and Render in Google Search Console. There you can check if there is an X-Robots-Tag in the response header.
For reference: https://developers.google.com/search/reference/robots_meta_tag
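You can also check the header directly from the command line (a quick sketch; the URL is a placeholder):
curl -I https://www.example.com/some-page/
# then look for a line like this in the output:
# X-Robots-Tag: noindex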
Hi Bharat,
Your website is currently unreachable.
Hi there,
Yes, there are several ways you could do that, but my question is whether it is worth it or not. If we are talking about a large website, you could have issues with Google's crawl budget: the crawlers would have to go through an additional 301 to land on your homepage.
Google describes their best practices about redirecting 404 pages here: https://support.google.com/webmasters/answer/181708?hl=en
In my opinion, the decision should be determined by the size of the website. If we are talking about a big website, maybe it would be more beneficial to follow Google's guidelines and implement a 410 status code. If the website is small, maybe you could redirect the users to the homepage and hope they are going to continue their journey on your website.
István
Let me know how it turns out. If the problem persists, I'm glad to help. Good luck!
Hey,
Can you point out an example URL? (If you don't want to disclose the website URL here, you can do it via a personal message.) This way we can debug an exact URL and not just a theory.
Regarding blocking via robots.txt: it is never a good idea to block a search engine from URLs you want to deindex. That way Google's crawlers won't fetch and process the pages, and your URLs will stay in the search index.
Just check: https://support.google.com/webmasters/answer/6062608?hl=en
"While Google won't crawl or index the content blocked by robots.txt
, we might still find and index a disallowed URL if it is linked from other places on the web. As a result, the URL address and, potentially, other publicly available information such as anchor text in links to the page can still appear in Google search results."
In the case of 301 redirects (make sure you are not using 302s), if the crawler can access the page, the old URL should be removed from the index.
Sorry, I forgot to detail that case :) Thanks for pointing it out.
Hi there,
Regarding website.com/category/product
What you have to take into consideration: if a product is placed in more than one category, then this product is going to be indexable on more than one URL path. (Like Gaston mentioned below, in this case you need to take care of the duplication, which you can do with canonical links or a redirect to one path.)
For example, let's say you have a product which is in cat1, subcat1 and cat2. This way you will have a minimum of 3 available paths to the product:
This means that at the product level you will have to deal with internal duplicate content. This is why I usually prefer to use the website.com/product URL path (IMO).
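As a rough illustration, the category-path versions of the product page could point a canonical to whichever URL you pick as the preferred one (hypothetical URLs, just a sketch):
<!-- on website.com/cat1/product-name and website.com/cat2/product-name -->
<link rel="canonical" href="https://website.com/product-name" />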
Regarding /category/subcategory/ vs /category-subcategory/, this is really a technical question. How "deep" is your website going to be? Do you want a flat architecture?
I usually prefer the /category/subcategory/ structure, because the content of an e-commerce website should be built up around content silos, where you move from the general towards the specific (regardless of whether you achieve this with subcategories or with filters within categories). I really hope this helps you answer your questions.
Hi there,
I would say, discuss it with your developer. It happened with one of our developers, who had been using our company's subdomain for a personal project's development/staging environment (without our knowledge) and forgot to noindex the dev site.
I found it out with a site: search.
As far as I can see, the links are not live anymore, but they are still indexed under the old.britishcarregistrations.co.uk subdomain.
Hi there,
Just check the answer from the following question: https://moz.com/community/q/unable-to-crawl-after-301-permanent-redirect-how-to-fix-this.
Jordan Railsback describes the issue, and how to fix it.
If you want to go with a free tool, you can also check Xenu Link Sleuth (http://home.snafu.de/tilman/xenulink.html).
Just make a full crawl with the tool and export the page map to a tab-separated file. Then you can open this file in Excel (or any similar software). It should do the job.
Hi!
I personally like to use Kraken.io. Check their pricing, but I'd say it is a low-cost and efficient way to handle the images.
We have been using it with their Magento extension, but they also have a plugin for WordPress (https://kraken.io/plugins).
I know it is not the only solution, but it has worked well for me.
Good luck!
Hi David,
There was a very good article about this topic back in 2014 (I know it sounds a little bit old, but it is still very descriptive): https://moz.com/blog/seo-guide-to-google-webmaster-recommendations-for-pagination
We also had a similar implementation, and I went with Option 3B from the article pointed out above: Option 3: Implement Pagination Relationships + noindex, follow directive after page 2.
So you want to have only the first page indexed; set the robots directive to "noindex, follow" on every page after the first. Hint: if you use /page/ in your URL structure (vs. a page query parameter), you can also use that to check whether a page needs to be noindexed, as it should only appear from page 2 onwards.
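For illustration, the head of a deeper page could look something like this (a sketch with hypothetical URLs):
<!-- e.g. on https://www.example.com/category/page/2/ -->
<link rel="prev" href="https://www.example.com/category/" />
<link rel="next" href="https://www.example.com/category/page/3/" />
<meta name="robots" content="noindex, follow">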
I hope it helps.
I'm glad I could help! Let me know if you hit any walls with the implementation.
Hi there,
Usually my advice is to add any custom code after the default WordPress rules, just to keep it more organised. It is very important not to add your rules inside the WP section (# BEGIN WordPress -> # END WordPress), because WordPress can overwrite that block.
Also, I usually add a comment before every rule group I create (just to keep it organised, and if anything goes wrong I know where I need to revert/adjust; check Search Console for anomalies after implementation). You can add comments by starting the line with a # sign.
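To illustrate the layout, the .htaccess could end up looking roughly like this (a sketch; the redirect itself is just a placeholder):
# BEGIN WordPress
# ... default WordPress rewrite rules, left untouched ...
# END WordPress

# Custom redirects: old blog URLs
Redirect 301 /old-page.html http://www.yourdomaingoeshere.com/new-page/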
I hope it helps.
Oh, and BTW, when using Redirect 301, you should use a relative path (starting with /) for the OLD URL and an absolute URL for the NEW one, so the lines that you provided need to contain the full URL for the new version:
Redirect 301 /old-relative-path.html http://www.yourdomaingoeshere.com/newurl/
Hi there,
From what you are describing, the first thought that comes to mind is a wrongly implemented relative URL.
What I would do in this case: run a full crawl of the website with Screaming Frog (you will need the paid version) and make a bulk export of the 404 inlinks via Bulk Export -> Response Codes -> Client Error (4xx) Inlinks. I would use that list to find a pattern in the anchor texts used to generate these kinds of URLs.
Once you have found a pattern, you can dig into the source code of the pages where the links come from.
If you don't have a Screaming Frog license, send me a PM with the website and I will run a quick crawl for you.
Istvan
Hi,
You could add the following code to your .htaccess to redirect all dated URLs to the non-dated version:
RedirectMatch 301 ^(.*)/([0-9]+)/([0-9]+)/([0-9]+)/(.*)$ http://www.domain.com$1/$5
Change domain.com to your domain name.
This should create a redirect from http://www.website.com/blog/2016/04/10/topic-on-how-to-optimise-blog to http://www.website.com/blog/topic-on-how-to-optimise-blog (and every similar situation).
Hey,
If you check today's whiteboard Friday with Dr. Pete (https://moz.com/blog/arent-301s-302s-canonicals-all-basically-the-same-whiteboard-friday), he mentions this case:
"Some types of 302s just don't make sense at all. So if you're migrating from non-secure to secure, from HTTP to HTTPS and you set up a 302, that's a signal that doesn't quite make sense. Why would you temporarily migrate?"
So answering your question: Google probably treated your initial http -> https redirects as 301s.
Hey there,
Try to avoid using both canonical and noindex; it is not advised (source: https://www.seroundtable.com/noindex-canonical-google-18274.html).
If you are using a canonical on the subdomain, it should be more than enough to deindex the subdomain version (if it gets indexed in any way) and resolve the duplicate content issue.
Greetings, Keszi
Hi Jose,
The canonical "Warning" is a notification. Tools cannot tell which is the original page, but can alert you that you have a canonical link on the specific URL.
With this report and a little Excel work you can double-check your canonical implementation.
Greetings, Keszi
I personally check webmasterworld.com; they have a Google Updates and SERP Changes thread for each month.
Hi Adrienne,
I would try to use Barracuda's tool: http://barracuda.digital/panguin-tool/
Sometimes it gives you a clue about when exactly the drop happened (and which update was near that date).
Also, you could check Moz's update history: https://moz.com/google-algorithm-change
These will help you if your website has been hit by an algorithmic update.
Let me know if you need further assistance.
Keszi
Hi Simon,
I will quote from: https://support.google.com/webmasters/answer/6066468?hl=en
OR
Note that the URL is not unreachable for all of Google; it is just unreachable for the Fetch as Google simulation tool.
Keszi
Hi there,
Redirects should always be written with a relative path for the old URL and an absolute URL for the new one.
So the .htaccess file for http://tshirts.com/ should contain something like:
Redirect 301 /blue.html http://www.mainsite.com/blue-t-shirts.html
Redirect 301 /white.html http://www.mainsite.com/white-t-shirts.html
Redirect 301 /black-tshirts.html http://www.mainsite.com/bk-t-shirts.html
Redirect 301 / http://www.mainsite.com/tshirts.html
I think that should solve the issue.
Keszi
Hi there!
First of all, I believe that you shouldn't use both canonical and rel=prev/next. The two techniques do not work well together: "In cases of paginated content, we recommend either a rel=canonical from component pages to a single-page version of the article, or to use rel="prev" and rel="next" pagination markup." (quoted from http://googlewebmastercentral.blogspot.com/2013/04/5-common-mistakes-with-relcanonical.html)
Basically you have several possibilities:
I think the best method for you would be to have a rel="prev/next" and have the canonical removed.
I hope this helps, Keszi
Hi,
Could you point out the website, or tell us what platform you are using? Maybe it would be easier to help.
When everything else fails, I do the XML sitemaps manually (Notepad++ and Excel). But Screaming Frog is also helpful.
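If you go the manual route, the structure itself is simple (a minimal sketch with placeholder URLs, following the sitemaps.org protocol):
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
  </url>
  <url>
    <loc>https://www.example.com/some-page/</loc>
  </url>
</urlset>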
Keszi
Hi,
Yesterday there was a Mozscape Index update. (https://moz.com/products/api/updates)
More than likely you are seeing the effect of that update. It is enough for your current linking domains to have their DA drop, and that can affect your DA value. But as John mentioned above, if the rankings have not been affected, I would not worry about it.
Keep up the good job!
Keszi