Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Disallow: /jobs/? is this stopping the SERPs from indexing job posts
-
Hi,
I was wondering what this would be used for as it's in the Robots.exe of a recruitment agency website that posts jobs. Should it be removed?Disallow: /jobs/?
Disallow: /jobs/page/*/Thanks in advance.
James -
Hi James,
So far as I can see you have the following architecture:
- job posting: https://www.pkeducation.co.uk/job/post-name/
- jobs listing page: https://www.pkeducation.co.uk/jobs/
Since from the robots.txt the listing page pagination is blocked, the crawler can access only the first 15 job postings are available to crawl via a normal crawl.
I would say, you should remove the blocking from the robots.txt and focus on implementing a correct pagination. *which method you choose is your decision, but allow the crawler to access all of your job posts. Check https://yoast.com/pagination-seo-best-practices/
Another thing I would change is to make the job post title an anchor text for the job posting. (every single job is linked with "Find out more").
Also if possible, create a separate sitemap.xml for your job posts and submit it in Search Console, this way you can keep track of any anomaly with indexation.
Last, and not least, focus on the quality of your content (just as Matt proposed in the first answer).
Good luck!
-
Hi Istvan,
Sorry I've been away for a while. Thanks for all of your advice guys.
Here is the url if that helps?
https://www.pkeducation.co.uk/jobs/
Cheers,
James
-
The idea is (which we both highlighted), that blocking your listing page from robots.txt is wrong, for pagination you have several methods to deal with (how you deal with it, it really depends on the technical possibilities that you have on the project).
Regarding James' original question, my feeling is, that he is somehow blocking their posting pages. Cutting the access to these pages makes it really hard for Google, or any other search engine to index it. But without a URL in front of us, we cannot really answer his question, we can only create theories that he can test
-
Ah yes when it's pointed out like that, it's a conflicting signal isn't It. Makes sense in theory, but if you're setting it to noindex and then passing that on via a canonical it's probably not the best is it.
They're was link out in that thread to a discussion of people who still do that with success, but after reading that I would just use noindex only as you said. (Still prefer the no index on the robots block though)
-
Sorry Richard, but using noindex with canonical link is not quite a good practice.
It's an old entry, but still true: https://www.seroundtable.com/noindex-canonical-google-18274.html
-
I don't think it should be blocked by robots.txt at all. It's stopping Google from crawling the site fully. And they may even treat it negatively as they've been really clamping down on blocking folders with robots.txt lately. I've seen sites with warning in search console for: Disallow: /wp-admin
You may want to consider just using a noindex tag on those pages instead. And then also use a canonical tag that points back to the main job category page. That way Google can crawl the pages and perhaps pass all the juice back to the main job category page via the canonical. Then just make sure those junk job pages aren't in the sitemap either.
-
Hi James,
Regarding the robots.txt syntax:
Disallow: /jobs/? which basically blocks every single URL that contains /jobs/**? **
For example: domain.com**/jobs/?**sort-by=... will be blocked
If you want to disallow query parameters from URL, the correct implementation would be Disallow: /jobs/*? or even specify which query parameter you want to block. For example Disallow: /jobs/*?page=
My question to you, if these jobs are linked from any other page and/or sitemap? Or only from the listing page, which has it's pagination, sorting, etc. is blocked by robots.txt? If they are not linked, it could be a simple case of orphan pages, where basically the crawler cannot access the job posting pages, because there is no actual link to it. I know it is an old rule, but it is still true: Crawl > Index > Rank.
BTW. I don't know why you would block your pagination. There are other optimal implementations.
And there is always the scenario, that was already described by Matt. But I believe in that case you would have at least some of the pages indexed even if they are not going to get ranked well.
Also, make sure other technical implementations are not stopping your job posting pages from being indexed.
-
I'd guess that the jobs get pulled from a job board. If this is the case, then the content ( job description, title etc.) will just be a duplication of the content that can be found in many other locations. If a plugin is used, they sometimes automatically add a disallow into the robots.txt file as to not hurt the parent version of the job page by creating thousands of duplicate content issues.
I'd recommend creating some really high-quality hub pages based on job type, or location and pulling the relevant jobs into that page, instead of trying to index and rank the actual job pages.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Should I disable the indexing of tags in Wordpress?
Hi, I have a client that is publishing 7 or 8 news articles and posts each month. I am optimising selected posts and I have found that they have been adding a lot of tags (almost like using hashtags) . There are currently 29 posts but already 55 tags, each of which has its own archive page, and all of which are added to the site map to be indexed (https://sykeshome.europe.sykes.com/sitemap_index.xml). I came across an article (https://crunchify.com/better-dont-use-wordpress-tags/) that suggested that tags add no value to SEO ranking, and as a consequence Wordpress tags should not be indexed or included in the sitemap. I haven't been able to find much more reliable information on this topic, so my question is - should I get rid of the tags from this website and make the focus pages, posts and categories (redirecting existing tag pages back to the site home page)? It is a relatively new websites and I am conscious of the fact that category and tag archive pages already substantially outnumber actual content pages (posts and news) - I guess this isn't optimal. I'd appreciate any advice. Thanks wMfojBf
Intermediate & Advanced SEO | | JCN-SBWD0 -
Trying to get Google to stop indexing an old site!
Howdy, I have a small dilemma. We built a new site for a client, but the old site is still ranking/indexed and we can't seem to get rid of it. We setup a 301 from the old site to the new one, as we have done many times before, but even though the old site is no longer live and the hosting package has been cancelled, the old site is still indexed. (The new site is at a completely different host.) We never had access to the old site, so we weren't able to request URL removal through GSC. Any guidance on how to get rid of the old site would be very appreciated. BTW, it's been about 60 days since we took these steps. Thanks, Kirk
Intermediate & Advanced SEO | | kbates0 -
Redirect wordpress from /%post_id%/%postname%/ to /blog/%postname%/
Hi what is the code to redirect wordpress blog from site.com/%post_id%/%postname%/ to site.com/blog/%postname%/ We are moving the site to a new server and new url structure. Thanks in advance
Intermediate & Advanced SEO | | Taiger0 -
Will disallowing URL's in the robots.txt file stop those URL's being indexed by Google
I found a lot of duplicate title tags showing in Google Webmaster Tools. When I visited the URL's that these duplicates belonged to, I found that they were just images from a gallery that we didn't particularly want Google to index. There is no benefit to the end user in these image pages being indexed in Google. Our developer has told us that these urls are created by a module and are not "real" pages in the CMS. They would like to add the following to our robots.txt file Disallow: /catalog/product/gallery/ QUESTION: If the these pages are already indexed by Google, will this adjustment to the robots.txt file help to remove the pages from the index? We don't want these pages to be found.
Intermediate & Advanced SEO | | andyheath0 -
Redirecting index.html to the root
Hi, I was wondering if there is a safe way to consolidate link juice on a single version of a home page. I find incoming links to my site that link to both mysite.com/ and mysite.com/index.html. I've decided to go with mysite.com/ as my main and only URL for the site and now I'd like to transfer all link juice from mysite.com/index.html to mysite.com/
Intermediate & Advanced SEO | | romanbond
When i tried 301 redirect from index.html to the root it created an indefinite loop, of course. I know I can use a RewriteRule.., but will it transfer the juice?? Please help!7 -
XML Sitemap index within a XML sitemaps index
We have a similar problem to http://www.seomoz.org/q/can-a-xml-sitemap-index-point-to-other-sitemaps-indexes Can a XML sitemap index point to other sitemaps indexes? According to the "Unique Doll Clothing" example on this link, it seems possible http://www.seomoz.org/blog/multiple-xml-sitemaps-increased-indexation-and-traffic Can someone share an XML Sitemap index within a XML sitemaps index example? We are looking for the format to implement the same on our website.
Intermediate & Advanced SEO | | Lakshdeep0 -
Article/ Blog Post submissions
Hello All, I'm looking to perform a 'Standard' guest blog post link building tactic, but i'm a little unsure as where to start. Does anybody have a list/ guide to websites that accept guest posts? Preferably ones that are useful for SEO purposes, I have been link building for about 3 months now, but to be honest, most of these links are NoFollow, which isn't too great! Paul
Intermediate & Advanced SEO | | Paul_Tovey0 -
De-indexed Link Directory
Howdy Guys, I'm currently working through our 4th reconsideration request and just have a couple of questions. Using Link Detox (www.linkresearchtools.com) new tool they have flagged up a 64 links that are Toxic and should be removed. After analysing them further alot / most of them are link directories that have now been de-indexed by Google. Do you think we should still ask for them to be removed or is this a pointless exercise as the links has already been removed because its been de-indexed. Would like your views on this guys.
Intermediate & Advanced SEO | | ScottBaxterWW0