Disallow: /jobs/? Is this stopping the SERPs from indexing job posts?
-
Hi,
I was wondering what these rules are used for, as they are in the robots.txt of a recruitment agency website that posts jobs. Should they be removed?
Disallow: /jobs/?
Disallow: /jobs/page/*/
Thanks in advance.
James
-
Hi James,
So far as I can see you have the following architecture:
- job posting: https://www.pkeducation.co.uk/job/post-name/
- jobs listing page: https://www.pkeducation.co.uk/jobs/
Since the robots.txt blocks the listing page's pagination, only the first 15 job postings are available to the crawler via a normal crawl.
I would say you should remove the block from the robots.txt and focus on implementing correct pagination. Which method you choose is your decision, but allow the crawler to access all of your job posts. Check https://yoast.com/pagination-seo-best-practices/
Another thing I would change is to make the job post title the anchor text for the job posting (at the moment every single job is linked with "Find out more").
Also, if possible, create a separate sitemap.xml for your job posts and submit it in Search Console; this way you can keep track of any indexation anomalies.
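For the separate job sitemap, here is a minimal sketch in Python of how such a file could be generated. It assumes you can already export the list of job post URLs from your CMS; the function name build_sitemap and the second URL are made up for illustration, and post-name is the placeholder slug used above.

```python
import xml.etree.ElementTree as ET

# Namespace required by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap.xml document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical input: in practice this list would come from the CMS.
job_urls = [
    "https://www.pkeducation.co.uk/job/post-name/",
    "https://www.pkeducation.co.uk/job/another-post-name/",  # made-up example
]

xml_text = build_sitemap(job_urls)
print(xml_text)
```

Keeping job posts in their own file means the indexed/submitted numbers Search Console reports for that sitemap relate to job posts only.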
Last but not least, focus on the quality of your content (just as Matt proposed in the first answer).
Good luck!
-
Hi Istvan,
Sorry I've been away for a while. Thanks for all of your advice guys.
Here is the URL, if that helps:
https://www.pkeducation.co.uk/jobs/
Cheers,
James
-
The idea (which we both highlighted) is that blocking your listing page in robots.txt is wrong. For pagination there are several methods to choose from; which one you use really depends on the technical possibilities you have on the project.
Regarding James' original question, my feeling is that he is somehow blocking the posting pages. Cutting off access to these pages makes it really hard for Google, or any other search engine, to index them. But without a URL in front of us we cannot really answer his question; we can only put forward theories that he can test.
-
Ah yes, when it's pointed out like that, it's a conflicting signal, isn't it? It makes sense in theory, but if you're setting a page to noindex and then passing that on via a canonical, it's probably not the best approach.
There was a link in that thread to a discussion of people who still do that with success, but after reading it I would just use noindex only, as you said. (I still prefer noindex over the robots.txt block, though.)
-
Sorry Richard, but using noindex together with a canonical link is not good practice.
It's an old entry, but still true: https://www.seroundtable.com/noindex-canonical-google-18274.html
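If you want to check whether a page sends this conflicting combination, a rough sketch in Python could look like the one below. It only uses the standard library; the class and function names are my own, the sample HTML is invented, and /jobs/page/2/ is just an assumed example of a paginated URL.

```python
from html.parser import HTMLParser


class IndexingSignals(HTMLParser):
    """Collect the meta robots directive and the canonical URL from a page."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.noindex = "noindex" in (attrs.get("content") or "").lower()
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")


def has_conflicting_signals(html, page_url):
    """True when a page is noindex AND its canonical points to another URL."""
    parser = IndexingSignals()
    parser.feed(html)
    return parser.noindex and parser.canonical not in (None, page_url)


# Invented sample markup for a paginated listing page.
sample = """<html><head>
<meta name="robots" content="noindex, follow">
<link rel="canonical" href="https://www.pkeducation.co.uk/jobs/">
</head><body></body></html>"""

print(has_conflicting_signals(sample, "https://www.pkeducation.co.uk/jobs/page/2/"))
```

A page that is noindex with a self-referencing canonical (or no canonical at all) is not flagged, since there the two signals do not contradict each other.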
-
I don't think it should be blocked by robots.txt at all. It's stopping Google from crawling the site fully, and they may even treat it negatively, as they've been really clamping down on blocking folders with robots.txt lately. I've seen sites with warnings in Search Console for: Disallow: /wp-admin
You may want to consider just using a noindex tag on those pages instead. And then also use a canonical tag that points back to the main job category page. That way Google can crawl the pages and perhaps pass all the juice back to the main job category page via the canonical. Then just make sure those junk job pages aren't in the sitemap either.
-
Hi James,
Regarding the robots.txt syntax:
Disallow: /jobs/? basically blocks every URL whose path and query string begin with /jobs/? (robots.txt rules are matched from the start of the path).
For example: domain.com/jobs/?sort-by=... will be blocked.
If you want to disallow query parameters on any URL under /jobs/, the correct implementation would be Disallow: /jobs/*? or you can even specify which query parameter you want to block, for example Disallow: /jobs/*?page=
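To see the difference between these rules, here is a simplified sketch in Python of Google-style rule matching (prefix match, * as a wildcard, $ as an end anchor). It is a hand-rolled matcher for illustration only: it ignores Allow rules, rule precedence and percent-encoding, and the sort-by=date value and the page number are assumed examples.

```python
import re


def rule_to_regex(rule):
    """Translate a robots.txt path rule into a compiled regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal parts; every '*' in the rule becomes "match anything".
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


def is_blocked(rule, path_and_query):
    """True if the Disallow rule matches the start of the path + query string."""
    return rule_to_regex(rule).match(path_and_query) is not None


RULES = ["/jobs/?", "/jobs/*?", "/jobs/page/*/"]

# Assumed sample URLs (path + query only).
SAMPLE_URLS = [
    "/jobs/",
    "/jobs/?sort-by=date",
    "/jobs/page/2/",
    "/jobs/page/2/?sort-by=date",
    "/job/post-name/",
]

for url in SAMPLE_URLS:
    matching = [rule for rule in RULES if is_blocked(rule, url)]
    print(url, "->", matching if matching else "not blocked")
```

Under these assumptions, /jobs/? only catches query strings directly on the listing page, while /jobs/*? also catches them on deeper URLs such as paginated pages. Neither rule touches the /job/post-name/ pages themselves.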
My question to you: are these jobs linked from any other page and/or sitemap? Or only from the listing page, whose pagination, sorting, etc. are blocked by robots.txt? If they are not linked, it could be a simple case of orphan pages, where the crawler cannot access the job posting pages because there is no actual link to them. I know it is an old rule, but it is still true: Crawl > Index > Rank.
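The orphan-page check can be sketched roughly like this in Python. Everything here is hypothetical: the example.com URLs, the find_orphans helper and the very crude is_blocked test stand in for a real crawl export and a real robots.txt evaluation.

```python
def find_orphans(known_urls, links_by_page, blocked):
    """Return known URLs that no crawlable page links to."""
    reachable = set()
    for page, links in links_by_page.items():
        # Links on a blocked page are never seen by the crawler.
        if not blocked(page):
            reachable.update(links)
    return sorted(set(known_urls) - reachable)


def is_blocked(url):
    """Crude stand-in for the two Disallow rules from the question."""
    return "/jobs/?" in url or "/jobs/page/" in url


# Hypothetical crawl data: which page links to which job posts.
links_by_page = {
    "https://www.example.com/jobs/": ["https://www.example.com/job/a/"],
    "https://www.example.com/jobs/page/2/": ["https://www.example.com/job/b/"],
}
known_jobs = [
    "https://www.example.com/job/a/",
    "https://www.example.com/job/b/",
]

orphans = find_orphans(known_jobs, links_by_page, is_blocked)
print(orphans)
```

In this made-up data set, the job that is only linked from the blocked page 2 ends up as an orphan, which is the scenario described above.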
BTW, I don't know why you would block your pagination. There are better implementations.
And there is always the scenario that was already described by Matt. But I believe in that case you would have at least some of the pages indexed, even if they are not going to rank well.
Also, make sure other technical implementations are not stopping your job posting pages from being indexed.
-
I'd guess that the jobs get pulled from a job board. If this is the case, then the content (job description, title, etc.) will just be a duplicate of content that can be found in many other locations. If a plugin is used, it sometimes automatically adds a disallow to the robots.txt file so as not to hurt the parent version of the job page by creating thousands of duplicate content issues.
I'd recommend creating some really high-quality hub pages based on job type, or location and pulling the relevant jobs into that page, instead of trying to index and rank the actual job pages.