Blocked by meta-robots but there is no robots file
-
OK, I'm a little frustred here. I've waited a week for the next weekly index to take place after changing the privacy setting in a wordpress website so Google can index, but I still got the same problem. Blocked by meta-robots, no index, no follow. But I do not see a robot file anywhere and the privacy setting in this Wordpress site is set to allow search engines to index this site. Website is www.marketalert.ca
What am I missing here? Why can't I index the rest of the website and is there a faster way to test this rather than wait another week just to find out it didn't work again?
-
The .htaccess file is in placing directing www to non www, so I don't see what else I could do with that. I forgot to mention the website was recently overhauled by someone else, and they are having me help with SEO. Not sure if that has anything to do with it. It looks like the .htaccess should be reversed so the non www points to the www which has more value. Someone else designed this site and they are having me do the SEO on it for them.
-
The issue might be the forwarding from www.yourdomain.ca to yourdomain.ca
look at http://www.opensiteexplorer.org/pages?site=marketalert.ca%2F
and here http://www.opensiteexplorer.org/pages?site=www.marketalert.ca%2F
..some are indexed on with www and other without www. , this is your main issue.
recommendation:
- revisit the htaccess file or where the redirect has been set DNS..
- choose one with www or without and stick to it.
- revicit your external links and make the changes to your links
- create new sitemap and resubmit to SearchEngines
-
I ran the SEO web crawler and it finished already. Successfully crawled all pages. I still have to wait for another week to get the main campaign updated and see results there, but I believe it may work too now.
I guess I solved my own problem after being directed to robots.txt by Jim. I found that the Wordpress plugin for SEO xml sitemap creator was the problem because it created a virtual robots.txt file which sent me on a wild goose chase looking for a robots.txt file which didn't exist. Creating a robots.txt file allowing all seems to be the solultion, incase anyone else has this same problem.
-
If you can, follow up either way - happy to help you get it debugged!
-
I was able to update my sitemap.xml with Google webmaster tools no problem. I'm not 100% confident though that means the entire site is searchable by the spiders. I guess I'll know for sure in a few days tops.
-
I agree with Jim. Update your sitemap.xml files with Google Webmaster Tools. That will also help you identify problems you might be missing.
-
I've done some more looking into it and seems to be a problem when Wordpress uses the XML site generator plugin. It creates a virtual robot.txt file, which is why I couldn't find the robot.txt file. Apparently the only fix is to replace it with an actual robot.txt file forcing it to allow all.
I just replaced the robots.txt file with a real one allowing all. SEOmoz estimates a few days to test site crawl and it's another 7 days before the next scheduled crawl. I'd kinda like to find out sooner if it's not going to work. There must be a faster test. I don't need a detailed test, just a basic test that says, YEP, we can see this many pages or something like that.
-
hi
your robots.txt file is located here http://marketalert.ca/robots.txt, which is the root of your website directory.
this is the actual location of your sitemap file (http://marketalert.ca/sitemap.xml), does the Google WT show any issues about the sitemap file could not be found?
You might need to resubmit the sitemap file, if there are any changes, of course with the updated version of your site.
hope this helps.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Question about Image Optimization and File Size -- Does it really matter
I was using Moz's guidelines (https://moz.com/learn/seo/page-speed) to reduce the file size of my pages to improve load speed, but I'm not sure it really makes much of a difference. On this page https://www.mtecorp.com/cad/MAPP0006A002/, the file size is 710KB and it's Google Page Insight Score is 84 on desktop and 66 on mobile. On this page https://www.mtecorp.com/cad/SWNW0130E/, I got the image size to 227KB and its Google Page insight Score is virtually the same, 87 on desktop and 62 on mobile. Any ideas if it is really worth the time to get images down? (or maybe it doesn't matter if it is less than 1,000KB.
Technical SEO | | EricVallee1 -
Robots.txt Tester - syntax not understood
I've looked in the robots.txt Tester and I can see 3 warnings: There is a 'syntax not understood' warning for each of these. XML Sitemaps:
Technical SEO | | JamesHancocks1
https://www.pkeducation.co.uk/post-sitemap.xml
https://www.pkeducation.co.uk/sitemap_index.xml How do I fix or reformat these to remove the warnings? Many thanks in advance.
Jim0 -
Problems with Meta Title on Bing
On the Bing search engine, it isn't showing the actual meta title we have for a website. It's showing something different. However, the correct meta title is showing on the Google search engine. Has anyone had the same issue? Has anyone been able to fix this issue? Thanks for your help!
Technical SEO | | Harrison.Stickboy0 -
When do I change out my meta tags after a full website revamp?
We're creating a new version of our entire website - look and feel is completely different, though core functionality and results are the same. Just cleaner, faster.. etc. We're doing a temp redirect to the temporary url for testing and to slow roll the release to some of our users for a more friendly approach. Eventually, the new look and feel will be under the original url. I've researched best practices for the site transfer, including "make sure the meta tags for title and description are exactly the same". The concern I have is that Moz Analytics is detecting a lot of errors in the existing meta tags. They're too long, have changed and become inconsistent after being passed through different hands, & some have some keyword stuffing in there. I have plans to change them out and really clean them up... I'm just wondering, when is the best time to do that? Since the tags are bad, should I just do it now but make sure that the old and new are matching? Or should I wait (and for how long?) after the new site is switched over and everything is on the original URL?
Technical SEO | | SFMoz0 -
Blocking https from being crawled
I have an ecommerce site where https is being crawled for some pages. Wondering if the below solution will fix the issue www.example.com will be my domain In the nav there is a login page www.example.com/login which is redirecting to the https://www.example.com/login If I just disallowed /login in the robots file wouldn't it not follow the redirect and index that stuff? The redirect part is what I am questioning.
Technical SEO | | Sean_Dawes0 -
How to block google robots from a subdomain
I have a subdomain that lets me preview the changes I put on my site. The live site URL is www.site.com, working preview version is www.site.edit.com The contents on both are almost identical I want to block the preview version (www.site.edit.com) from Google Robots, so that they don't penalize me for duplicated content. Is it the right way to do it: User-Agent: * Disallow: .edit.com/*
Technical SEO | | Alexey_mindvalley0 -
Should I set up a disallow in the robots.txt for catalog search results?
When the crawl diagnostics came back for my site its showing around 3,000 pages of duplicate content. Almost all of them are of the catalog search results page. I also did a site search on Google and they have most of the results pages in their index too. I think I should just disallow the bots in the /catalogsearch/ sub folder, but I'm not sure if this will have any negative effect?
Technical SEO | | JordanJudson0 -
Site links -> anchor text and blocking
1.Does anyone know where google pulls that anchor text for the organic site links? -- Is there a way to manipulate the anchor text of the sitelinks to get our more important pages to stick out more (capitalization, punctuation etc.) 2. If i block a few of my sitelinks from showing will goolge replace it with a new sitelink or will i be left with fewer? Thanks! Srs
Technical SEO | | Morris770