Robots.txt
-
My campaign hse24 (www.hse24.de) is not being crawled any more ...
Could this be a problem with the robots.txt?
I always thought that Google and friends were interpreting the file correctly, given that the site was being crawled until last week.
Thanks a lot
Bernd
NB: Here is the robots.txt:
User-Agent: *
Disallow: /

User-agent: Googlebot
User-agent: Googlebot-Image
User-agent: Googlebot-Mobile
User-agent: MSNBot
User-agent: Slurp
User-agent: yahoo-mmcrawler
User-agent: psbot
Disallow: /is-bin/
Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-DE-Site/de_DE/-/EUR/hse24_Storefront-Start
Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-AT-Site/de_DE/-/EUR/hse24_Storefront-Start
Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-CH-Site/de_DE/-/CHF/hse24_Storefront-Start
Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-DE-Site/de_DE/-/EUR/hse24_DisplayProductInformation-Start
Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-AT-Site/de_DE/-/EUR/hse24_DisplayProductInformation-Start
Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-CH-Site/de_DE/-/CHF/hse24_DisplayProductInformation-Start
Allow: /is-bin/intershop.static/WFS/HSE24-Site/-/Editions/
Allow: /is-bin/intershop.static/WFS/HSE24-Site/-/Editions/Root%20Edition/units/HSE24/Beratung/
-
Hallo Bernd,
Of course, I agree with everyone else that you need to fix your robots.txt file.
However, I'd also suggest that you set up Google Webmaster Tools for your site. It will inform you about crawl errors and problems with your robots.txt file, and might be helpful for you in future.
Also, whilst having a quick look at your site, I noticed some duplicate page title issues. Make sure you are tracking your site with SEOmoz's campaign tool. It will really help you find these types of issues.
Viel Glück!
-
Yep, you just made your site invisible! >.<
Personally, I just disallow the areas I don't want indexed and let all bots crawl the rest.
User-Agent: *
Disallow: /whatever I don't want indexed
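As a rough illustration of that approach, here is a small sketch that uses Python's standard urllib.robotparser module to check a rule set that blocks only one area. The /private/ path, the example.com URLs, and the is_allowed helper are hypothetical placeholders, not taken from hse24.de, and Python's parser is not guaranteed to match how any particular search engine reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block one directory, leave everything else crawlable.
SELECTIVE_RULES = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(rules: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Both URLs are made-up examples.
    for url in ("http://example.com/products/", "http://example.com/private/report.html"):
        print(url, is_allowed(SELECTIVE_RULES, "Googlebot", url))
```

Running a check like this before uploading a changed file is a cheap way to catch a rule that blocks more than intended.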
-
User-Agent: *
Disallow: /
That is blocking every bot from crawling anything.
User-Agent: * = every robot
Disallow: / = every directory
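To see the effect of those two lines on their own, here is a minimal sketch using Python's standard urllib.robotparser. The bot names ExampleBot and AnotherBot are made up for illustration, and the result reflects how Python's parser reads this two-line group in isolation, which may differ in detail from a real search engine's parser.

```python
from urllib.robotparser import RobotFileParser

# Only the catch-all group, exactly as quoted above.
BLOCK_ALL = ["User-agent: *", "Disallow: /"]

parser = RobotFileParser()
parser.parse(BLOCK_ALL)

# Hypothetical crawler names; neither has a group of its own,
# so both fall back to the "*" group.
results = {
    bot: parser.can_fetch(bot, "http://www.hse24.de/")
    for bot in ("ExampleBot", "AnotherBot")
}

if __name__ == "__main__":
    for bot, allowed in results.items():
        print(bot, "allowed" if allowed else "blocked")
```

Any crawler that has no group of its own in the file falls under the * group, so with Disallow: / it may not fetch anything.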