Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Regex in Disavow Files?
-
Hi,
Will regex work in a disavow file?
If I include website.com/*, will that work, or would you recommend just website.com?
Thanks.
-
Hi Fubra,
You can disavow at the domain level, so no regex is required (and I don't think it will work, since the file only takes one full URL or one domain per line).
Just add "domain:" before the domain, e.g. domain:spammysite.com
Marie Haynes wrote a good guide to using the disavow tool here if you need any further information: https://moz.com/blog/guide-to-googles-disavow-tool
Cheers,
David
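For anyone arriving with a list of URLs or wildcard-style patterns like website.com/*, here is a minimal Python sketch of the domain-level approach described above. It assumes the input is a plain list of strings; the function name `to_disavow_lines` is hypothetical and not part of any Google or Moz tool. The host is kept exactly as given, so www.website.com and website.com would produce separate lines.

```python
from urllib.parse import urlsplit


def to_disavow_lines(entries):
    """Turn URLs or wildcard-style patterns into unique 'domain:' lines."""
    seen = set()
    lines = []
    for entry in entries:
        entry = entry.strip()
        # Skip blank lines and '#' comment lines.
        if not entry or entry.startswith("#"):
            continue
        # Accept entries that are already in 'domain:' form.
        if entry.startswith("domain:"):
            entry = entry[len("domain:"):]
        # urlsplit only recognises the host when a scheme or '//' is present.
        if "://" not in entry:
            entry = "//" + entry
        host = urlsplit(entry).hostname
        if not host:
            continue
        line = "domain:" + host
        # Keep the first occurrence of each domain, in input order.
        if line not in seen:
            seen.add(line)
            lines.append(line)
    return lines


print("\n".join(to_disavow_lines(["website.com/*", "http://spammysite.com/page.html"])))
```

Run as written, this prints domain:website.com and domain:spammysite.com on separate lines, which can be saved to a plain text file for upload.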