Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Tool to search relative vs absolute internal links
-
I'm preparing for a site migration from a .co.uk to a .com and I want to ensure all internal links are updated to point to the new primary domain.
What tool can I use to check internal links as some are relative and others are absolute so I need to update them all to relative.
-
Thanks for the replies, I ended up getting a techie to run a script through the site for me which gave me all the info I needed. None of the tools mentioned did exactly what I was looking for.
-
That tool that Matt mentioned looked interesting, but it would have been painful to have to go through your site one page at a time.
As usual for crawling tasks like this, the paid version of Screaming Frog will do what you want. You can tell it to crawl your site looking for **href="yoursite.com **to find all occurrences of absolute internal links. You'd have to do a bit of regex magic to get it to find the relative links, but since by their nature a relative link will work even with the domain change, not sure why you'd be looking for those.
Or you could just do a find and replace of the URL string using something like phpMyAdmin directly in your database. That would be fastest as it would find & replace in one go, instead of having to manually edit each page.
Is this a WordPress site, there's a plugin specifically for finding and automatically updating these links. (It basically automates and puts a UI on the phpMyAdmin process mentioned above.)
Any of those ideas help?
Paul
-
Any chance anyone knows any other tools I can use to crawl a site and give me a report of absolute and relative internal links?
-
Thanks for the reply although I've checked that add-on and it's not available for download anymore. Any chance you can send me the local files? I've mailed the admin but haven't got a reply yet.
Unless anyone knows of any other tools?
-
I'll give you the best answer I can but at least consider the possibility that absolute URLs are actually better long term. Other than moving a site around as you're doing now, absolute URLs win on every factor.
That said, you're looking for FireLinkReport.
http://www.searchenginejournal.com/firelinkreport-research-on-page-links-firefox/17714/
It's a FFox add on that does internal vs. external, absolute vs. relative, etc. and this should create a report that helps you do what you need.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Googlebot and other spiders are searching for odd links in our website trying to understand why, and what to do about it.
I recently began work on an existing Wordpress website that was revamped about 3 months ago. https://thedoctorwithin.com. I'm a bit new to Wordpress, so I thought I should reach out to some of the experts in the community.Checking ‘Not found’ Crawl Errors in Google Search Console, I notice many irrelevant links that are not present in the website, nor the database, as near as I can tell. When checking the source of these irrelevant links, I notice they’re all generated from various pages in the site, as well as non-existing pages, allegedly in the site, even though these pages have never existed. For instance: https://thedoctorwithin.com/category/seminars/newsletters/page/7/newsletters/page/3/feedback-and-testimonials/ allegedly linked from: https://thedoctorwithin.com/category/seminars/newsletters/page/7/newsletters/page/3/ (doesn’t exist) In other cases, these goofy URLs are even linked from the sitemap. BTW - all the URLs in the sitemap are valid URLs. Currently, the site has a flat structure. Nearly all the content is merely URL/content/ without further breakdown (or subdirectories). Previous site versions had a more varied page organization, but what I'm seeing doesn't seem to reflect the current page organization, nor the previous page organization. Had a similar issue, due to use of Divi's search feature. Ended up with some pretty deep non-existent links branching off of /search/, such as: https://thedoctorwithin.com/search/newsletters/page/2/feedback-and-testimonials/feedback-and-testimonials/online-continuing-education/consultations/ allegedly linked from: https://thedoctorwithin.com/search/newsletters/page/2/feedback-and-testimonials/feedback-and-testimonials/online-continuing-education/ (doesn't exist). I blocked the /search/ branches via robots.txt. No real loss, since neither /search/ nor any of its subdirectories are valid. There are numerous pre-existing categories and tags on the site. The categories and tags aren't used as pages. I suspect Google, (and other engines,) might be creating arbitrary paths from these. Looking through the site’s 404 errors, I’m seeing the same behavior from Bing, Moz and other spiders, as well. I suppose I could use Search Console to remove URL/category/ and URL/tag/. I suppose I could do the same, in regards to other legitimate spiders / search engines. Perhaps it would be better to use Mod Rewrite to lead spiders to pages that actually do exist. Looking forward to suggestions about best way to deal with these errant searches. Also curious to learn about why these are occurring. Thank you.
Technical SEO | | linkjuiced0 -
Image Search
Hello Community, I have been reading and researching about image search and trying to find patterns within the results but unfortunately I could not get to a conclusion on 2 matters. Hopefully this community would have the answers I am searching for. 1) Watermarked Images (To remove or not to remove watermark from photos) I see a lot of confusion on this subject and am pretty much confused myself. Although it might be true that watermarked photos do not cause a punishment, it sure does not seem to help. At least in my industry and on a bunch of different random queries I have made, watermarked images are hard to come by on Google's images results. Usually the first results do not have any watermarks. I have read online that Google takes into account user behavior and most users prefer images with no watermark. But again, it is something "I have read online" so I don't have any proof. I would love to have further clarification and, if possible, a definite guide on how to improve my image results. 2) Multiple nested folders (Folder depth) Due to speed concerns our tech guys are using 1 image per folder and created a convoluted folder structure where the photos are actually 9 levels deep. Most of our competition and many small Wordpress blogs outrank us on Google images and on ALL INSTANCES I have checked, their photos are 3, 4 or 5 levels deep. Never inside 9 nested folders.
Technical SEO | | Koki.Mourao
So... A) Should I consider removing the watermark - which is not that intrusive but is visible?
B) Should I try to simplify the folder structure for my photos? Thank you0 -
Is there a tool to see all redirects?
I'm thinking this is a silly question, but I've never had to deal with it I thought I'd ask. Ok is there a tool out there that will show all the redirects to a domain. I'm working on a project that I keep stumbling on urls that redirect to the site I'm studying. They don't show up in Open Site or ahrefs as linking domains, but they keep popping up on me. Any thoughts?
Technical SEO | | BCutrer0 -
Links from Instructables.com?
This is a silly newbie question. But will posting on www.instructables.com with some valuable content and url link back to my site help with "linking"? Or do they put a no-follow on all links on their site? Thanks for answering! Ron
Technical SEO | | yatesandcojewelers0 -
Value of an embedded site vs. a direct link?
We have a new site that is a great resource for a serious subject (suicide). I have been getting many requests from various communities and clinics about help on embedding our site in their websites. Although I certainly don't want to keep this resource from being used as much as possible, I am curious about the SEO costs/benefit to having someone embed our site on their own website rather than provide a link to our website directly from theirs.
Technical SEO | | ron_adease1 -
Internal search : rel=canonical vs noindex vs robots.txt
Hi everyone, I have a website with a lot of internal search results pages indexed. I'm not asking if they should be indexed or not, I know they should not according to Google's guidelines. And they make a bunch of duplicated pages so I want to solve this problem. The thing is, if I noindex them, the site is gonna lose a non-negligible chunk of traffic : nearly 13% according to google analytics !!! I thought of blocking them in robots.txt. This solution would not keep them out of the index. But the pages appearing in GG SERPS would then look empty (no title, no description), thus their CTR would plummet and I would lose a bit of traffic too... The last idea I had was to use a rel=canonical tag pointing to the original search page (that is empty, without results), but it would probably have the same effect as noindexing them, wouldn't it ? (never tried so I'm not sure of this) Of course I did some research on the subject, but each of my finding recommanded one of the 3 methods only ! One even recommanded noindex+robots.txt block which is stupid because the noindex would then be useless... Is there somebody who can tell me which option is the best to keep this traffic ? Thanks a million
Technical SEO | | JohannCR0 -
Is link cloaking bad?
I have a couple of affiliate gaming sites and have been cloaking the links, the reason I do this is to stop have so many external links on my sites. In the robot.txt I tell the bots not to index my cloaked links. Is this bad, or doesnt it really matter? Thanks for your help.
Technical SEO | | jwdesign0