Can Search Engines Read "Incorrect" URLs?
-
I know that ideally a URL should look something like domain.com/topic. But if the URL contains additional characters, for example domain.com/topic?keyword, can search engines still understand the complete words in the URL, even with the additional "incorrect" characters? Or do they stop "reading" once they hit odd characters?
Thanks!
-
A few other things to note about parameters in URLs:
- In Google Webmaster Tools and Bing Webmaster Tools, you can instruct the search engines to ignore certain parameters, so that they'll treat domain.com/topic?keyword and domain.com/topic as the same page (if ?keyword doesn't change the page content).
- You can also place a rel=canonical link element on pages. For example, domain.com/topic?keyword could declare domain.com/topic as its canonical URL, so that its PageRank is passed along to that page.
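
As a rough illustration of the two points above, here is a minimal Python sketch that drops parameters which don't change the page content and builds the matching rel=canonical tag. The IGNORED_PARAMS set and both function names are made up for this example; in practice the setting lives in the webmaster tools or in your page templates.

```python
import html
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of parameters that do not change the page content.
IGNORED_PARAMS = {"keyword", "utm_source"}

def canonical_url(url):
    """Return the URL with the ignorable query parameters removed."""
    parts = urlsplit(url)
    # keep_blank_values=True keeps bare parameters such as "?keyword".
    kept = [(name, value)
            for name, value in parse_qsl(parts.query, keep_blank_values=True)
            if name not in IGNORED_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

def canonical_link_tag(url):
    """Build the <link rel="canonical"> tag for the page's <head>."""
    return '<link rel="canonical" href="{}">'.format(html.escape(canonical_url(url)))

print(canonical_link_tag("http://domain.com/topic?keyword"))
```

Parameters that do change the content (a hypothetical page=2, say) are left in place, so only true duplicates collapse onto one URL.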
-
Search engines will read all your parameters unless you tell Google, via Webmaster Tools, which parameters to ignore. This can cause an issue with a URL like domain.com/topic?keyword&somefield: pages that include the keyword plus other parameters will share the link juice. So if somefield has 10 options, each indexed page gets roughly 1/10 of the value.
So it is better to use rewrites to include your keyword in the URL, and then mark the parameters as not to be indexed in Google etc.
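
To make the 1/10 arithmetic concrete, here is a small Python sketch. It assumes the simplified model from this answer, where value is split evenly across every indexed variant of a page; real search engines are not known to work this simply, and the function names are invented for the example.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def group_by_page(urls):
    """Group URL variants by scheme + host + path, ignoring the query string."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        groups["{}://{}{}".format(parts.scheme, parts.netloc, parts.path)].append(url)
    return dict(groups)

def even_share(urls):
    """Share of value per variant, assuming an even split (a simplification)."""
    return {page: 1 / len(variants)
            for page, variants in group_by_page(urls).items()}

# Ten variants of the same page, one per option of the hypothetical somefield.
variants = ["http://domain.com/topic?keyword&somefield={}".format(i) for i in range(10)]
print(even_share(variants))
```

Running the same grouping over a crawl export is a quick way to spot which pages have the most parameter variants.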
-
Search engines can read most characters in a URL string, but & specifically tends to separate variables passed to a script, and those typically don't carry much valuable information about what a page is about. Sometimes those variables hold the topic or category in a shopping cart, so I have to imagine that information could be taken into account, but for long URLs like the following it is hard to believe everything is factored into the URL's relevance to the keyword: http://www.google.com/search?q=long+url+string&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
Search engines index the whole URL, and if it contains keyword-rich content that can definitely help, both from having the keyword bolded in the snippet (CTR WIN!) and from a possible bump in the page's relevance to the keyword.
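
For what it's worth, pulling readable words out of a URL like that is not hard, which is why "?" and "&" are unlikely to stop anything from reading it. Below is a minimal Python sketch of one way to do it; this is only an illustration (the url_words name is made up), not a description of how any search engine actually tokenizes URLs.

```python
import re
from urllib.parse import urlsplit, parse_qsl

def url_words(url):
    """Split a URL's host, path and query string into lowercase words."""
    parts = urlsplit(url)
    chunks = [parts.netloc, parts.path]
    # parse_qsl decodes "+" and %XX escapes; bare parameters are kept too.
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        chunks.extend([name, value])
    words = []
    for chunk in chunks:
        # Any run of non-alphanumeric characters acts as a word boundary.
        words.extend(w.lower() for w in re.split(r"[^A-Za-z0-9]+", chunk) if w)
    return words

print(url_words("http://domain.com/topic?keyword"))
print(url_words("http://www.google.com/search?q=long+url+string&ie=utf-8"))
```

The query words come out alongside a lot of noise (ie, utf, 8 and so on), which fits the point above that not everything in a long URL is likely to count toward relevance.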
-
In general, search engines are able to identify keywords in the URL even if they appear, for example, as a parameter following a "?" or another non-alphanumeric character. They might not treat that as strong a signal as a keyword in the file name, subdomain, or domain name, though. Hope that answers your question.