Drupal Question
-
On our site we have a plugin for our fan gallery. The issue is that I'm getting a lot of duplicate content errors and warnings that the URL is too long, and all of the errors are coming from the fan gallery, which has over 8,000 of them. The plugin seems to be generating a long query-string URL of more than 100 characters. You can't actually see it on the site, but the crawlers can.
Anyway, I'm trying to figure out a fix for this. One method would be to just stop those pages from being crawled, but I would hate to do that, as the fan gallery could be a great source of links and content for us.
So I'm wondering if anyone else has had issues with these types of plugins, where the user can upload a photo or embed a video and it gets submitted to the site.
If you have a better method, please let me know. I usually work on e-commerce platforms, so my experience with Drupal is limited.
-
Well, we're using Drupal too and had the same problem. We fixed it by making a custom view with some custom paths that were shorter, for example:
Previously: http://www.domain.com/news/typeofnews/paperback/issue-20-august-2011/itemtitle1/
(as you can imagine, the titles could get quite long)
Now: http://www.domain.com/news/20-09-2012/title
Maybe this is possible for you as well?
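If you're on Drupal 7, the Pathauto module is the usual way to generate short aliases like that automatically. Just as a sketch (the tokens assume the standard Token module; swap in whatever fits your content type), a pattern along these lines produces date-based paths like my example:
news/[node:created:custom:d-m-Y]/[node:title]
Pathauto can also be set to truncate and clean up the title part, which keeps the overall URL length down.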
-
Hi Kate,
Looking at the URL string there I see both Pressflow and Pantheon variables being passed. It looks like the platform is in the way. I would suggest sending an email to the folks at Pantheon and/or Pressflow to get some help. I'm not sure what your technical expertise is, but Pressflow is a flavor of Drupal and Pantheon is a hosting service for Drupal. They appear to be adding variables to the URL, which probably isn't necessary.
Just my guess.
John
-
Thanks,
Subfolders. I have a few URLs with over 100 characters because of how things are named, but then it also pulls up this really long query string, like this:
URL/welcome-new-raywjcom?PRESSFLOW_SETTINGS=%7B%22conf%22%3A%7B%22pressflow_smart_start%22%3Atrue%2C%22pantheon_binding%22%3A%22e92472919be14d0b93b8d8ccd2e6b8c1%22%2C%22pantheon_site_uuid%22%3A%22da9acf76-5d3a-4fab-8c70-bb1e73cbe931%22%2C%22pantheon_environment%22%
and that's only a portion of it; it keeps going on and on after that. Which is why I was thinking of just blocking it for now.
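If I do end up blocking it, my thinking is to only disallow the query-string versions so the clean gallery pages stay crawlable. A rough sketch of what I'd add to robots.txt (assuming the parameter is always PRESSFLOW_SETTINGS, which is what I'm seeing in the reports):
User-agent: *
Disallow: /*?PRESSFLOW_SETTINGS=
As far as I know, Googlebot and Bingbot honour the * wildcard even though it isn't in the original robots.txt spec, so please correct me if that pattern is off.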
-
Is it the actual query string that is over 100 characters, or is it a long URL with lots of subfolders that is making it so long? If it's the latter, then maybe you should start over and put the gallery closer to the root.
If that can't be done, I would look into other plugins; Drupal can be buggy as heck with plugins, and the issue might be isolated to that one. I would try to fix the plugin itself, and I wouldn't consider blocking those pages as more than a temporary option to protect the site while this is being looked at.
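One more option while you dig into the plugin: if those parameterized URLs render the same page as the clean ones, a canonical tag pointing at the clean URL should consolidate the duplicates without blocking anything. Just as a sketch (the path is a placeholder, and I'm assuming something like the Metatag module or a small theme tweak would output it):
<link rel="canonical" href="http://www.yoursite.com/fan-gallery/item-title" />
That way the gallery pages can still be indexed and pass links, while the query-string variants stop counting as duplicates.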
Related Questions
-
URL Question: Is there any value for ecomm sites in having a reverse "breadcrumb" in the URL?
Wondering if there is any value for e-commerce sites in featuring a reverse breadcrumb-like structure in the URL? For example: https://www.grainger.com/category/anchor-bolts/anchors/fasteners/ecatalog/N-8j5?ssf=3&ssf=3 where we have a reverse categorization happening, with /level2-sub-cat/level1-sub-cat/category in the reverse order of the actual location on the site. Category: Fasteners
Sub-Cat (level 1): Anchors
Sub-Cat (level 2): Anchor Bolts
Technical SEO | | ROI_DNA
-
Migration to New Domain - 301 Redirect Questions
My client is migrating their site to a new domain. I just did a big redesign, including a URL structure change, with 301s from the old URLs to the new URLs. Now they want a new name, so we're moving forward with a new domain name. However, we're going to keep the site on the current domain while we ease customers into the new name. During that time, I'm going to be building links to the new domain name and 301 redirecting that new domain to the current one. Then, once we migrate the site to the new domain name, I'll redirect the current domain name to the new one. So, my question(s) is/are: Is the above process the best way to use 301 redirects to build links to the new domain while we transition everything? Should I (or can I) do three redirects, from the oldest URLs to the current URLs and then to the new URLs? General question... I can't seem to find this anywhere online, but what is the best practice for the order URLs should appear in within the htaccess file? Thanks!
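For context, the final domain-to-domain hop I have in mind looks roughly like this in .htaccess (old-domain and new-domain are placeholders, and I'd welcome corrections):
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?old-domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.new-domain.com/$1 [R=301,L]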
Technical SEO | | Kenny-King
-
Google PR Rank Question(s)
Hi
My Google PR rank is still 1/10 (www.abouttownmarketing.com) after 12 weeks of daily SEO work building what I thought were quality backlinks. Does anyone know how often Google updates its PR rank? Also, is it a linear measure or, like the AdWords Quality Score, is it exponential?
Many thanks
Damien
Technical SEO | | damientown
-
Is there an easy solution for duplicate page content on a Drupal CMS?
I have a Drupal 7 site, www.australiacounselling.com.au, that has over 5,000 crawl errors (!). The main problem, accounting for close to 3,000 errors, is duplicate page content. When I create a page I can create an SEO-friendly URL alias for it, but every time I do this it registers as two pages with the same content. Is there a module I could install that would let me specify the canonical page? My developers seem stumped and have given up trying to find a solution, but I'm not convinced it should be that hard. Any ideas from those familiar with Drupal 7 would be greatly appreciated!
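In case it helps: what I think I'm after is something that outputs a canonical tag on the aliased version of each page, roughly like this (the alias is just a placeholder), whether that comes from a module such as Metatag or Global Redirect or from a template change:
<link rel="canonical" href="http://www.australiacounselling.com.au/example-page-alias" />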
Technical SEO | | ClintonP
-
Robots.txt Question
In the past, I had blocked a section of my site (i.e. domain.com/store/) by placing the following in my robots.txt file: "Disallow: /store/" Now, I would like the store to be indexed and included in the search results. I have removed the "Disallow: /store/" from the robots.txt file, but approximately one week later a Google search for the URL produces the following meta description in the search results: "A description for this result is not available because of this site's robots.txt – learn more" Is there anything else I need to do to speed up the process of getting this section of the site indexed?
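I'm also considering adding a Sitemap line to robots.txt and resubmitting the sitemap in Webmaster Tools, in case that nudges a recrawl (the sitemap URL below is just an assumption on my part):
Sitemap: http://domain.com/sitemap.xml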
Technical SEO | | davidangotti
-
Google Knowledge Graph related question
I have a client who is facing age discrimination in the film industry. (Big surprise there.) The problem is, when you type in his name, Google's new Knowledge Graph displays a brief bio about him to the right of the search results. This bio snippet includes his year of birth. Wikipedia is credited as the source for the bio information, and yet his Wikipedia entry doesn't include his age or birth date. Neither does his IMDb bio. So the question is: how can he figure out where Google is getting that birthdate from? He wants to try to remove it, not falsify it. Thanks for any help you can offer.
Technical SEO | | JamesAMartin
-
Sub-domains for keyword targeting? (specific example question)
Hey everyone, I have a question I believe is interesting and may help others as well. Our competitor makes heavy use of sub-domains (over 100-200 of them) to rank in the search engines... and is doing quite well. What's strange, however, is that all of these sub-domains are just archives -- they're 100% duplicate content! An example can be seen here, where they just have a bunch of relevant posts archived with excerpts. How is this ranking so well? Many of them are top 5 for keywords in the 100k+ range. In fact, their #1 source of traffic is SEO for many of the pages. As an added question: is this effective if you were to actually have a quality, non-duplicate page? Thanks! Loving this community.
Technical SEO | | naturalsociety
-
Robots.txt questions...
All,
My site is rather complicated, but I will try to break down my question as simply as possible. I have a robots.txt document in the root level of my site to disallow robot access to /_system/, my CMS. This looks like this:
# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism
User-agent: *
Disallow: /_system/
I have another robots.txt file in another level down, which is my holiday database - www.mysite.com/holiday-database/ - this is to disallow access to /holiday-database/ControlPanel/, my database CMS. This looks like this:
User-agent: *
Disallow: /ControlPanel/
Am I correct in thinking that this file must also be in the root level, and not in the /holiday-database/ level? If so, should my new robots.txt file look like this:
# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism
User-agent: *
Disallow: /_system/
Disallow: /holiday-database/ControlPanel/
Or like this:
# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism
User-agent: *
Disallow: /_system/
Disallow: /ControlPanel/
Thanks in advance.
Matt
Technical SEO | | Horizon