Drupal Question
-
On our site we have a plugin for our fan gallery. The issue is that I am getting a lot of duplication errors and "URL too long" warnings, and all of them are coming from the Fan Gallery, which has over 8,000 errors. It seems to be pulling a long query-string URL that has over 100 characters. You can't see it on the site itself, but the crawlers can.
Anyway, I'm trying to figure out a fix for this. One method would be to just stop those pages from being crawled, but I would hate to do that, as the fan gallery would be a great source of links and content for us.
So I'm wondering if anyone else has had an issue with these types of plugins before where the user can upload a photo or do a video embed and then it submits to the site.
If you have a better method, please let me know. I usually work on e-commerce platforms, so my experience with Drupal is limited.
-
Well, we're using Drupal too and had the same problem. We fixed it by making a custom view with custom paths that were shorter, for example:
Previously: http://www.domain.com/news/typeofnews/paperback/issue-20-august-2011/itemtitle1/
(as you can imagine, the titles could be long)
Now: http://www.domain.com/news/20-09-2012/title
Maybe this is possible for you as well?
-
Hi Kate,
Looking at the URL string there I see both Pressflow and Pantheon variables being passed. It looks like the platform is in the way. I would suggest sending an email to the folks at Pantheon and/or Pressflow to get some help. I'm not sure what your technical expertise is, but Pressflow is a flavor of Drupal and Pantheon is a hosting service for Drupal. They appear to be adding variables to the URL, which probably isn't necessary.
Just my guess.
John
-
Thanks,
Subfolders. I have a few URLs with over 100 characters because of how things are named, but then it also pulls up a really long query string like this:
URL/welcome-new-raywjcom?PRESSFLOW_SETTINGS=%7B%22conf%22%3A%7B%22pressflow_smart_start%22%3Atrue%2C%22pantheon_binding%22%3A%22e92472919be14d0b93b8d8ccd2e6b8c1%22%2C%22pantheon_site_uuid%22%3A%22da9acf76-5d3a-4fab-8c70-bb1e73cbe931%22%2C%22pantheon_environment%22%
And that's only a portion of it; it keeps going on and on after that. That's why I was thinking of just blocking it for now.
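For anyone comparing crawl reports, here is a minimal Python sketch of stripping that parameter to see which clean URL each long one collapses to. It assumes PRESSFLOW_SETTINGS is the only parameter that needs dropping (it is the only one shown above); the function name, the example domain, and the shortened parameter value are all made up for illustration. It does not fix whatever is appending the parameter in the first place.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: only PRESSFLOW_SETTINGS needs to go; add other names if they show up.
UNWANTED_PARAMS = {"PRESSFLOW_SETTINGS"}

def canonical_url(url: str) -> str:
    """Return the URL with the unwanted query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in UNWANTED_PARAMS
    ]
    # An empty query string is dropped entirely, so no trailing "?" is left behind.
    return urlunsplit(parts._replace(query=urlencode(kept)))

# Hypothetical URL shaped like the one above (domain and value are invented).
example = "http://www.example.com/welcome?PRESSFLOW_SETTINGS=%7B%22conf%22%3A%7B%7D%7D&page=2"
print(canonical_url(example))
```

Running every URL from the crawl export through a function like this and counting the distinct results would show how many real pages sit behind the 8,000 errors.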
-
Is it the actual query that is over 100 characters, or is it a long URL with lots of subfolders that is causing it to be so long? If the latter, then maybe you should try to start over and put the gallery closer to the root.
If this can't be done, I would look into other plugins. Drupal can be buggy as heck with plugins, and the issue might be isolated to that one plugin. I would try to fix that, and I wouldn't consider blocking those pages as more than a temporary option to protect the site while this is being looked at.
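To answer the path-versus-query question across a whole crawl export rather than by eye, a rough Python sketch along these lines might help. The 100-character limit comes from the thread; the function name, labels, and example URLs are hypothetical.

```python
LIMIT = 100  # character threshold mentioned in the thread

def diagnose(url: str, limit: int = LIMIT) -> str:
    """Say whether a URL is too long because of its path or its query string."""
    if len(url) <= limit:
        return "ok"
    # Everything before the first "?" is the scheme, host, and subfolders.
    base = url.split("?", 1)[0]
    return "long path" if len(base) > limit else "long query"

# Hypothetical URLs for illustration only.
print(diagnose("http://www.example.com/gallery/photo-1"))
print(diagnose("http://www.example.com/" + "subfolder/" * 12))
print(diagnose("http://www.example.com/gallery?x=" + "1" * 120))
```

If nearly everything comes back as "long query", moving the gallery closer to the root won't help much, and the parameter being appended is the thing to chase.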