
Not Taking Friends With You – or how Facebook and Other Social Networks Ignore Redirects


From a developer's perspective, Facebook really p**ses me off. It seems that things change quite a bit, and as my time isn't focused 100% on Facebook stuff (I would go so far as to say that I'm a casual developer these days), I get surprised when something changes – I mean, how hard can it be to email registered Facebook developers (or just those that have created apps) to keep us informed as things change?

Anyhow, the above was just a little rant to give you a feel for my mood when it comes to Facebook. It is, however, a different Facebook annoyance that I'd like to forewarn people about, as well as to ask Facebook whether they would consider a change that I (and probably many others before me) propose.

Shares Do Not Come Across in a Site Migration

I recently worked on one of my weekend projects (a website) where, due to a change in the site's navigation, a large number of pages changed their URLs to reflect the new structure. Naturally, it goes without saying that I set up the 301 redirects – job done, I thought.
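For completeness, here is a minimal sketch of the sort of thing I mean. The old and new paths and the domain are hypothetical, and in practice you would more likely do this via your web server's rewrite rules or your CMS rather than hand-rolled PHP:

<?php
// Minimal sketch: permanently redirect a handful of legacy URLs to their new
// equivalents. The paths and domain here are hypothetical examples.
$redirects = array(
    '/projects/old-section/page-one.htm' => '/portfolio/page-one.htm',
    '/projects/old-section/page-two.htm' => '/portfolio/page-two.htm',
);

$requested = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);

if (isset($redirects[$requested])) {
    // 301 tells browsers and crawlers that the move is permanent.
    header('Location: http://www.example.com' . $redirects[$requested], true, 301);
    exit;
}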

"Not so fast" is what I should have told myself, but that is the power of hindsight, as they say. What is lost in such a transition, where the URL changes, are the likes, tweets, and shares in general. This is particularly frustrating when you only realise it some time later, as I did.

The current solution to this problem is that you need to keep a note of that historical URL and provide the share buttons with this information instead of the current URL. Annoyingly, as in my situation, you'll need to tie this logic to the date on which the URL change was made. This means you may have older pages which collate share counts against the legacy URL, while newer pages, which use the same layout/templates, collate share counts against the new URL.

You can provide the Twitter Tweet button with a separate URL against which the tweet count is aggregated. This is the value of the data-counturl attribute, which is useful although it doesn't truly address my issue (keep reading).
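As an illustration, this is roughly what that looks like in a template. The URLs are hypothetical; the point is that data-url is the page being shared while data-counturl is the URL the count is read from:

<?php
// Hypothetical example: the page now lives at the new URL, but the tweet count
// was historically accumulated against the legacy URL.
$currentUrl = 'http://www.example.com/portfolio/page-one.htm';
$legacyUrl  = 'http://www.example.com/projects/old-section/page-one.htm';
?>
<a href="https://twitter.com/share" class="twitter-share-button"
   data-url="<?php echo $currentUrl; ?>"
   data-counturl="<?php echo $legacyUrl; ?>">Tweet</a>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>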

Share buttons from Facebook, LinkedIn, and Xing all provide a means to specify a single URL to which the share counts will be added. Therefore, the solution here is simply to use that legacy URL again instead of the current or new one.
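To make that concrete, here is a rough sketch of what pointing the Facebook and LinkedIn buttons at the legacy URL looks like at the time of writing. The URL is hypothetical, the networks' JavaScript SDK snippets are assumed to be loaded elsewhere on the page, and Xing offers an equivalent parameter on its button:

<?php
// Hypothetical legacy URL that the share counts were originally accumulated against.
$legacyUrl = 'http://www.example.com/projects/old-section/page-one.htm';
?>
<!-- Facebook Like/Share: point data-href at the legacy URL -->
<div class="fb-like" data-href="<?php echo $legacyUrl; ?>" data-layout="button_count"></div>

<!-- LinkedIn share: point data-url at the legacy URL -->
<script type="IN/Share" data-url="<?php echo $legacyUrl; ?>" data-counter="right"></script>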

At the time of writing, this solution has worked for all but Facebook – I've implemented the change but am still waiting to see whether those historical counts come back. I'm rapidly losing hope, however, as Facebook seem to sometimes pick things up but often drop the ball.

Doesn’t Feel Right

Either way, the solution to this problem simply doesn't sit right with me. I was relatively fortunate in that, for the site in question, I was using a CMS that allowed me to define a date for when the URL change occurred and then, using a little PHP logic, determine which pages should use the legacy URL for sharing and which should simply use the current one. Many people do not have this luxury, especially if they are using hosted services such as wordpress.com and then change to using their own domain, etc. (disclaimer – this may actually be a bad example; if anyone offers a service that tracks this domain change and ensures social sharing plugins honour the legacy URL, I can imagine it would be the WordPress guys).
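To show what I mean by "a little PHP logic", the decision boils down to something like the following sketch. The cut-over date, helper function, and URL shapes are all hypothetical – the real implementation depends entirely on your CMS:

<?php
// Sketch of the date-based decision described above: pages that existed before
// the navigation change keep sharing against their legacy URL, while newer
// pages simply use the current URL. All names, dates, and URL shapes here are
// hypothetical.

define('URL_CHANGE_DATE', strtotime('2012-06-01'));

function share_url($pagePublished, $currentUrl, $legacyUrl)
{
    if ($pagePublished < URL_CHANGE_DATE && $legacyUrl !== null) {
        // Shares were accumulated against the old URL, so keep using it.
        return $legacyUrl;
    }
    // The page has only ever existed at the current URL.
    return $currentUrl;
}

// Usage within a page template:
$urlForSharing = share_url(
    strtotime('2011-11-20'),                                    // page publish date
    'http://www.example.com/portfolio/page-one.htm',            // current URL
    'http://www.example.com/projects/old-section/page-one.htm'  // legacy URL
);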

Praise to Google+

Now, I'm no particular 'fanboy' of any device or service (aside: it is actually the fanboy culture around MacBook Pros that has put me off buying what is clearly a very good product), but in this instance Google+ didn't need any corrective action – it just worked after the change. How refreshing, I thought, but why only Google?

Let us imagine the scenario where a user clicks on the relevant share button on a given page which formerly had another URL. At that point in time, the sharing service has no way of knowing that the new URL being shared is related to the older one and that it should therefore aggregate the share counts. This leads to the 'solution' above, where we as site owners have to provide this information ourselves: either as an explicit link between the legacy and new URLs, as in the case of the Tweet button where you can provide both the data-url and data-counturl attributes, or by simply continuing to use the legacy URL regardless, as per the other examples above.

That may all seem fair enough, as those poor (read ironically as 'huge and powerful monoliths') Social Networks simply don't have access to any other information to tell them otherwise unless we spoon-feed it to them… Or do they?

To Index or Not Index – That is the Question

If Google are the only ones where no corrective action was needed, then it would make sense to assume that this is because they've put the two things together: they know about a page that appears to be at a fresh URL for which they have not previously accumulated any shares, but they also continuously index the web's content and so discover the legacy URL, which has the 301 redirect set up to the new URL, creating the connection. Therefore, although not immediately, Google will piece things together and aggregate the count onto one object, presumably associated with the new URL.

The question therefore is, if Google can do it, why can’t the others? You would think that indexing all this socially shared content is definitely in their interests.

Twitter, LinkedIn, and Xing have a slightly different model, and I could understand it if they didn't index this content, although I would also be surprised if they truly didn't.

Focusing on Facebook specifically, you are able to search public objects in the Open Graph, which implies that they already index the content. In fact, when you check your legacy URL using the Facebook Open Graph Debugger, it clearly shows that the redirect is followed.
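You can see this for yourself by asking the Graph API about a URL. The example URL below is hypothetical, and the exact fields returned may well vary, but at the time of writing a query along these lines returns the object Facebook holds against the URL, including a share count where one exists:

<?php
// Ask the Graph API what Facebook holds against a given URL. The URL is
// hypothetical; at the time of writing the response typically includes an 'id'
// and, where the page has been shared, a 'shares' count, though the exact
// shape of the response may vary.
$url      = 'http://www.example.com/projects/old-section/page-one.htm';
$response = file_get_contents('https://graph.facebook.com/?id=' . urlencode($url));
$object   = json_decode($response, true);

print_r($object);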

Given all this, I would invite Facebook to comment on why they don’t periodically index in the same vein as Google – it would save Open Graph users a whole lot of hassle.

At the very least, if this post doesn't get any of these services to change, then please don't fall into the same trap as I did – proactively plan for this if social shares are important to your site.

Canonical URLs and SEO

As I recently made a foolish mistake, I thought I would share it to help others avoid it in the future. It was to do with my quest to get certain pages of the Solution Exchange Community platform indexed by Google, Bing, Yahoo, etc. – specifically, the valuable forum threads.

First of all, it is worth mentioning how these threads are delivered.  The forum itself is an object of the OpenText Social Communities (OTSC) product, which interacts with the Delivery Server through the OTSC XML API.

Therefore, the forum thread pages are delivered dynamically: the shell of the page is the same physical page, with the content driven by parameters. In this case, I've chosen to use sensible URL structures that contain those parameters, for simplification and SEO – I say more about this in this forum post. The use of rewrite rules in this way for SEO is one of the key values of a Front Controlling Web Server, as sketched below.
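To illustrate the pattern (this is a hypothetical PHP sketch, not the actual OTSC/Delivery Server implementation), every request for a friendly URL such as /forum/thread/{ID} is routed to the one physical page, which then pulls the thread ID out of the path and uses it to fetch the content that fills the page shell:

<?php
// Hypothetical front-controller sketch of the same pattern: requests for
// /forum/thread/{ID} all land on this one page, and the thread ID is
// extracted from the path.
$path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);

if (preg_match('#^/forum/thread/(\d+)$#', $path, $matches)) {
    $threadId = (int) $matches[1];
    // ...fetch the thread via the relevant API and render it into the shell...
}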

As the shell of the page is the same, I initially had the same <title> tag for all threads and thought that this was the problem. After changing the <title> value to reflect the title of each forum thread (and waiting for re-indexing to happen), there was no change.

Finally, by checking the index of Solution Exchange on Bing with a "site:" search, I noticed to my surprise that one of the threads was indexed, but associated with the URL http://www.solutionexchange.info/forum.htm! This was strange because, externally, the forum thread was only accessible through a URL like http://www.solutionexchange.info/forum/thread/{ID}, meaning that I must have been explicitly telling the search engines the wrong URL.

This was the clue I needed to realise that my problem was due to something I had implemented many months before.

To address the potential SEO penalty arising from the fact that the home page of the community could be reached through both http://www.solutionexchange.info/ and http://www.solutionexchange.info/index.htm, I had introduced the following HTML header link tag – the example below is the home page value, but I included this across the whole site:

<link rel="canonical" href="http://www.solutionexchange.info/index.htm" />

You can read more about this on the Official Google Webmaster Central Blog. In summary, it tells the search engines that this page is to be associated with the given URL, and that page ranking (or "Google juice") should accrue to that URL rather than the entry URL that the crawler bot used. This avoids the page ranking for the same page being split across two or more URLs, or the site being penalised for duplicating content across multiple URLs.

With this knowledge, I was able to update the page template that houses this dynamic content to form the correct URL within the canonical link. Now it's back to the waiting game to see whether the indexes will pick up the content and forgive me for presenting different pages as one.
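In template terms, the fix boils down to something like the following sketch. This is hypothetical PHP rather than the real Delivery Server template, but the principle is the same: build the canonical URL from the current thread ID rather than hard-coding a single value shared by every thread:

<?php
// Hypothetical sketch: emit a canonical URL derived from the current thread ID
// instead of one fixed value for all threads.
$canonical = 'http://www.solutionexchange.info/forum/thread/' . (int) $threadId;
?>
<link rel="canonical" href="<?php echo htmlspecialchars($canonical); ?>" />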

Although a small detail, the potential gain is huge, as it opens up the rich content that continues to grow within the forum for discovery via the big search engines. This in turn will help those in the wider community who are not aware of Solution Exchange to discover the content, which may help them resolve an issue or encourage them to take part in the community platform going forward.

As always, leave a comment or get in touch if you have any questions.

New Wave of Thinking

As Google unleashes Wave to the development community, it knows that the secret to its success is crowdsourcing.

It was great to see the presentation of Wave at Google I/O and, in particular, how inclusive those behind the project have been towards developers. Opening the doors early to those who can extend and enhance the vanilla offering in ways that Google themselves will not have thought of, ready for a proposed launch later this year, is simply fantastic. Unlike players such as Facebook, who have retrospectively added to their API, Google have set about this project in a truly open way, which I think will reap great rewards as Wave becomes a central way to manage social content.

The presentation itself is impressive for the way it neatly sews together the various offerings that have made Google so strong in the past few years – whether that be on-the-fly translation when chatting in real time or the impressive real-time search result updates of previous conversations. However, this is only a starting point, as Google themselves have built these features on top of the foundation not through bespoke APIs but through the public, open APIs whose details they have released.

Someone once said something that was very poignant for me: "Never think you know more than your audience". This has stuck with me and it rings true in many different contexts. With the openness of the Wave platform, Google have nailed this, proactively understanding that adoption will rely on the innovation locked away within the development community. Many companies lack the knowledge to unlock this potential within their respective communities, and some are simply ignorant, but I believe Google will show how this consideration leads to an explosion of success around the platform. Enterprise software vendors should pay attention, at the very least for the example that will be played out before them – utilise your communities! Vendors should also investigate innovative ways in which they can benefit from the Wave foundation, as the potential will differ greatly from organisation to organisation.

I for one will be keen to explore the Sandbox that Google have running already and can think of a number of collaborative scenarios that will help organisations centralise social conversations.