The Nineties have been calling

For the last three or four years, possibly more, there’s been this niggling little noise in my head every time I looked at my own blog.

It’s not the most important site I look after, and I really only set it up to test ideas and post the occasional rant, but it turns out the 1990s were indeed calling, asking for their website back.

So here we are in 2014, and I’ve finally embraced HTML5 and CSS3 for my own site, two years after the big ‘5’ became a Candidate Recommendation of the W3C, and in the year it is set to become the recommended standard for websites across the board.

Hang on, what do you mean? Isn’t HTML5 the standard?

Amazingly, although a large number of websites use HTML5 for their rendering now, and it’s been a buzzword for at least five years, it’s not actually a recommended standard yet. The W3C plan indicates that will happen this year. 1

This is the reason you hear website developers bemoaning the state of browser ‘X’ and device ‘Y’ rendering their latest creations. Or at least that’s why you heard those noises if you travel in circles frequented by web developers who like new toys.

So, off I went to themeforest and bought me a shiny responsive template, chopped up the source files and slapped it down on top of MODX without too much pain considering how long I put it off.

So far the result has been pleasing although I need to re-code the blog comments bits as they look horrible and there are some nasty kludges going on in the back room to get my old content to work in the new template.

Once I’ve fixed up the last couple of visual elements I suppose I’ll have to fix the validation of the old content as well but who really does that any more?

So here it is, my first post in the new template. It remains to be seen whether it injects enough enthusiasm for me to start posting regularly again. Only time will tell.

  1. W3C 2014 plan

Should this be forgotten?

A tale of my slightly oblique involvement in transgender employment experiences.

The recent news about the ruling in Europe around the concept of the ‘right to be forgotten’ got me thinking it was time I did a bit of ego surfing, so earlier tonight that’s exactly what I did.

For those not entirely familiar, ego surfing is putting your own name into Google, or another search engine of your choosing if you are so inclined.

For me it’s somewhat topical I suppose as I’m currently contemplating a change in career and I’m sure some prospective clients or employers will be searching me out and finding all sorts of dregs around the internet.

I know of only four other people on the planet with my name so it’s not like I can even use the ‘Bill Smith’ defence and claim it wasn’t me who did whatever it is that the search engines find attached to my name.

As a curious segue, I’m Facebook friends with one of the other Chris Hellyars, who resides in the UK. He too is employed in IT, and has a beard and glasses. Small world, apparently.

Book cover

Anyway, there I am ego surfing earlier tonight, and as always there are lots of hits. I’ve been a netizen for a while now, and although I prefer to keep my private parts that way, a lot of my club activities and work life are available in bits and bytes somewhere.

Part of my working world is stock photography. You know the stuff; photos of smiling people doing happy things. Photos of books, logs, pins, ducks… Stock photography is the essential furniture which graces the virtual and print media world with images of almost anything you can think of.

I see my images popping up all over the place. Bank websites, book covers, advertising for shoes, travel, health conditions, pet food.

Some of the hits are pretty obscure, and tonight I found one that really caught my attention: a listing under my name in the search results for ‘transgender employment experiences’.

After a bit of hesitation I clicked on the link to find a fairly serious bit of literature by a Kyla Bender-Baird with one of my images on the cover no less. Google had found the copyright notice for the cover image a couple of pages into the book.

I’m not fussed to be honest, and I’m happy that Kyla’s publishers honoured the copyright terms and credited the image. I do wonder what a prospective employer might think of that in the search results though, given I have a beard and I like to think I’m a pretty manly sort of figure!

Is this the sort of search result that the complainants in the EU want to have forgotten? I do wonder.

(You can see the preview of the book on Google Books here. And here is the publisher’s page for the book, if you wound up here looking for the book itself!)

When the Cloud goes bad

Before you read too much further: this is a bit of a self-serving rant, brought on indirectly by the Heartbleed bug published a couple of weeks ago.

I’m a fan of cloud hosted virtual Linux servers for hosting and messing about of the geeky kind.

The ability to spin up a box and play with it to your heart’s content, then just shut it off again, for cents an hour has to be one of the greatest enablers of growth in online services since caffeine-laden fizzy drinks hit the market.

I evangelised the use of cloud boxes for web hosting a while ago with some of my geeky friends, which of course rubbed off onto some of the not-so-geeky ones as well.

That’s where the cloud goes bad, right there, just at the end of that last sentence.

I even went as far as to help a couple of the not-so-geeky folks set up cloud servers and migrate their websites in a flurry of command line goodness, with bash scripting, which is in fact a native language for sandal wearers in many countries.

That was 2009. Skipping forward five years, I now seem to have become the Linux agony aunt for a few of the converts, and even some of the ones I considered to have strong bash-fu have dug holes so deep that apt-get and yum can’t rescue them.

One whole day, almost to the minute, after CVE-2014-0160 was published, and about four hours after I’d finished the initial patching of all *ix boxes for work, my inbox was showing signs of the darker side of the cloud.

The questions were varied, to be fair, but all symptoms of the same underlying issue. “Is my server vulnerable?”, “My server won’t update, how do I upgrade it?”, “I googled it but apt-get gives me xxxx error”.

Some of these machines had not been updated by their well-meaning owners since they were spun up, in some cases over five years ago. The distribution in one case was Debian 5, which is now unsupported. And this is my problem how?

The next morning I was due to start a three day holiday in the form of being parent helper on a school camp so I really didn’t want to be a part of someone else’s IT dilemma. For work I would keep an eye on things while away, but the other folks? Hmmm.

I replied with a pretty generic message to all of them, along the lines of ‘Patch OpenSSL, replace your SSH keys and if you’re using https get the certificate re-issued’.
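For the record, that advice boils down to something like the sketch below. This is a rough check by version string only, and it’s illustrative rather than definitive: distributions often backport fixes without bumping the version, so trust your package manager’s changelog over this.

```shell
# OpenSSL 1.0.1 through 1.0.1f are affected by CVE-2014-0160 (Heartbleed);
# 1.0.1g and the older 0.9.8/1.0.0 branches are not.
is_heartbleed_vulnerable() {
  case "$1" in
    1.0.1|1.0.1[a-f]) echo "vulnerable" ;;
    *)                echo "not vulnerable" ;;
  esac
}

is_heartbleed_vulnerable "1.0.1e"   # vulnerable
is_heartbleed_vulnerable "1.0.1g"   # not vulnerable
```

Feed it the second field of `openssl version` and, if you land on the wrong side of the line, it’s apt-get update, upgrade OpenSSL, restart anything linked against libssl, then rotate your keys and re-issue your certificates.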

So, instead of just relying on the trusty iPhone to keep up I ignored the rules for school camps and packed my netbook along with my toothbrush, socks and bug repellent. My plan being to get online tethered to the phone at least a couple of times during the break.

Well, you could have knocked me down with a feather. About midnight on the Wednesday I get online and there are about forty emails waiting for my weary eyes.

Half of them unhappy replies to my apparently useless advice and the other half from more people who clearly shouldn’t run public facing servers on the internet without first putting on their overcoats.

Where I had credentials I made a half-hearted effort to get on and update OpenSSL in the wee hours between mountain biking, making hundreds of filled rolls and hut building in the rain, but generally I just deferred them as my enthusiasm for cloud computing was being sorely tested. I’m not a very good agony aunt, it seems.

And here’s the rant, in a nutshell:

If you don’t know how to keep a server secure and up to date you should not be running your own virtual servers online. Windows or Linux I don’t care. Just don’t do it.

If the concept of migrating applications or websites between releases of a Linux distro is foreign, or you think a public key is something used by kids to sneak into the school pool, get yourself some cheap shared hosting. Virtual servers are not for you.

If you think the scope for a Windows firewall is a tube with mirrors at each end allowing you to peer over the wall…. Well, you get the idea.

I hereby retract all of my former enthusiasm for cloud hosted virtual Linux servers.

</rant>

Authorship, Small words and little tags that do good

How’s that for a confused, or at least confusing article title?

I posted a blog article last week about some DIY stuff which wasn’t particularly noteworthy, and truth be known I just wanted to post something to test a fix for the authorship tags on the site.

Back when authorship was just a toddler in the Google suite of obscure and not-so-obscure tags, I went with some advice from somewhere: put a link with ‘rel=author’ on every blog post page pointing to my profile page, slap a link on the profile page to my Google+ profile, and I’d be done.
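For the record, the markup in question looked roughly like this (the URLs are placeholders, not my real profile addresses):

```html
<!-- On each blog post page: a link to the on-site profile page -->
<a href="/about/" rel="author">Chris Hellyar</a>

<!-- On the profile page: a link out to the Google+ profile -->
<a href="https://plus.google.com/000000000000" rel="me">Me on Google+</a>
```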

That worked for about… well, I’m not entirely sure it did. For exact-match searches on entire passages and phrases from my posts I’d sometimes see my face staring back at me from the search results, but mostly nothing changed.

At work however we have a blog contributor who is consistently showing up as his miniature self smiling beside search results for his posts even though none of the requisite link tags are in place.

We have no links to his Google+ profile anywhere on the site, and the only part of the authorship puzzle that’s been met is the contributor entry on his Google+ page.

I’m not going to go into any detail about how to make authorship work, there are a lot of good articles around the web on how that can be done and Google’s own help pages are as good as any now that it’s well established.

After the page was indexed fully I ran a range of different test searches which told me that authorship was working along with confirming a bunch of other odds and sods that should be common knowledge if you’re in the online marketing game.

What I found interesting though is how subtle changes to the search phrase changed whether authorship shows up in the results or not. Equally, I discovered some small words that made a difference where I normally wouldn’t expect it.

So, without further delay, a pile of search results screenshots with comments for each…

130825-01

First up we have a mixed-up phrase from the blog post, and I’m the top result. That’s mission one achieved: the page is indexed and we can move on to testing some other ideas.

As a group of keywords, ‘portable risks side note’ is not that stunning, but you can see immediately how less-than-ethical SEO companies might convince a customer that a set of keywords is critical and get a rank for that combo under the guise of long-tail search. Followed quickly by the bill and a rapid exit to the nearest hills.

It’s a long story which I can’t really post about, but I recently helped a friend with exactly that problem, who’d paid handsomely for an SEO consultant to get their pages ranking well for a totally useless set of keywords.

This stuff is not rocket science, but if you want to be the top hit for ‘used car’ that is a whole other can of worms and requires a lot more effort; the content I’m using for these test searches is not really what happens in the real world.

An interesting thing to note about this search result is that the snippet of text is not the meta description for the page.

SEO tidbit #1 from this blog post: no matter how much time you spend crafting the description tag, it may not show up in the SERPs these days if the search terms don’t match the description.

Oh, and the authorship worked. Who’s that attractive looking chap beside the search result?

130825-02

I did a bit of messing about with combinations of keywords and found that this one still gave me the second-place result but dropped my authorship. Again, the search phrase itself is pretty meaningless, but it highlights something about authorship.

If Google doesn’t think who wrote the article is that important to the search results, you won’t get the extra credibility in the search results page. That means if you’re struggling with testing the markup, pay a bit more attention to what you see in Google’s structured data testing tool and what your content is about, rather than just trying to get your photo up on what you think the page should rank for.

Note that the snippet is different again. Still nothing from the description tag. Instead this time we have a mash-up from two paragraphs highlighting where the algorithm says the keywords were found within the body of the content.

130825-03

A simple change here. Removing ‘on’ finds 70,000 or so more results in the index, but it doesn’t change the top few results. The fact is that small words sometimes don’t matter, despite how much your English teacher might have insisted otherwise.

Clearly if you were prepared to click a few more pages into the results you’d see a difference though, so let’s try something different.

130825-04

Same words with the ‘on’ back in the mix with a different order and we’ve dropped a couple of hundred thousand potential results even though the top three results have not changed.

So, the order of small words does matter. It would seem that the combinations of ‘on side’, ‘on note’ and ‘note on side’ are probably more common in content than ‘on portable’.

I’m obviously mincing my words, almost literally, to make a point here.

When in the English language you write, order important it is. Unless you’re Yoda that is.

Google have long said that well-crafted content is important, and phrasing that is common to your target audience is going to rank better than the best writer’s missive or the random words on a page that used to be common in the AltaVista days.

As a total aside, if you’re interested in SEO and don’t know what I mean by AltaVista days, you missed out on a golden age for SEO consultants that allowed people to do all sorts of things that would get them kicked from the index of even the slackest engine now. Ahhhh, those were the days.

130825-05

Another shuffle of keywords and the third result has vanished down to about position six although cbsnews and I are still batting pretty well for some obscure text.

‘Notes on’ in this case is what starts the page title tag and the first H1 on the page for the result that’s popped up to number three on the hit list.

That right there is old-school SEO advice. Have relevant title tags and heading structures with text people will search for. If your page is about tomatoes having the page title ‘Shoe leather replacements for tomatoes’ and the first H1 tag the same will probably get you more search traffic for shoe leather than it will tomatoes.

130825-06

One more shuffle of keywords and this time a more correctly constructed phrase from an English point of view and it’s got four of the five words in the same order as my post so the dashing fella on the left of the search makes a sudden re-appearance.

So even though this is not an exact match to the text, the algorithm calculates that the order makes better sense, is more likely to be well-structured content, and so deserves that little bit of extra attention the authorship gives.

cbsnews.com is still there, but let’s face it… If my site had as much link juice as a major news site I’d have Google AdSense on here and be counting my sports cars parked in the garage of my French Riviera holiday home, not writing this for entertainment.

The osha.gov site appearing there is interesting, but again .gov sites have credibility oozing from their TLD so nothing surprises me when I see them showing up in search results.

130825-08

Now for a little image searching. ‘Testing FT-857’ seems like a pretty good image search term if you’re into amateur radio and want to find out about the FT-857.

The image is result four, which is a good slot, and your SEO handbook will tell you image names are all-important for such things, along with the alt tags. Don’t forget the alt tags.

In this case the alt tag is indeed ‘Testing on the FT-857’ and searching for exactly that will bring the image up to the top hit, not the lowly number four slot.

What about that image name? It’s actually ‘130818-171341-0001.jpg’.

Correct and contextual naming of images is a good idea but don’t forget the auxiliary tags around images. The only place FT-857 appeared before this post on my entire website is in the alt and title tags for that image.

130825-09

Better than that, this search gets me the top hit for a combination of keywords from the page plus FT-857, which only appears in the alt tag for the image and the title tag for the link to the popup copy of the image.

If I’d bothered to name the image in a useful fashion I could probably rank for some useful phrases as well as that one. This is basic stuff but day in day out I see SEO advice about all sorts of other things. Getting the basics right on this is going to get me traffic for people testing FT-857 Radios with power pole connectors.

130825-10

One last screenshot to round out the observations for the evening. An image search for ‘gel FT-857’ showing a top hit for my photo. The word ‘gel’ is not in the alt tag for the image, but it is in the title attribute for the link to the popup.

If you hang plain-English title attributes on links to images and content you can improve their positioning for keywords and phrases in the linked content, or, as in this case, gain a ranking for a term that does not exist anywhere in the content apart from the tag.

By way of a disclaimer and for the sake of completeness: I did these searches from a New Zealand IP on www.google.co.nz, using Google Chrome in incognito mode to avoid search history slanting the results. Your results may vary if you’re in a different country or have substantial search history for similar terms or sites. Some of them were on my Ubuntu desktop and the balance on a Windows 7 laptop, because I happen to be sitting in front of the telly pretending to watch something, so the fonts look slightly different in some of the screenshots.

(I did do a bit of testing from a US IP using google.com in incognito mode and got very similar results, although the serps were slightly different the observations would be the same. If you’re reading this more than a week after I wrote it the search results will probably have changed, the web is a dynamic place.)

Do you want me as a customer or not?

I just had to assume the rant position on this one. I just had the worst web site usability experience ever. Well, maybe that’s an exaggeration. The worst website usability experience in over a week. A month at the outside.

I signed up for a free trial of an online software solution, as you do, and wanted to ask the customer service department if they supported PayPal as a payment method as that’s my preferred mode of operation for online stuff. Seems to fit well: online service, online payment.

Contact form

The unbelievable, stupid contact form

Off to the contact form I go. The US 1-800 number was unattended as it’s out of hours right now, but there was what I thought would be a helpful link to ‘Contact Sales’. Man, was I wrong.

If you’re running a company, what comes first? Getting the customer to engage with you, or nit-picking at them to fill out stupid forms? Hands up all those who say filling out forms is the way to go. Back of the class, all of you…

The helpful contact form in this case had morphed into 12 mandatory fields, with a particularly annoying pop-up on submission when you didn’t fill in a relatively irrelevant bit of information. What’s up with these people?

At least there was some gratification to be had though. Their website has one of those nifty semi-anonymous feedback tools, which, as it happens, was not written by a usability-challenged developer, and it let me pen an abbreviated version of this rant right there.

This gaffe was after I was already annoyed at having to supply credit card info to get access to the free trial in the first place, which is another really odd synthetic hurdle to put in the way of prospective customers.

To slip sideways into sports jargon their website is almost an own goal, and if it weren’t for the excellent quality of the product these two niggles would definitely count as three strikes.

Mandatory fields

You must be kidding. Mandatory fields aplenty!

I got this far down the article thinking I wouldn’t name the website, but what the heck, this might serve as a review of sorts for some people: The service in question is GoToAssist Express.

I’ve been evaluating their remote support tool against a few of their competitors and it is head and shoulders better than many of the ones I tried and at nearly half the price of the elephant in the market it’s very good value.

I’ll be purchasing a subscription for work despite their website, not because of it. Maybe they were growing too fast and felt they could reduce the growing pains by annoying prospective customers?

Rant off.

Website Indexation on Google Part one

The web site at work has many issues, and one of the slightly vexing ones was that a site: search on Google only showed 540-odd of the 1100 pages in our sitemap. Google Webmaster Tools was showing 770 pages indexed, but that still left hundreds of pages missing in action.

I’m a realist and understand that Google will never index everything you offer up, but we also have the paid version of Google Site Search and it can’t find those pages either, which is a little more annoying as it means that visitors who are already on our site might not be able to find something.

The real problem with partial indexation is where to start. What is it that Google hasn’t indexed exactly? How do you get the all seeing google to tell which of the 1100 pages are included, or not, in organic search results?

I spent a few meaningless hours on the Google webmaster forums plus a few more even less meaningful hours scraping through various blog posts and SEO sites which led me to the conclusion that either I was searching for the wrong thing, or there was no good answer.

At the tail end of the process I posted a question on the Facebook page for the SEO101 podcast over at webmasterradio.fm, which incidentally I recommend as a great source of general SEO/SEM information.

After a bit of a delay for the US Labour day holiday the podcast was out, and I listened with great interest in the car on the way to work. Lots of good suggestions on why a page might not be indexed, but no obvious gem to answer my original question. That being how to tell what is and what isn’t being indexed.

Luckily for my sanity, Vanessa Fox came to the rescue in a back issue of ‘Office Hours’, another show on webmasterradio.fm. Not a direct solution to the problem, but an elegant way to narrow things down: segmenting the sitemap.

One Site, many sitemaps

In a nutshell: chopping the sitemap up into a number of bits allows you to see where in the site you might have issues. With only 1100 pages I could probably have manually done a site: search for each URL in less time than I wasted looking for a solution, but then I’d not have learnt anything along the way, would I?
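To give a concrete sketch of the idea (the filenames and domain here are hypothetical, not my actual setup), you replace the one big sitemap with a sitemap index pointing at one child sitemap per section of the site; Webmaster Tools then reports indexed counts per child file, which is exactly the breakdown you’re after:

```shell
# Write a sitemap index that points at one sitemap per section of the site.
cat > sitemap-index.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://www.example.com/sitemap-news.xml</loc></sitemap>
  <sitemap><loc>http://www.example.com/sitemap-products.xml</loc></sitemap>
  <sitemap><loc>http://www.example.com/sitemap-popups.xml</loc></sitemap>
</sitemapindex>
EOF

# Sanity check: one <loc> entry per section sitemap.
grep -c '<loc>' sitemap-index.xml
```

Submit the index file in Webmaster Tools and the per-section indexation numbers fall out for free.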

So leading on from that, I thought I’d post this here on my site with one or two relevant keywords so that anyone else with the same question stands a chance of getting to the same point a little more quickly than I did!

As for the pages that were not indexed? A chunk of our news pages, which may be due to JavaScript-based pagination of the archives, and a fair chunk of the popup pages, which I’ve yet to fully investigate.

Onwards and upwards.

Another bite of the apple: Postscripts for the electronic age

Out in the real world where sales people send physical letters it’s common practice to use a postscript, or P.S. at the foot of a letter or proposal. It increases engagement with the item, and gives you a little extra punt at the end of your message.

Truth be known you’ll probably find that a percentage of people read the postscript first as it stands out at the bottom of the page. This is by design with companies deliberately folding material into the envelope so the order of viewing is letterhead, footer and finally the body of the text.

It works either way around: if you see it first it distracts you from all the fine print in the body, or if you do read it last it’ll help to seal the deal with a cherry on the bottom. Either way, it’s a powerful addition to the letter when used with care.

Next time you get one of those annoying Readers Digest sweepstake mailers because your cousin signed you up, don’t throw it out.

Open it up slowly and think about what you see first, they spend a great deal more energy and money on designing their mail outs than most other companies and have been known to abuse the awesome destructive power of a P.S. more than once per mail-out.

So, how to get the same little kick in the tail for your emailed sales pitch?

The nature of email is that you see the header and then scroll down a bit, maybe. A footnote in the electronic age is going to be viewed last, if at all.

Without any research done I’m willing to bet that a huge percentage of emails get skimmed and you have maybe the first ten lines of text to deliver a message before the reader falls asleep, or hits delete.

You can’t hide your laurels at the bottom of an email in the same way. Special offers need to be up front and above the fold to ensure you engage the customer before your 20 seconds are up and your hopes of early retirement evaporate into the recycle bin.

What you really need is a second bite of the apple, and you need to grab a bit more precious engagement time to harness P.S. goodness and avoid being part of the mindless information consumption culture.

So why not just grab the fruit and chomp away? Send a second email a short time after the main pitch. Keep it light, a few quick lines to say “Oh, by the way, I forgot to tell you about…”

Make sure the whole thing fits on the screen without scrolling to increase the chances of it getting read. Go as far as dropping your ten line signature file, disclaimer and silly environmental message.

Saying ‘Cheers’ is more than enough thank you very much.

The funny thing is that most people read their newest emails first. So for a lot of people your stealthy double send has the same effect as the cleverly folded parchment lovingly slipped into an envelope.

You can take this a wee bit further and learn from some often quoted thinking in the customer service industry. The theory goes that if you make a minor error and correct it quickly and efficiently you’ll see better conversion and retention of customers than if you just mechanically roll out ‘good’ service.

So: By all means send out your emailed proposal in a form email with terms and conditions, formal quotation and twenty five reasons why your widget or service is better than Joe Bloggs. The thing you also need to do is forget the attachment. That’s right, forget it.

Follow up 5 minutes later with a very quick email, 3 or 4 lines tops. “Sorry, I forgot the attachment..” Now you’ve got your foot note in, you’ve corrected an error and you got a chance to prove you’re human all in one shot.

P.S. If you read this far, my entire premise is probably flawed.

P.P.S. Would a third email be too much? I think so.

Telecom New Zealand DNS Fail

This tickled my fancy, so I just had to write up a few words about it.

Telecom is the largest telco in New Zealand, and it appears they can’t run a robust DNS setup for their own domain. Their ISP arm, Xtra, had some problems last year with their DNS, which caused problems for many of its customers, but this time around it’s their own corporate domain, telecom.co.nz, that has fallen into the cyber bit-bucket.

I was looking for a bit of info on mobile plans, as you do, went to surf to the site and got a timeout. Odd, I thought, but it might be my connection, so I leapt on the VPN to work. Nope, same result. Log into a cloud server in the US. Nup. No DNS entries.

A quick look at the DNC website at www.dnc.org.nz tells me that they’ve paid their bill, so my cunning plan of registering their domain under my name if they’d let it lapse was dashed. The odd thing, however, is that the largest telco in New Zealand appears to be running their two DNS servers on the same subnet.

100707-telecom-name-servers

That’s not such an issue unless that subnet is out. Which it appears to be right now. Neither DNS server is pingable, which might be the normal state of affairs, but ‘dig’ can’t talk to them either, which is an indication that not all is right in the world of Telecom DNS servers.

So, Mr Telecom, might be time to get with the rest of the world and host a secondary DNS with another provider. Telstra or Vodafone maybe?

100707-digging-telecom-dns

Alternatively, I could run one for you in the US for a few bucks a month; it’d save quite a bit of egg on face, methinks.
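For what it’s worth, a secondary zone hosted with another provider is a one-stanza job in BIND. A hypothetical sketch for named.conf on the off-network server follows; the IP address is a documentation placeholder, not Telecom’s actual primary:

```text
// Secondary (slave) copy of the zone, hosted away from the primary's subnet
zone "telecom.co.nz" {
    type slave;
    file "secondaries/telecom.co.nz.db";
    masters { 203.0.113.10; };  // address of the primary name server
};
```

Add the secondary to the delegation at the registry and one dead subnet no longer takes the whole domain offline.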

Underscores vs Hyphens and an Apology

If you read my blog via an RSS reader you probably noticed a few odd goings-on earlier today. I changed a few things on the site and all of the posts going back to last year appeared as new again, even if you’d read them.

Sorry ’bout that, but there was a method to my madness, or at least a method to my fiddling.

Although it’s not entirely obvious, one of the main reasons I started running this site was to mess around with search engine optimisation and try out the theories of various experts who also run blogs, but with a great deal more focus than me.

To that end, I’ve re-written the code that generates my RSS feed and included some inline formatting to make it easier to read. Now when you read the blog from a feed reader it should look a bit more like the website, give or take. Well, more give than take.

While some of the changes were purely cosmetic, I also changed the URLs for all my blog posts.

The new URLs are what caused the posts to pop up as new in at least FeedBurner and Google Reader. The change was to remove the dates from the URL itself and to replace all the underscores with hyphens.

Removing the dates was because it just looked ugly compared to the WordPress style of using directories for the year / month. I don’t use WordPress, but decided that if I was going to mess with all my URLs I might as well change to nicer looking ones while I’m at it.

If you do some searching for “hyphen vs underscore in URLs” using your favourite search engine you’ll find a bunch of writing, with the general wisdom falling on the side of hyphens. In fact as far back as 2005 Matt Cutts, a developer from Google, blogged about it. [1]

So why, you might ask, did I use underscores? Well… ummmm, ‘cause it’s what I’ve always done is the only answer I’ve got.

A bit more searching around told me that the results, for Google at least, are apparently different between the two methods. Using underscores caused URLs to be considered as phrases, while hyphens were more likely to result in search results for the individual words in the URL.

This sounded like something worthy of some experimentation so wearing my best white lab coat I created some pages on a few different sites I look after which were not linked to the navigation but were listed in the xml sitemaps.

I mixed and matched the URLs with underscores and hyphens and used some misspelt words and phrases. There were a total of 48 pages, spread over 8 domains, which were all visited by Googlebot a number of times over an eight-week period.

I had a split of twelve pages with hyphens and matching content, twelve with hyphens and unmatched content, and the same split with underscores. Where the content matched I used the same misspelling of the words to get an idea of how well it worked. All of the sites have good placement for long-tail searches on their general content and get regularly spidered.

The end result is that most of the hyphenated URL pages without matching keywords in their content or tags were indexed against individual words in the URL (eight out of twelve). All of the pages with hyphenated URLs and matching keywords in the content were indexed against those words.

The pages with underscores and non-matching content didn’t fare so well. Only four of the twelve got indexed against words in the URL, although nine of them were indexed against long-tail phrases from the URLs. Pages with underscores and matching content ranked lower for keywords in the URL than the hyphenated ones, although that’s not an accurate measure as they were misspelt words on pages with no backlinks.

So, end result: the common wisdom of using hyphens would appear to be valid and helpful if you’re running a site where long, keyword-rich URLs make sense and the strength of the individual keywords might be more valuable than the phrase.

If you’re going for long tail search results in a saturated market where single keyword rank is hard to gain, you might want to mix it up a little and try some underscores. It certainly can’t hurt to try.

One thing to note for those not familiar with why this is even an issue: spaces are not valid in URLs according to the standard, although they are common on poorly or lazily designed websites. If you’re really bored you can read the original spec written by Tim Berners-Lee back in 1994 [2], or the updated version from 2005, also by Mr Berners-Lee. [3]

The long and short of that, in this context, is that you can use upper and lower case letters, numbers, hyphens, underscores, full stops and tildes (‘~’). Everything else is either reserved for a specific function or not valid, and requires encoding. A space should be encoded as ‘%20’, and you can probably imagine how that looks when trying to%20read%20things.
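You can see those rules in action with a couple of lines of Python, whose standard library follows the same spec; the unreserved characters pass straight through while the spaces get percent-encoded:

```python
from urllib.parse import quote

# Letters, digits, hyphens, underscores, full stops and tildes
# are "unreserved" and survive encoding untouched...
print(quote("my-page_name.v2~draft"))  # my-page_name.v2~draft

# ...while a space must be percent-encoded as %20
print(quote("trying to read things"))  # trying%20to%20read%20things
```

Anything your CMS generates beyond that unreserved set will end up percent-encoded in exactly this way, which is why the odd characters are best avoided at the source.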

If you type a URL with a space into your browser, the browser converts it to ‘%20’ before sending it down the pipe for you. You sometimes see these encoded URLs with not just spaces but other random things in them, and they can be the cause of random behaviour in some websites and software, so it’s best to avoid odd characters in your URLs.

Apologies again if you got some duplicates in your RSS reader over the last few hours. I’ll try not to do that again, and it’ll be interesting to see whether a couple of pages that were being ignored by Google with underscores get indexed now.

References:

1. Matt Cutts’ 2005 blog post on dashes vs underscores: http://www.mattcutts.com/blog/dashes-vs-underscores/
2. The original 1994 spec for URLs (RFC 1738): http://www.ietf.org/rfc/rfc1738.txt
3. The 2005 update to the URL spec (RFC 3986): http://www.ietf.org/rfc/rfc3986.txt

Why is it so hard to think up a good password?

I’ve been working in IT for a wee while now, a shade over 20 years even, and in all this time there is one consistent thread of frustration that nibbles away at my very sanity: trivial passwords.

I’m sure this isn’t just me going nuts here; there must be thousands of network administrators and webmasters going quietly bonkers all over the planet right at this very moment.

We slave away with intimate pride in our collective nerdiness, building robust and secure IT systems for all to behold. Fussing and fettling over minute details to ensure the ever-important data is safe.

An unfortunate side effect of this creative journey is a necessary evil. The agent by which all things great in computing are undone. I’m not referring to the trivial password here, but that which spawns it: the user.

“Do I have to put a number in my password, eight characters, really?”

I’m getting chills just typing that sentence.

A quick Google for ‘most common passwords’ will quickly reveal the painful truth. 123456 is actually a very common password, as is the word itself… ‘Password’.

Where am I going with all this? I use complex passwords. I love them with the same fervour as tatting enthusiasts like a good yarn. (See what I did there? No? Look it up…)

I used to use an online password generator, but the owner of the website decided to put pop-up ads on the page, so that every time you refreshed it you got another ad popup. ARGH.

If you were looking for something particular in your random password it could take ages with all the popping, closing and refreshing going on. Popup ads are second only to trivial passwords in their evil nature.

Fresh from a particularly annoying bout of popping, closing and refreshing last week, I set about creating my own random password generator, which is now online for all to use.

It uses PHP’s rand() function seeded with microtime(), which in lay terms means that in theory it can generate a different password every microsecond. Of course, if you are a lay person you probably don’t care, and you’re using 123456 as your password.
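The generator itself is PHP, but for the curious, here’s a rough sketch of the same idea in Python. Note that I’ve used the standard secrets module rather than a time-seeded rand() equivalent, since it draws from the operating system’s cryptographic random source; the function name, default length and character set are just illustrative:

```python
import secrets
import string

def generate_password(length=16):
    """Build a password by picking characters uniformly at random
    from letters, digits and punctuation."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return ''.join(secrets.choice(alphabet) for _ in range(length))

print(generate_password())  # a fresh 16-character jumble each run
```

Whatever language you use, the important part is the size of the search space: sixteen characters drawn from a ~94-symbol alphabet beats 123456 by a rather wide margin.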

That in a nutshell is it. Enjoy your randomly generated passwords.