Etomite on PHP 5.6

This post is on the geeky end of the scale a bit, but might be handy for someone out there in the interwebs.

Back in the mid-00s I was a big fan of a CMS called Etomite, which begat MODX, which I still use for some projects. You can read about the death of Etomite and the rise of MODX on Wikipedia.

Anyway, I used Etomite for an earlier blog site at www.ohmark.co.nz, which is the name of an electronics development company I ran from about 1996 till 2004, ish.

When I sold the rights to the then-current products and designs, the new owner wasn’t crazy enough to take on the meaningless company name and oddly spelt domain, so I kept it for my blog.

That blog was used to record progress on a couple of microcontroller projects I was working on at the time, and it got enough traffic that it was worthwhile putting Google ads on the site.

As I was earning one or two cents a week in advertising revenue from the site, I kept it online until May last year, when I rolled my web servers over to Debian 8, which shipped with PHP 5.6.x.

Unfortunately PHP 5.6 was a bridge too far for Etomite, and my efforts were rewarded with one of those dreaded deprecation errors, which are quite often a journey to nowhere to unravel.

Deprecated mysql library error in Etomite

I was already running the last version of Etomite released, version 1.1, and as it was the only site out of a few dozen on that box that did not survive the operating system upgrade, I took the cop-out option, put up a placeholder home page, and forgot about it.

Cop-out offline message that lasted over a year.

Skip forward to this weekend and I decided to revisit it, as I’m back on the blogging kick again and I still get the odd email about broken links to one of the projects on the site, a CNC stepper motor controller design I posted to some forums.

Long story short, here’s how I fixed up my Etomite install, so you can get your crusty old Etomite site working again as well and revel in the Y2K feel of the admin interface.

To silence the error handling, add a new line at the top of index.php:

<?php
error_reporting(E_ALL ^ E_DEPRECATED);
// ...rest of index.php as before

Then pop down to the executeParser() function, at line 605 or thereabouts, and comment out the error handler and reporting calls.

  function executeParser() {
    //error_reporting(0);
    //set_error_handler(array($this,"phpError"));

You should also comment out any other calls to error_reporting() in index.php. I had four of them, but I think they were from my original half-hearted attempt to fix the deprecation error in 2015; they may have been original, though.

Lastly, put an ‘@’ in front of the deprecated mysql_connect() call at line 1323, or just after, given you’ve added a new line at the top.

Change:

    if(@!$this->rs = mysql_connect($this->dbConfig['host'], $this->dbConfig['user'], $this->dbConfig['pass'])) {

to be:

    if(@!$this->rs = @mysql_connect($this->dbConfig['host'], $this->dbConfig['user'], $this->dbConfig['pass'])) {

And your Etomite will rise from the ashes, sorta, ish.

Now that you’ve got Etomite running again, shift the website to something else before you go much further. There are some common open-source components in Etomite that have long-published exploits which could bite you in the proverbial bum if you leave it online.

At the very least, make the entire site read-only to protect against the TinyMCE injection issues which surfaced after Etomite last received an update. I’ve made mine read-only and have an IDS monitoring for file system changes, but it is not what I’d call a ‘trusted’ site on the server and I’ll probably chroot it as well.

For a couple of years there was an attempt to get Etomite moving forward again, called etofork, over on GitHub, but it seems to have died. If you want a similar CMS toolset, MODX is the way to go now; or, if your site was a blog, you could go where everyone else seems to have gone and use WordPress.

For my part I’ll probably move the content to this site, as maintaining two blogs is kinda silly, but given my on-again, off-again blogging style that might have to wait another year or so. 🙂

Do you want me as a customer or not?

I just had to assume the rant position on this one. I’ve just had the worst website usability experience ever. Well, maybe that’s an exaggeration. The worst website usability experience in over a week. A month at the outside.

I signed up for a free trial of an online software solution, as you do, and wanted to ask the customer service department if they supported PayPal as a payment method as that’s my preferred mode of operation for online stuff. Seems to fit well: online service, online payment.

The unbelievable, stupid contact form

Off to the contact form I go. The US 1-800 number was unattended as it’s out-of-hours right now, but there was what I thought would be a helpful link to ‘Contact Sales’. Man, was I wrong.

If you’re running a company, what comes first? Getting the customer to engage with you, or nit-picking at them to fill out stupid forms? Hands up all those who say filling out forms is the way to go. Back of the class, all of you…

The helpful contact form in this case had morphed into 12 mandatory fields, with a particularly annoying pop-up on submission when you didn’t fill in a relatively irrelevant bit of information. What’s up with these people?

At least there was some gratification to be had though. Their website has one of those nifty semi-anonymous feedback tools, which as it happens was not written by a usability-challenged developer and let me pen an abbreviated version of this rant right there.

This gaffe was after I was already annoyed at having to supply credit card info to get access to the free trial in the first place, which is another really odd synthetic hurdle to put in the way of prospective customers.

To slip sideways into sports jargon, their website is almost an own goal, and if it weren’t for the excellent quality of the product these two niggles would definitely count as three strikes.

You must be kidding. Mandatory fields aplenty!

I got this far down the article thinking I wouldn’t name the website, but what the heck, this might serve as a review of sorts for some people: The service in question is GoToAssist Express.

I’ve been evaluating their remote support tool against a few of their competitors, and it is head and shoulders better than many of the ones I tried; at nearly half the price of the elephant in the market it’s very good value.

I’ll be purchasing a subscription for work despite their website, not because of it. Maybe they were growing too fast and felt they could reduce the growing pains by annoying prospective customers?

Rant off.

Random password generator update

Firstly, a big thanks for the feedback I’ve received on the random password generator I stuck on the site a wee while ago. It’s had quite a bit of traffic, so I’m going to assume it’s been of use to more than just myself!

I’ve fixed a minor bug where occasionally it would produce a password shorter than the length selected, which caused confusion for at least one person. To be honest I noticed it quite early on when I was testing and ignored it.

The second update is slightly more interesting. Grant from over the ditch in Australia pointed out that in the default setting of 9 characters with upper and lower case plus numbers there was often only one number in the password, where he felt there should be three on average.

And he would be correct, but I didn’t take the weighting of numbers vs letters into account when I wrote the generator. The problem being that there are 26 letters, but last time I looked there were only ten numbers. The original code only used one instance of each number in the source string, so you were 2.6 times more likely to get a letter than a number, 5.2 times if you include upper and lower case.

I’ve fixed that up with a subtle update that uses 30 numeric characters in the source string, which gives relative likelihoods for upper, lower and numbers of 31.7%, 31.7% and 36.6%.
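
For the curious, the weighting trick is nothing fancier than padding the source string. Here’s a rough sketch of the idea in PHP; it’s not the actual generator code, just an illustration of the weighted pool, with mt_rand() standing in for whatever randomness source you prefer:

    <?php
    // Sketch of the weighted source string idea (illustration only).
    // Repeating the digits three times gives 30 digit entries alongside
    // 26 upper and 26 lower case letters: roughly 31.7% / 31.7% / 36.6%.
    $upper  = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $lower  = 'abcdefghijklmnopqrstuvwxyz';
    $digits = str_repeat('0123456789', 3);
    $pool   = $upper . $lower . $digits;

    $length   = 9;
    $password = '';
    for ($i = 0; $i < $length; $i++) {
        // Pick one character at random from the weighted pool.
        $password .= $pool[mt_rand(0, strlen($pool) - 1)];
    }
    echo $password;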

Along with that I’ve adjusted the punctuation string component to give a more even distribution of punctuation if you select ‘Full Noise’.

Website Indexation on Google: Part One

The website at work has many issues, and one of the slightly vexing ones was that a site: search on Google only showed 540-odd of the 1100 pages in our sitemap. Google Webmaster Tools was showing 770 pages indexed, but that still left 400 pages missing in action.

I’m a realist and understand that Google will never index everything you offer up, but we also have the paid version of Google Site Search and it can’t find those pages either, which is a little more annoying, as it means that visitors who are already on our site might not be able to find something.

The real problem with partial indexation is where to start. What is it that Google hasn’t indexed, exactly? How do you get the all-seeing Google to tell you which of the 1100 pages are included, or not, in organic search results?

I spent a few meaningless hours on the Google webmaster forums plus a few more even less meaningful hours scraping through various blog posts and SEO sites which led me to the conclusion that either I was searching for the wrong thing, or there was no good answer.

At the tail end of the process I posted a question on the Facebook page for the SEO101 podcast over at webmasterradio.fm, which incidentally I recommend as a great source of general SEO/SEM information.

After a bit of a delay for the US Labour day holiday the podcast was out, and I listened with great interest in the car on the way to work. Lots of good suggestions on why a page might not be indexed, but no obvious gem to answer my original question. That being how to tell what is and what isn’t being indexed.

Luckily for my sanity, Vanessa Fox came to the rescue in a back issue of ‘Office Hours’, another show on webmasterradio.fm. Not a direct solution to the problem, but an elegant way to narrow things down, by segmenting the sitemap.

One Site, many sitemaps

In a nutshell: chopping the sitemap up into a number of bits allows you to see where in the site you might have issues. With only 1100 pages I could probably have manually done a site: search for each URL in a shorter time than I wasted looking for a solution, but then I’d not have learnt anything along the way, would I?
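
If you’ve not used a sitemap index before, it’s the tidy way to do the chopping: you submit one index file that in turn points at each of the smaller sitemaps, and Webmaster Tools then reports submitted and indexed counts for each one separately, which is where the segmenting pays off. Something along these lines (the file names and domain are made up for illustration):

    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <sitemap><loc>http://www.example.com/sitemap-news.xml</loc></sitemap>
      <sitemap><loc>http://www.example.com/sitemap-products.xml</loc></sitemap>
      <sitemap><loc>http://www.example.com/sitemap-popups.xml</loc></sitemap>
    </sitemapindex>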

So leading on from that, I thought I’d post this here on my site with one or two relevant keywords so that anyone else with the same question stands a chance of getting to the same point a little more quickly than I did!

As for the pages that were not indexed? A chunk of our news pages, which may be due to JavaScript-based pagination of the archives, and a fair chunk of the popup pages, which I’ve yet to fully investigate.

Onwards and upwards.

Another bite of the apple: Postscripts for the electronic age

Out in the real world, where sales people send physical letters, it’s common practice to use a postscript, or P.S., at the foot of a letter or proposal. It increases engagement with the item and gives you a little extra punt at the end of your message.

Truth be known you’ll probably find that a percentage of people read the postscript first as it stands out at the bottom of the page. This is by design with companies deliberately folding material into the envelope so the order of viewing is letterhead, footer and finally the body of the text.

It works either way around: if you see it first it distracts you from all the fine print in the body, or if you do read it last it’ll help to seal the deal with a cherry on the bottom. Either way, it’s a powerful addition to the letter when used with care.

Next time you get one of those annoying Reader’s Digest sweepstake mailers because your cousin signed you up, don’t throw it out.

Open it up slowly and think about what you see first; they spend a great deal more energy and money on designing their mail-outs than most other companies, and have been known to abuse the awesome destructive power of a P.S. more than once per mail-out.

So, how to get the same little kick in the tail for your emailed sales pitch?

The nature of email is that you see the header and then scroll down a bit, maybe. A footnote in the electronic age is going to be viewed last, if at all.

Without any research done I’m willing to bet that a huge percentage of emails get skimmed and you have maybe the first ten lines of text to deliver a message before the reader falls asleep, or hits delete.

You can’t hide your laurels at the bottom of an email in the same way. Special offers need to be up front and above the fold to ensure you engage the customer before your 20 seconds are up and your hopes of early retirement evaporate into the recycle bin.

What you really need is a second bite of the apple, and you need to grab a bit more precious engagement time to harness P.S. goodness and avoid being part of the mindless information consumption culture.

So why not just grab the fruit and chomp away? Send a second email a short time after the main pitch. Keep it light, a few quick lines to say “Oh, by the way, I forgot to tell you about…”

Make sure the whole thing fits on the screen without scrolling to increase the chances of it getting read. Go as far as dropping your ten line signature file, disclaimer and silly environmental message.

Saying ‘Cheers’ is more than enough, thank you very much.

The funny thing is that most people read their newest emails first. So for a lot of people your stealthy double send has the same effect as the cleverly folded parchment lovingly slipped into an envelope.

You can take this a wee bit further and learn from some often-quoted thinking in the customer service industry. The theory goes that if you make a minor error and correct it quickly and efficiently, you’ll see better conversion and retention of customers than if you just mechanically roll out ‘good’ service.

So: By all means send out your emailed proposal in a form email with terms and conditions, formal quotation and twenty five reasons why your widget or service is better than Joe Bloggs. The thing you also need to do is forget the attachment. That’s right, forget it.

Follow up 5 minutes later with a very quick email, 3 or 4 lines tops. “Sorry, I forgot the attachment…” Now you’ve got your footnote in, you’ve corrected an error and you got a chance to prove you’re human, all in one shot.

P.S. If you read this far, my entire premise is probably flawed.

P.P.S. Would a third email be too much? I think so.

Telecom New Zealand DNS Fail

This tickled my fancy, so I just had to write up a few words about it.

Telecom is the largest telco in New Zealand, and it appears they can’t run a robust DNS setup for their own domain. Their ISP, Xtra, had some problems last year with their DNS, which caused problems for many of its customers, but this time around it’s their own corporate domain, telecom.co.nz, that has fallen into the cyber bit-bucket.

I was looking for a bit of info on mobile plans, as you do, and went to surf to the site and got a timeout. Odd, I thought, but it might be my connection, so I leapt on the VPN to work. Nope, same result. Log into a cloud server in the US. Nup. No DNS entries.

A quick look at the DNC website at www.dnc.org.nz tells me that they’ve paid their bill, so my cunning plan of registering their domain under my name if they’d let it lapse was dashed. However, the odd thing is that the largest telco in New Zealand appears to be running their two DNS servers on the same subnet.

That’s not such an issue unless that subnet is out. Which it appears to be right now. Neither DNS server is pingable, which might be the normal state of affairs, but ‘dig’ can’t talk to them either, which is an indication that not all is right in the world of Telecom DNS servers.
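
For anyone wanting to play along at home, this is roughly the sort of check I was doing; the name server host names below are placeholders, so substitute whatever the registry listing shows for the domain you’re poking at:

    # Ask each listed name server for the domain directly.
    dig @ns1.telecom.co.nz telecom.co.nz any
    dig @ns2.telecom.co.nz telecom.co.nz any

    # A plain ping for good measure, though plenty of servers drop ICMP anyway.
    ping -c 4 ns1.telecom.co.nz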

So, Mr Telecom, might be time to get with the rest of the world and host a secondary DNS with another provider. Telstra or Vodafone maybe?

Alternatively, I could run one for you in the US for a few bucks a month; it’d save quite a bit of egg on face, methinks.

Underscores vs Hyphens and an Apology

If you read my blog via an RSS reader you probably noticed a few odd goings-on earlier today. I changed a few things on the site and all of the posts going back to last year appeared as new again, even if you’d read them.

Sorry ’bout that, but there was a method to my madness, or at least a method to my fiddling.

Although it’s not entirely obvious, one of the main reasons I started running this site was to mess around with search engine optimisation and try out the theories of various experts who also run blogs, but with a great deal more focus than me.

To that end, I’ve re-written the code that generates my RSS feed and included some inline formatting to make it easier to read. Now when you read the blog from a feed reader it should look a bit more like the website, give or take. Well, more give than take.

While some of the changes were purely cosmetic, I also changed the URLs for all my blog posts.

The new URLs are the bit that caused the posts to pop up as new in at least FeedBurner and Google Reader. The change was to remove the dates from the URL itself and replace all the underscores with hyphens.

Removing the dates was because they just looked ugly compared to the WordPress style of using directories for the year / month. I don’t use WordPress, but decided that if I was going to mess with all my URLs I might as well change to nicer-looking ones while I was at it.

If you do some searching for “hyphen vs underscore in URLs” using your favourite search engine you’ll find a bunch of writing, with the general wisdom falling on the side of hyphens. In fact as far back as 2005 Matt Cutts, a developer from Google, blogged about it. [1]

So why, you might ask, did I use underscores? Well. Ummmmm, ’cause it’s what I’ve always done is the only answer I’ve got.

A bit more searching around told me that the results, for Google at least, are apparently different between the two methods. Using underscores caused URLs to be considered as phrases, while hyphens were more likely to result in search results for the individual words in the URL.

This sounded like something worthy of some experimentation, so wearing my best white lab coat I created some pages on a few different sites I look after which were not linked from the navigation but were listed in the XML sitemaps.

I mixed and matched the URLs with underscores and hyphens and used some misspelt words and phrases. There were a total of 48 pages, spread over 8 domains, which were all visited by Googlebot a number of times over an eight-week period.

I had a split of twelve pages with hyphens and matching content, twelve with hyphens and unmatched content, and the same split with underscores. Where the content matched I used the same misspelling of the words to get an idea of how well it worked. All of the sites have good placement for long-tail searches on their general content and get regularly spidered.

The end result is that most of the hyphenated URL pages that did not have matching keywords in content or tags were indexed against individual words in the URL (eight out of twelve). All of the pages that had hyphenated URLs and matching keywords in the content were indexed against those words.

The pages with underscores and non-matched content didn’t fare so well. Only four out of the twelve pages got indexed against words in the URL, although nine of them were indexed against long-tail phrases from the URLs. Pages with underscores and matching content ranked lower for keywords in the URL than the hyphenated ones, although that’s not an accurate measure as they were misspelt words on pages with no back links.

So, end result: the common wisdom of using hyphens would appear to be valid and helpful if you’re running a site where long, keyword-rich URLs make sense and the strength of the individual keywords might be more valuable than the phrase.

If you’re going for long-tail search results in a saturated market where single-keyword rank is hard to gain, you might want to mix it up a little and try some underscores; it certainly can’t hurt to try.

One thing to note for those not familiar with why this is even an issue. Spaces are not valid in the standard for URLs although they are common in poorly or lazily designed websites. If you’re really bored you can read the original spec by Tim Berners-Lee back in 1994 [2], or the updated version from 2005, also by Mr Berners-Lee. [3]

The long and short of that, in this context, is that you can use upper and lower case letters, numbers, hyphens, underscores, full stops and tildes (‘~’). Everything else is either reserved for a specific function, or not valid and requires encoding. A space should be encoded as ‘%20’ and you can probably imagine how well that looks when trying to%20read%20things.
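
If you want to see what that looks like from the code side, a couple of lines of PHP make the point; rawurlencode() follows the URL spec, while urlencode() does the older form-style encoding:

    <?php
    // Spaces have to be escaped before they're legal in a URL path.
    echo rawurlencode('my page name');  // my%20page%20name
    echo urlencode('my page name');     // my+page+name (form-style encoding)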

If you type a URL into your browser with a space, the browser converts it to ‘%20’ before sending it down the pipe for you. You sometimes see these encoded URLs with not just spaces but other random things in them, and they can be the cause of random behaviour in some websites and software, so it’s best to avoid odd characters in your URLs.

Apologies again if you got some duplicates in your RSS reader over the last few hours. I’ll try not to do that again, and it’ll be interesting to see if a couple of pages that were being ignored by Google with underscores get indexed now.

References:

[1] Matt Cutts’ blog post from 2005: http://www.mattcutts.com/blog/dashes-vs-underscores/
[2] 1994 spec for URLs (RFC 1738): http://www.ietf.org/rfc/rfc1738.txt
[3] 2005 update to the URL spec (RFC 3986): http://www.ietf.org/rfc/rfc3986.txt

Javascript Compression with Apache 2 and Debian Etch

If you’re trying to wring every last drop of performance out of a website you’re probably wanting to compress all your content before it hits the wire. While I was messing about with another project I noticed that the JavaScript from this blog wasn’t getting compressed.

If you just want the solution to the issue, skip to the bottom of this post, but for those interested in the finer detail, read on.

This site uses Apache 2 on Etch, and after a bit of Googling I didn’t really find a direct mention of this issue, so I thought I’d slap it on here for other folks afflicted with uncompressed JavaScript.

The first step is to enable mod_deflate itself, which will by default compress HTML, XML, CSS and plain text files but, due to a config issue, will not get your JavaScript.

The command to enable mod_deflate is: ‘a2enmod deflate’. If you’re on shared hosting without this configured you’ll have to drop your support folks an email, although I’d think most shared hosting companies would be well on top of the config of mod_deflate as it saves them money!

The reason JavaScript is not compressed by default is that the MIME type specified for it in /etc/apache2/mods-available/deflate.conf is ‘application/x-javascript’.

Apache doesn’t know what extension is associated with this MIME type. The MIME types used by Apache are defined in /etc/apache2/mods-available/mime.conf, which includes /etc/mime.types.

/etc/mime.types in turn has the .js extension associated with application/javascript, not x-javascript.

To fix this up you’ve got a few options:

  • Change /etc/mime.types to use application/x-javascript, which might break other applications that include that file.
  • Change /etc/apache2/mods-available/deflate.conf to use application/javascript.
  • Add the MIME type ‘application/x-javascript’ to /etc/apache2/mods-available/mime.conf with the line ‘AddType application/x-javascript .js’, which is the option I took.

After adding the line, do an ‘/etc/init.d/apache2 reload’ and you’re in business: all .js files leaving your server will be compressed if the browser reports that it accepts compressed files.
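
For reference, here’s roughly what the relevant config ends up looking like on a Debian box; your deflate.conf may list a slightly different set of types, so treat this as a sketch rather than gospel:

    # /etc/apache2/mods-available/deflate.conf (as shipped, give or take)
    <IfModule mod_deflate.c>
        AddOutputFilterByType DEFLATE text/html text/plain text/xml application/x-javascript text/css
    </IfModule>

    # /etc/apache2/mods-available/mime.conf - the line I added
    AddType application/x-javascript .js

A quick way to confirm it’s working is to check the response headers with your tool of choice; compressed responses will carry ‘Content-Encoding: gzip’.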

You could of course pre-compress the files and serve them using .gz extensions, or control specific compression rules using a .htaccess file, but I wanted server-wide compression without needing to specifically configure sites on the server.

Taking joy from simple news: IE6 and YouTube

Anyone who has anything even remotely to do with web development will be smiling at the news that YouTube is going to discontinue support for IE6.

Not only that, we’ve got a date. 13th of March, 2010.

While this isn’t really the end, it will certainly put that little bit more pressure on the roughly 15-20% of internet users who still cling to the nine-year-old version of Internet Explorer for various reasons I fail to fully comprehend.

You can read more about this on mashable.com or techcrunch.com as they do a much better job than me of writing about such things.

There’s even a response from Microsoft if you so wish to expand your mind.

Having worked in a corporate IT environment, I fail to see how even the most lethargic of firms could take five years to update the web browser in the modern business environment. Unless you’re talking line-of-business PCs in a secure network, but then those PCs shouldn’t be inflicting their attempts at HTML rendering on the web development community.

I thought when Facebook stopped explicitly supporting the nearly decade-old browser in 2008 that we’d seen the end of it. Then Microsoft shattered the hopes of many geeks, confirming support would continue into 2014.

With YouTube being the number three site on the web, I’m going to take a punt and say that at least some of that 15% will soon be getting the message loud and clear that it’s time to update their PC.

Security in the cloud, KISS

The idea of keeping things simple when it comes to server security is not at all radical, and cloud servers provide the ability to reach the not-so-lofty goal of keeping your servers simple and secure without breaking the bank.

The theory is simple: the smaller the number of processes you have running on your box, the less there is to go wrong or to attack. This is one area where Windows-based servers are immediately at a disadvantage over a *ix server, but I digress.

When I was pretending to be a hosting provider a few years ago I ran colocated discrete servers. They weren’t cheap to own or run, not by a long shot. That cost was a huge enemy of the KISS security concept.

In the process of trying to squeeze every last cent of value from the boxes I overloaded them with every obscure daemon and process I could think of. Subsequently the configuration of the servers became complex and difficult to manage, while applying patches became a cause of sleepless nights and caffeine abuse.

With the cost to deliver a virtual server down in the cents per hour and the ability to build a new server in a matter of minutes, the barrier to building complex applications with a robust security architecture has all but vanished.

The MySQL server behind this blog site is a base install of Debian Lenny with MySQL, nullmailer, knockd and an iptables firewall script. That’s it. Simple to build, simple to configure, simple to back up and simple to manage. KISS.

A little bit of searching around on hardening up a Linux box will quickly turn up information on changing the default settings for sshd and on iptables rulesets, which you can combine with small, targeted cloud servers to reduce the sleepless nights.
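
As a taster, the sshd side of it usually boils down to a handful of lines in /etc/ssh/sshd_config, something like the below; adjust to suit your setup, and keep an existing session open while you test so you don’t lock yourself out:

    # Move off the default port to cut down on log noise from scanners.
    Port 2222
    # No direct root logins; use a normal account and su/sudo from there.
    PermitRootLogin no
    # Keys only, no password guessing.
    PasswordAuthentication no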

I can’t help with the coffee addiction though, I’m still trying to kick that habit myself!