Etomite on PHP 5.6

This post is on the geeky end of the scale a bit, but might be handy for someone out there in the interwebs.

Back in the mid 00’s I was a big fan of a CMS called Etomite, which begat MODX, which I still use for some projects. You can read about the death of Etomite and the rise of MODX on Wikipedia.

Anyway, I used Etomite for an earlier blog site at www.ohmark.co.nz, which is the name of an electronics development company I ran from about 1996 till 2004, ish.

When I sold the rights to the then-current products and designs the new owner wasn’t crazy enough to take on the meaningless company name and oddly spelt domain, so I kept it for my blog.

That blog was used to record progress on a couple of microcontroller projects I was working on at the time and got enough traffic that it was worthwhile putting google ads on the site.

As I was earning one or two cents a week in advertising revenue from the site I kept it online until May last year when I rolled my web servers over to Debian 8 which included PHP 5.6.x.

Unfortunately PHP 5.6 was a bridge too far for Etomite, and my efforts were rewarded with a dreaded deprecation error, which is quite often a journey to nowhere to unravel.

[Image: Deprecated mysql library error in Etomite]

I was already running the last version of Etomite released, version 1.1, and as it was the only site out of a few dozen on that box that did not survive the operating system upgrade, I took the cop-out option: I put up a placeholder home page and forgot about it.

[Image: Cop-out offline message that lasted over a year.]

Skip forward to this weekend and I decided to revisit it, as I’m back on the blogging kick again and I still get the odd email about broken links to one of the projects on the site, a CNC stepper motor controller design I posted to some forums.

Long story short, here’s how I fixed up my Etomite install so you can get your crusty old Etomite site working again as well and revel in the y2k feeling of the admin interface.

To silence the error handling, add a new line at the top of index.php:

<?php
error_reporting(E_ALL ^ E_DEPRECATED);
etc...

Then pop down to the executeParser() function, around line 605 or thereabouts, and comment out the error handler and reporting calls.

  function executeParser() {
    //error_reporting(0);
    //set_error_handler(array($this,"phpError"));

You should also comment out any other calls to error_reporting in index.php. I had four of them; I think they were from my original half-hearted attempt to fix the deprecation error in 2015, but they may have been there all along.

Lastly, put an ‘@’ in front of the deprecated mysql_connect statement on line 1323, or just after that given you’ve added a new line at the top.

Change:

    if(@!$this->rs = mysql_connect($this->dbConfig['host'], $this->dbConfig['user'], $this->dbConfig['pass'])) {

to be:

    if(@!$this->rs = @mysql_connect($this->dbConfig['host'], $this->dbConfig['user'], $this->dbConfig['pass'])) {

And your Etomite will rise from the ashes, sorta, ish.

Now that you’ve got Etomite running again, shift the website to something else before you go much further. There are some common open-source components in Etomite with long-published exploits which could bite you in the proverbial bum if you leave it online.

At the very least make the entire site read-only to protect against the TinyMCE injection issues which surfaced after Etomite last received an update.  I’ve made mine read-only and have an IDS monitoring for file system changes but it is not what I’d call a ‘trusted’ site on the server and I’ll probably chroot it as well.
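
If you want to do the same, the read-only lockdown boils down to something like the sketch below; /var/www/etomite is just an example path, so point it at wherever your install lives and make exceptions for anything that genuinely needs write access, such as upload or cache directories.

    # Hand ownership of the docroot to root and strip write permissions
    # so the web server user can read but never modify the files.
    chown -R root:root /var/www/etomite
    find /var/www/etomite -type d -exec chmod 555 {} \;
    find /var/www/etomite -type f -exec chmod 444 {} \;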

There was an attempt for a couple of years to get Etomite moving forward again, called etofork, on GitHub, but it seems to have died. If you want a similar CMS toolset, MODX is the way to go now; or if your site was a blog, you could go where everyone else seems to have gone and use WordPress.

For my part I’ll probably move the content to this site, as maintaining two blogs is kinda silly, but given my on-again, off-again blogging style that might have to wait another year or so. 🙂

The Nineties have been calling

For the last three or four years, possibly more, there’s been this niggling little noise in my head every time I looked at my own blog.

It’s not that it’s the most important site I look after, and I really only set it up to test ideas and post the occasional rant, but it turns out the 1990s were indeed calling, asking for their website back.

So here we are in 2014, and I’ve finally embraced HTML5 and CSS3 for my own site, two years after the big ‘5’ became a Candidate Recommendation of the W3C and in the year it is set to become the recommended standard for websites across the board.

Hang on, what do you mean? Isn’t HTML5 the standard?

Amazingly, although a large number of websites use HTML5 for their rendering now and it’s been a buzzword for at least five years, it’s not actually a recommended standard yet. The W3C plan indicates that will happen this year. [1]

This is the reason you hear website developers bemoaning the state of browser ‘X’ and device ‘Y’ rendering their latest creations. Or at least that’s why you heard those noises if you travel in circles frequented by web developers who like new toys.

So, off I went to themeforest and bought me a shiny responsive template, chopped up the source files and slapped it down on top of MODX without too much pain considering how long I put it off.

So far the result has been pleasing although I need to re-code the blog comments bits as they look horrible and there are some nasty kludges going on in the back room to get my old content to work in the new template.

Once I’ve fixed up the last couple of visual elements I suppose I’ll have to fix the validation of the old content as well but who really does that any more?

So here it is, my first post in the new template. It remains to be seen whether it injects some enthusiasm so that I start posting regularly again. Only time will tell.

  1. W3C 2014 plan

Responsive Design Best Practice

A wee while ago I wanted to create a new single-page landing site for one of my online properties. Just a logo, company name and contact details. Nothing more, nothing less.

Now, because I’m a really cool guy and I’m down with all the latest jargon and web stuff I decided it would be a responsive website.

Not so much responsive in the sense that I’ll ever respond to inquiries from there any more than I did when there was just a logo on the website and no details, you understand!

This is responsive in the sense that all the hip crowd using mobile devices to access the site will get a nice experience and not have to scroll or zoom around to see the three lines of text on the site.

Responsive design is not a new idea, and I’m certainly not going to claim to know much more about the dark art of CSS3 @media rules than someone who has just looked the term up on Wikipedia.

In fact, I’d like to encourage everyone else to stop claiming they’re experts as well!

I thought about buying a single-page website template and slapping my info on it but because that would involve parting with money for something I can do myself I decided that roll-your-own was a better plan. And it’s just one page, right?

I figured someone would have a good guide on the rules for responsive design so I put ten cents in the Google roulette machine and crafted a search for responsive design best practice.

If you search for ‘best practice responsive design’ Google says there are about 7,440,000 results which took a grand total of 0.33 seconds to dig out of the dusty corners of the web.

Without reading them I’m guessing that there are probably about 1,488,000 unique and differing opinions to be had in those results about what in fact the best practice is.

And I’m being pretty generous there, allowing for one fifth of all the results to actually be something new and interesting.

Another problem I found was that a number of what I thought were reputable sites were quoting other similarly well-respected sites whose @media statements had bugs and simply didn’t work on the small range of devices I had to test on.

So, the quoted best practice was actually pretty poor practice if you used an iPhone 4S or Samsung Galaxy 3, neither of which liked the overlapping @media specs defined in some CSS that I think originally came from Smashing Magazine, although so many people quote it that I have no idea where it originated!

So, seeing as I link-baited with the title, I suppose you’re wondering what my best-practice advice is for responsive design? Here goes:

Get off your adjustable office chair and learn how CSS works, understand what @media max-width, min-width and pixel-ratio actually do, and test your site on a good sample of devices!
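
For what it’s worth, here’s a minimal sketch of the sort of thing I mean. The breakpoint widths, class names and image paths are purely illustrative rather than a recommendation; the point is that min-width and max-width select on viewport size, while pixel-ratio selects on screen density.

    /* Mobile-first base styles apply everywhere */
    .content { width: 100%; padding: 10px; }

    /* Wider viewports get a fixed-width, centred column */
    @media screen and (min-width: 768px) {
      .content { width: 720px; margin: 0 auto; }
    }

    /* Small screens only: hide the sidebar */
    @media screen and (max-width: 480px) {
      .sidebar { display: none; }
    }

    /* High-density screens: swap in a larger logo image */
    @media screen and (-webkit-min-device-pixel-ratio: 2),
           screen and (min-resolution: 192dpi) {
      .logo { background-image: url("logo@2x.png"); }
    }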

Random password generator update

Firstly, a big thanks for the feedback I’ve received on the random password generator I stuck on the site a wee while ago. It’s had quite a bit of traffic, so I’m going to assume it’s been of use to more than just myself!

I’ve fixed a minor bug where occasionally it would produce a password shorter than the length selected, which caused confusion for at least one person. To be honest I noticed it quite early on when I was testing and ignored it.

The second update is slightly more interesting. Grant from over the ditch in Australia pointed out that in the default setting of 9 characters with upper and lower case plus numbers there was often only one number in the password, where he felt there should be three on average.

And he would be correct, but I didn’t take the weighting of numbers vs letters into account when I wrote the generator. The problem being that there are 26 letters but last time I looked there were only ten numbers. The original code only used one instance of each number in the source string, so you were 2.6 times more likely to get a letter than a number, 5.2 times if you include upper and lower case.

I’ve fixed that up with a subtle update that uses 30 numeric characters in the source string, which gives relative likelihoods for upper, lower and numbers of 31.7%, 31.7% and 36.6%.
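
I won’t bore you with the actual generator code, but the weighting trick boils down to something like this rough PHP sketch; the variable names are made up for the example and it’s not the code running on the site.

    <?php
    // 26 lower + 26 upper + 30 digits (each digit repeated three times) = 82 characters,
    // so numbers come up about 30/82 = 36.6% of the time.
    $lower  = 'abcdefghijklmnopqrstuvwxyz';
    $upper  = strtoupper($lower);
    $digits = str_repeat('0123456789', 3);
    $source = $lower . $upper . $digits;

    $length   = 9;
    $password = '';
    for ($i = 0; $i < $length; $i++) {
        $password .= $source[mt_rand(0, strlen($source) - 1)];
    }
    echo $password;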

Along with that I’ve adjusted the punctuation string component to give a more even distribution of punctuation if you select ‘Full Noise’.

Underscores vs Hyphens and an Apology

If you read my blog via an RSS reader you probably noticed a few odd goings-on earlier today. I changed a few things on the site and all of the posts going back to last year appeared as new again, even if you’d read them.

Sorry ’bout that, but there was a method to my madness, or at least a method to my fiddling.

Although it’s not entirely obvious, one of the main reasons I started running this site was to mess around with search engine optimisation and try out the theories of various experts who also run a blog, but with a great deal more focus than me.

To that end, I’ve re-written the code that generates my RSS feed, and included some inline formatting to make it easier to read. Now when you read the blog from a feed reader it should look a bit more like the website, give or take. Well, more give than take.

While some of the changes were purely cosmetic, I also changed the URLs for all my blog posts.

The new URLs are the bit that caused the posts to pop up as new in at least FeedBurner and Google Reader. The change removed the dates from the URLs themselves and replaced all the underscores with hyphens.

Removing the dates was because they just looked ugly compared to the WordPress style of using directories for the year and month. I don’t use WordPress, but decided that if I was going to mess with all my URLs I might as well change to nicer-looking ones while I was at it.

If you do some searching for “hyphen vs underscore in URLs” using your favourite search engine you’ll find a bunch of writing, with the general wisdom falling on the side of hyphens. In fact as far back as 2005 Matt Cutts, a developer from Google, blogged about it. [1]

So why, you might ask, did I use underscores? Well. Ummmmm, ’cause it’s what I’ve always done is the only answer I’ve got.

A bit more searching around told me that the results, for Google at least, are apparently different between the two methods. Using underscores caused URLs to be treated as phrases, while hyphens were more likely to produce search results for individual words in the URL.

This sounded like something worthy of some experimentation, so, wearing my best white lab coat, I created some pages on a few different sites I look after which were not linked from the navigation but were listed in the XML sitemaps.

I mixed and matched the URLs with underscores and hyphens and used some misspelt words and phrases. There were a total of 48 pages, spread over 8 domains, which were all visited by Googlebot a number of times over an eight-week period.

I had a split of twelve pages with hyphens and matching content, twelve with hyphens and unmatched content, and the same split with underscores. Where the content matched I used the same misspelling of the words to get an idea of how well it worked. All six of the sites have good placement of long-tail searches for their general content and get regularly spidered.

The end result is that most of the hyphenated URL pages that did not have matching keywords in content or tags were indexed against individual words in the URL (eight out of twelve). All of the pages that had hyphenated URLs and matching keywords in the content were indexed against those words.

The pages with underscores and non-matched content didn’t fare so well. Only four out of the twelve pages got indexed against words in the URL, although nine of them were indexed against long-tail phrases from the URLs. Pages with underscores and matching content ranked lower for keywords in the URL than the hyphenated ones, although that’s not an accurate measure as they were misspelt words on pages with no backlinks.

So, end result: the common wisdom of using hyphens would appear to be valid and helpful if you’re running a site where long, keyword-rich URLs make sense, and the strength of the individual keywords might be more valuable than the phrase.

If you’re going for long-tail search results in a saturated market where single keyword rank is hard to gain, you might want to mix it up a little and try some underscores, it certainly can’t hurt to try it.

One thing to note for those not familiar with why this is even an issue. Spaces are not valid in the standard for URLs, although they are common in poorly or lazily designed websites. If you’re really bored you can read the original spec, written by Tim Berners-Lee back in 1994 [2], or the updated version from 2005, also by Mr Berners-Lee [3].

The long and short of it in this context is that you can use upper and lower case letters, numbers, hyphens, underscores, full stops and tildes (‘~’). Everything else is either reserved for a specific function, or not valid and requires encoding. A space should be encoded as ‘%20’ and you can probably imagine how well that looks when trying to%20read%20things.

If you type a URL into your browser with a space, the browser converts it to ‘%20’ before sending it down the pipe for you. You sometimes see these encoded URLs with not just spaces but other random things in them, and they can be the cause of random behaviour for some websites and software, so it’s best to avoid odd characters in your URLs.
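
If you want to see the encoding in action, PHP’s rawurlencode() does the same job the browser does; here’s a quick illustration (nothing site-specific about it):

    <?php
    // rawurlencode() applies the same percent-encoding the browser uses
    echo rawurlencode('my blog post');       // prints: my%20blog%20post
    echo "\n";
    echo rawurlencode('my-blog_post.html');  // prints: my-blog_post.html (hyphens, underscores and dots pass through untouched)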

Apologies again if you got some duplicates in your RSS reader over the last few hours. I’ll try not to do that again, and it’ll be interesting to see if a couple of pages that were being ignored by Google with underscores get indexed now.

References:

1. Matt Cutts’ blog post from 2005: http://www.mattcutts.com/blog/dashes-vs-underscores/
2. 1994 spec for URLs: http://www.ietf.org/rfc/rfc1738.txt
3. 2005 update to the URL spec: http://www.ietf.org/rfc/rfc3986.txt

JavaScript Compression with Apache 2 and Debian Etch

If you’re trying to wring every last drop of performance out of a website you probably want to compress all your content before it hits the wire. While I was messing about with another project I noticed that the JavaScript from this blog wasn’t getting compressed.

If you just want the solution to the issue, skip to the bottom of this post, but for those interested in the finer detail, read on.

This site uses Apache 2 on Etch, and after a bit of Googling I didn’t really find a direct mention of this issue, so I thought I’d slap it on here for other folks afflicted with uncompressed JavaScript.

The first step is to enable mod_deflate in the first place, which will by default compress HTML, XML, CSS and plain-text files but, due to a config issue, will not get your JavaScript.

The command to enable mod_deflate is: ‘a2enmod deflate’. If you’re on shared hosting without this configured you’ll have to drop your support folks an email, although I’d think most shared hosting companies would be well on top of the config of mod_deflate as it saves them money!

The reason JavaScript is not compressed by default is that the MIME type specified for JavaScript in /etc/apache2/mods-available/deflate.conf is ‘application/x-javascript’.

Apache doesn’t know what extension is associated with this MIME type. The MIME types used by Apache are defined in /etc/apache2/mods-available/mime.conf, which includes /etc/mime.types.

/etc/mime.types in turn has the .js extension associated with application/javascript, not x-javascript.

To fix this up you’ve got some options:

  • Change /etc/mime.types to use application/x-javascript, which might break other applications that include that file.
  • Change /etc/apache2/mods-available/deflate.conf to use application/javascript.
  • Add the MIME type ‘application/x-javascript’ to /etc/apache2/mods-available/mime.conf with the line ‘AddType application/x-javascript .js’, which is the option I took.

After adding the line, do a /etc/init.d/apache2 reload and you’re in business: all .js files leaving your server will be compressed if the browser reports that it accepts compressed content.
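
For reference, the whole fix on a stock Debian box looks roughly like this (paths as per Debian Etch, so adjust if your layout differs):

    # enable the module if you haven't already
    a2enmod deflate

    # add this line to /etc/apache2/mods-available/mime.conf
    AddType application/x-javascript .js

    # pick up the change
    /etc/init.d/apache2 reload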

You could of course pre-compress the files and serve them using .gz extensions, or control specific compression rules using a .htaccess file, but I wanted server-wide compression without needing to specifically configure sites on the server.

Google Location, the best of results, the worst of results

Google announced on their official blog a couple of days ago that location is the new black, enhancing search results by allowing the surfer to rank results ‘nearby’ or pick another location by name.

This is just a continuation of the direction online technologies have been moving, with social media leading the charge. Services like Foursquare give people their constant location fix, and Twitter has even gone local, allowing you to share your location in 140-character chunks.

Up until now the only real downside of this location-hungry trend has been the exact same thing touted as the benefit of telling the world where you are: namely, that the world knows where you are. Privacy concerns are rife as the mobile social media crowd go about their daily lives in a virtual fishbowl.

pleaserobme.com highlights this by aggregating public location information from various social networks and figuring out if your house is empty. How long before insurance companies wise up and use social media as a reason for not paying out on your house insurance? “But Mr Jones, you told the entire world you were away from your house, you encouraged the burglar.”

The last thing on earth I would want to do is share my location in real time with the world, but I was keen to experience the Google location search to see how it actually affects search results.

The impact of location-based search is going to be far more noticeable in the real world than the failed insurance claims of some iPod users.

The Google blog entry says that this is available to English google.com users, but we don’t have it here in New Zealand yet. We might have been first to see the new millennium, but not so much with Google changes.

To get my Google location fix I used a secure proxy based in the US and took in the view of the world from Colorado. Pretending to be within the 48 States is handy for all sorts of things.

I did some searches from a clean browser install on a fresh virtual machine, so that personal search preferences or history would not taint the results. I then set about testing some long-tail search phrases that give top 5 results consistently for our website at work.

No surprise that I got essentially the same results as I do here in New Zealand, but with more ads due to targeted adwords detecting that I was in the US of A. What was disturbing was that selecting ‘nearby’ knocked our search result down past the tenth page of Google.

We sell products to the whole world and do not have a geographical target, so the location search will clearly have an impact on our organic results as it rolls out. A business which is targeting a local area, such as a coffee shop or restaurant, might well benefit from the location search, assuming that Google knows where your website is.

But there’s the rub. How did Google decide our website was not near Colorado? Our webserver lives in Dallas TX, our offices are in New Zealand and Thailand, and we regularly sell products to over thirty countries.

Which leads to the impact of location for web developers and the SEO community. How do you tell Google what your ‘local’ is? I messed about with location names, and putting in ‘Christchurch’, where our business is based, got our long-tail hit back up to the front page, but only a fraction of our business comes from Christchurch, despite it being where our head office is.

I suppose anti-globalisation campaigners in their hemp shirts and sandals will be rejoicing at this news, but I’m not so sure I’m going to be celebrating this development with the same enthusiasm.

A quick search for meta tags or other methods of identifying your geographical target came up dry, and even if there were one, we can only gently suggest to Google that it index and present things the way we as website owners want.

When the dust has settled and the ‘Nearby’ link is clicked, Google are the only ones who know what the best results are. It might just be that their best became your worst if your business has a broad geographical target and weak organic placement.

Security in the cloud, KISS

The idea of keeping things simple when it comes to server security is not at all radical, and cloud servers make it possible to reach the not-so-lofty goal of keeping your servers simple and secure without breaking the bank.

The theory is simple: the fewer processes you have running on your box, the less there is to go wrong or be attacked. This is one area where Windows-based servers are immediately at a disadvantage compared to a *ix server, but I digress.

When I was pretending to be a hosting provider a few years ago I ran colocated discrete servers. They weren’t cheap to own or run, not by a long shot. That cost was a huge enemy of the KISS security concept.

In the process of trying to squeeze every last cent of value from the boxes I overloaded them with every obscure daemon and process I could think of. Subsequently the configuration of the servers became complex and difficult to manage, while applying patches became a cause of sleepless nights and caffeine abuse.

With the cost to deliver a virtual server in the cents per hour and the ability to build a new server in a matter of minutes, the barrier to building complex applications with a robust security architecture has all but vanished.

The MySQL server behind this blog site is a base install of Debian Lenny with MySQL, nullmailer, knockd and an iptables firewall script. That’s it. Simple to build, simple to configure, simple to backup and simple to manage. KISS.

A little bit of searching around on hardening up a Linux box will quickly find you information on changing the default settings for sshd and on iptables rulesets, which you can combine with small, targeted cloud servers to reduce the sleepless nights.
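
To give a flavour of what turns up in those searches, the usual suspects look something like this; it’s illustrative only, not my actual config, and the MySQL rule assumes a made-up front-end address of 10.0.0.2.

    # /etc/ssh/sshd_config -- the common tweaks
    PermitRootLogin no
    PasswordAuthentication no    # keys only

    # iptables -- default-deny inbound, allow return traffic, ssh,
    # and MySQL only from the front-end server
    iptables -P INPUT DROP
    iptables -A INPUT -i lo -j ACCEPT
    iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    iptables -A INPUT -p tcp --dport 22 -j ACCEPT
    iptables -A INPUT -p tcp -s 10.0.0.2 --dport 3306 -j ACCEPT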

I can’t help with the coffee addiction though, I’m still trying to kick that habit myself!

Working on a cloud

This blog is now coming to you from a cloud. A Rackspace cloud server, that is. Two of them in fact: the front-end server running the CMS and the back-end MySQL server.

The concept of cloud computing really isn’t all that new, but if you’re all at sea when it comes to clouds you might want to toodle over to Wikipedia and read about it there.

The service I’m using is probably better described as cloud provisioning, in that I’ve got two virtual servers living somewhere in the bowels of the Rackspace data centre. I don’t have to care about memory sizing, disk space, network infrastructure, or anything else for that matter; I’m just renting some resources out of the cloud.

I picked how much memory and disk space I wanted in a few clicks, and before the kettle had time to boil the server was online and ready for configuration. If this service had been available back when I was running a hosting business I’d probably still be running a hosting business, although I’d also be stark raving bonkers.

At this point I should say that I’m talking about virtual Linux servers here, not cloud hosting or full service shared hosting. This is the pointy end of the geek scale where crontabs are complex and the preferred editors have two letter names.

I’ve moved the blog onto the fluffy stuff to get a feeling for the service before I shift my work-in-progress link shrinker into the cloud as well. What I want to achieve with lngz.org is simply not possible on a shared platform, as I want to build a tiered application which can scale quickly.

The traditional way of achieving this goal would be to slap your gold card down on the counter of a hosting company and then proceed to the bank to arrange a second mortgage on your house. Virtualised ‘cloud’ server services such as Rackspace Cloud, Amazon EC2 or GoGrid let you do the same things for a fraction of the cost and with amazing flexibility.

note: I’m not affiliated with Rackspace, I just think they provide a nifty service. 🙂