Drowned World

Same kind of moon, same kind of jungle.

htmlentities and character encoding

I recently came across a bug in some code I’d written: on certain people’s computers the input filters were behaving strangely and truncating the submitted text at a single quote. After a bit of googling I realised that the htmlentities filter I was using wasn’t set to the right character encoding. Everything on the site was running in UTF-8, but in PHP versions before 5.4 htmlentities defaults to ISO-8859-1. PHP 5.4 changed that default to UTF-8, but if you’re on an older version, or simply want to be explicit, you’ll need to set your input filters to process in UTF-8 yourself.

Standard PHP

$clean = htmlentities($input, ENT_QUOTES, 'UTF-8');
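To see why the third argument matters, here is a minimal sketch that runs the same UTF-8 string through htmlentities with each charset. The sample string and the escape_utf8 helper are hypothetical and not part of the original code. It is only an illustration of the mismatch, not a reproduction of the exact bug described above.

```php
<?php
// Hypothetical sample input: UTF-8 text containing "é" (bytes C3 A9) and single quotes.
$input = "caf\xC3\xA9 'au lait'";

// Wrong charset: each byte of the two-byte "é" is treated as a separate Latin-1 character.
$asLatin1 = htmlentities($input, ENT_QUOTES, 'ISO-8859-1');

// Matching charset: "é" is recognised as a single character.
$asUtf8 = htmlentities($input, ENT_QUOTES, 'UTF-8');

// Hypothetical helper (an assumption, not from the original post) that keeps the charset in one place.
function escape_utf8($text)
{
    return htmlentities($text, ENT_QUOTES, 'UTF-8');
}

echo $asLatin1, "\n"; // caf&Atilde;&copy; &#039;au lait&#039;
echo $asUtf8, "\n";   // caf&eacute; &#039;au lait&#039;
```

Wrapping the call in one small function like this means the encoding only has to be changed in a single place if the site’s charset ever changes.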

Zend Framework

$encoding = array('quotestyle' => ENT_QUOTES, 'charset' => 'UTF-8');
$f = new Zend_Filter();
$f->addFilter(new Zend_Filter_HtmlEntities($encoding));
$clean = $f->filter($input);

2 Responses

  1. you have just saved my :) and a total nerve collapse:)

    thanks

  2. problem with Zend_Filter_HtmlEntities - Zend Framework Forum
