htmlentities and character encoding

I recently came across a bug in some code I’d written: on certain people’s computers the input filters were truncating the submitted text when they hit a single quote. After a bit of googling I realised that the htmlentities filter I was using wasn’t set to the right character encoding. Everything on the site was running in UTF-8, but the default encoding for htmlentities was ISO-8859-1. (The default did later change to UTF-8, in PHP 5.4.) Whatever version you are on, if your site is encoded in UTF-8, it is safest to set your input filters explicitly to process in that encoding.

Standard PHP

$clean = htmlentities($input, ENT_QUOTES, 'UTF-8');
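To see why the charset argument matters, here is a minimal sketch that runs the same string through htmlentities with both encodings. The helper name escape_utf8 and the sample string are my own assumptions for illustration; they are not part of PHP or of the code that had the bug.

```php
<?php
// Hypothetical helper (an assumption for this sketch, not a PHP built-in):
// always escape with an explicit UTF-8 charset instead of relying on the default.
function escape_utf8($input)
{
    return htmlentities($input, ENT_QUOTES, 'UTF-8');
}

// Sample input: "\xE2\x80\x99" is the UTF-8 byte sequence for a typographic
// apostrophe; the second apostrophe is a plain ASCII single quote.
$input = "It\xE2\x80\x99s Bob's";

// Decoded as UTF-8, the three bytes are one character and become one entity.
$utf8 = escape_utf8($input);

// Decoded as ISO-8859-1, each byte is treated as its own character,
// so the first byte (0xE2) is mistaken for "â".
$latin1 = htmlentities($input, ENT_QUOTES, 'ISO-8859-1');

echo $utf8, "\n"; // It&rsquo;s Bob&#039;s
echo ($utf8 === $latin1 ? 'same' : 'different'), "\n"; // different
```

Wrapping the call in one small function like this means the encoding is set in a single place, so no individual filter can silently fall back to the default.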

Zend Framework

$encoding = array('quotestyle' => ENT_QUOTES, 'charset' => 'UTF-8');
$f = new Zend_Filter();
$f->addFilter(new Zend_Filter_HtmlEntities($encoding));
$clean = $f->filter($input);
