If you cut an HTML-formatted string at an arbitrary position (e.g. with my truncate()
function), you might break the markup. To avoid that, this function closes all open tags at the end of the string.
The original function was written by connum at DONOTSPAMME dot googlemail dot com:
/**
* close all open xhtml tags at the end of the string
*
* @param string $html
* @return string
* @author Milian Wolff <mail@milianw.de>
*/
function closetags($html) {
    # put all opened tags into an array (self-closing tags like <br /> are skipped);
    # tag names may contain digits, e.g. h1
    preg_match_all('#<([a-z][a-z0-9]*)(?: .*)?(?<![/|/ ])>#iU', $html, $result);
    $openedtags = $result[1];
    # put all closed tags into an array
    preg_match_all('#</([a-z][a-z0-9]*)>#iU', $html, $result);
    $closedtags = $result[1];
    $len_opened = count($openedtags);
    # all tags are closed
    if (count($closedtags) == $len_opened) {
        return $html;
    }
    # walk the opened tags from the innermost to the outermost
    $openedtags = array_reverse($openedtags);
    # close tags
    for ($i = 0; $i < $len_opened; $i++) {
        if (!in_array($openedtags[$i], $closedtags)) {
            $html .= '</'.$openedtags[$i].'>';
        } else {
            # this opener already has a closer: use it up
            unset($closedtags[array_search($openedtags[$i], $closedtags)]);
        }
    }
    return $html;
}
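To see what the two regular expressions collect, here is a small standalone sketch. The sample string is made up, and the patterns are a variant that assumes tag names may contain digits (so that h1 is picked up); treat it as an illustration rather than part of the function.

```php
<?php
// Sketch: which tag names do the two patterns collect from a truncated snippet?
// Assumption: tag names may contain digits (h1 ... h6), hence [a-z][a-z0-9]*.
$html = '<h1>Title <b>bold<br /> text';

// Opening tags; the lookbehind rejects self-closing tags such as <br />.
preg_match_all('#<([a-z][a-z0-9]*)(?: .*)?(?<![/|/ ])>#iU', $html, $result);
$opened = $result[1];

// Closing tags.
preg_match_all('#</([a-z][a-z0-9]*)>#iU', $html, $result);
$closed = $result[1];

print_r($opened); // h1, b
print_r($closed); // empty: nothing has been closed yet
```

Every collected opener that has no matching entry in the list of closers gets a closing tag appended, innermost first.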
If you use UTF-8 in your PHP projects you may want to use [wordwrap](http://www.php.net/wordwrap)(). But that function can’t handle multibyte characters and may mess up your text.
Don’t be annoyed - help is near!
The only PHP UTF-8 wordwrap function I found was the one by tjomi4 at yeap dot lv in the notes of the PHP manual. I took it and improved it a bit:
- Exactly the same syntax as the original wordwrap function:
string utf8_wordwrap(string $str, integer $width, string $break [, bool $cut]);
The $cut parameter is supported (tjomi4’s function only supports $cut = true).
But be careful: I use regular expression word boundaries (\b) for this feature. I’m not sure if this works everywhere!
- The function uses the multibyte extension, if installed, for counting the string length.
- The regular expression inside the while loop is shorter and uses [preg_match](http://www.php.net/preg_match)() instead of [preg_replace](http://www.php.net/preg_replace)(). That should improve performance and prevent a strange bug (Compilation failed: regular expression too large).
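To give an idea of what such a function can look like, here is a minimal sketch with the same parameter order. It is not the function described above: the names utf8_strlen_sketch() and utf8_wordwrap_sketch() are hypothetical, it splits on spaces instead of using \b, and it ignores line breaks already present in the input.

```php
<?php
// Hypothetical helper: count characters, not bytes.
function utf8_strlen_sketch($s) {
    // Prefer the multibyte extension; otherwise count code points with PCRE's /u mode.
    return function_exists('mb_strlen') ? mb_strlen($s, 'UTF-8') : preg_match_all('/./us', $s, $m);
}

// Hypothetical sketch of a UTF-8 aware word wrap (greedy, space-separated words).
function utf8_wordwrap_sketch($str, $width = 75, $break = "\n", $cut = false) {
    $lines = array();
    $line = '';
    foreach (explode(' ', $str) as $word) {
        // With $cut, split over-long words into chunks of at most $width characters.
        $chunks = ($cut && utf8_strlen_sketch($word) > $width)
            ? preg_split('/(.{' . $width . '})/us', $word, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY)
            : array($word);
        foreach ($chunks as $chunk) {
            if ($line === '') {
                $line = $chunk;
            } elseif (utf8_strlen_sketch($line . ' ' . $chunk) <= $width) {
                $line .= ' ' . $chunk;
            } else {
                // The chunk does not fit any more: start a new line.
                $lines[] = $line;
                $line = $chunk;
            }
        }
    }
    $lines[] = $line;
    return implode($break, $lines);
}

echo utf8_wordwrap_sketch('äöü ñandú größer', 10, "\n"), "\n";
echo utf8_wordwrap_sketch('ääääää', 4, "\n", true), "\n";
```

The first call keeps "äöü ñandú" on one line because it is 9 characters long, although it takes more than 10 bytes.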
And here is another syntax file for Nano. This time it highlights /etc/apt/sources.list:
## syntax highlighting for /etc/apt/sources.list
syntax "apt/sources.list" "sources\.list(\.old|~)?$"
# component
color brightmagenta "^deb(-src)? ((http|file|ftp):/[^ ]+|cdrom:\[[^\]]+\]/|cdrom:\[[a-zA-Z0-9\._-\(\) ]+\]/) [^ ]+ .+$"
# distribution
color brightred "^deb(-src)? ((http|file|ftp):/[^ ]+|cdrom:\[[^\]]+\]/|cdrom:\[[a-zA-Z0-9\._-\(\) ]+\]/) [^ ]+"
# URI
color brightgreen "(http|file|ftp):/[^ ]+"
# cdroms
# [^\]] does not work…
color brightgreen "cdrom:\[[a-zA-Z0-9\._-\(\) ]+\]/"
# deb / deb-src
color cyan "^deb"
color brightblue "^deb-src"
# comments
color brightyellow "#.*"
If you write a CMS you will have to truncate your contents to automagically create summaries. The following function will do the job:
If you put a given marker string (default '<!--MORE-->') inside one of your contents, the text is truncated at that position.
Otherwise the function looks for the nearest word boundary after at least $len characters and cuts there. Because that cut may fall in the middle of your text, $append is appended. To keep the markup from being messed up, closetags() is called.
/**
* if $splitter is found inside $str, everything before $splitter will be
* returned.
* else truncates $str after $len chars (actually at the nearest
* word boundary after at least $len characters). Also $append will be added to
* $str
*
* @param string $str
* @param integer $len
* @param optional string $splitter
* @param optional string $append
* @return string
* @author Milian Wolff <mail@milianw.de>
*/
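function truncate($str, $len = 200, $splitter = '<!--MORE-->', $append = '…') {
    # an explicit splitter always wins: return everything before it
    if (strpos($str, $splitter) !== false) {
        $arr = explode($splitter, $str, 2);
        return $arr[0];
    }
    # nothing to do for short strings (strlen counts bytes, so this is a cheap pre-check)
    if ($len <= 0 || strlen($str) <= $len) {
        return $str;
    }
    # match at least $len UTF-8 characters, up to the next word boundary
    if (!preg_match('#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){'.$len.',}\b#U', $str, $matches)) {
        # fewer than $len characters or no boundary found: leave the string alone
        return $str;
    }
    # remove trailing opener tags and close all other open tags:
    return closetags(preg_replace('#\s*<[^>]+>?\s*$#', '', $matches[0]).$append);
}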
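The word-boundary cut can be tried on its own. The following sketch uses the same pattern as above with a made-up sample text and $len, and shows that the pattern counts characters rather than bytes.

```php
<?php
// Sketch of the word-boundary cut used for truncation; sample text and $len are made up.
$text = 'Grüße aus Hamburg und Bremen';
$len  = 8;

// Each repetition consumes one UTF-8 character: a single ASCII byte, or a lead
// byte followed by continuation bytes. U makes the quantifier lazy, so the match
// stops at the first word boundary after at least $len characters.
$pattern = '#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){' . $len . ',}\b#U';

// Fall back to the whole text if the pattern does not match.
$cut = preg_match($pattern, $text, $matches) ? $matches[0] : $text;
echo $cut, "\n"; // Grüße aus
```

The eighth character is the "u" of "aus", which is in the middle of a word, so the match is extended to the end of that word.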