TagCloud in PHP
Tag clouds are used on many websites, but most of the time badly. Even though I am not their biggest fan, I will show you how to generate one in PHP from three different input formats. I am mostly doing this because I needed a tag cloud to quickly show me the search terms most often used to reach my websites, and I figured tag clouds do this fairly well.
This isn’t really a tutorial, but an example of how to code a tag cloud. Below is the full code I created (plus a few alias functions). Under the code I explain how things work, if you wish to know more about it.
<?php
/*
tagcloud(
    array(
        array('word one',1),
        array('word two',1),
        array('word 3',3),...
    ),[min font size,[max font size]]
);
*/
function tagcloud($data,$minsize=12,$maxsize=32) {
    $highestval = 0;
    $lowestval = false;
    $numinc = 0;
    $output = '';
    $s = 0;
    $items = count($data);
    // First pass: find the highest and lowest counts.
    for($i = 0; $i < $items; $i++) {
        if($data[$i][1] > $highestval) {
            $highestval = $data[$i][1];
        }
        if($lowestval === false || $data[$i][1] < $lowestval) {
            $lowestval = $data[$i][1];
        }
    }
    $numinc = ($highestval - $lowestval);
    $sizedif = ($maxsize-$minsize);
    // Second pass: scale each count into the font size range.
    for($i = 0; $i < $items; $i++) {
        if($numinc == 0) {
            // Every word has the same count; avoid dividing by zero.
            $s = $minsize;
        } else {
            $s = $data[$i][1] - $lowestval;
            $s = $s / $numinc;
            $s = $s * $sizedif;
            $s = $s + $minsize;
        }
        // Escape the word so it cannot inject markup into the page.
        $output .= '<span style="font-size:'.$s.'px">'.htmlspecialchars($data[$i][0]).'</span> ';
    }
    return $output;
}
/* tagcloud_wordarray(array('a','b','c',...),[min font size,[max font size]]) */
function tagcloud_wordarray($words,$minsize=12,$maxsize=32) {
    $array_counts = array_count_values($words);
    $tagarray = array();
    foreach($array_counts as $k=>$v) {
        $tagarray[] = array($k,$v);
    }
    return tagcloud($tagarray,$minsize,$maxsize);
}
/* tagcloud_string("a b c ...",[min font size,[max font size]]) */
function tagcloud_string($words,$minsize=12,$maxsize=32) {
    $words = strtolower($words);
    $words = str_replace(array('.',',','"'),'',$words);
    $words = strip_tags($words);
    $words = explode(' ',$words);
    // Repeated spaces leave empty strings behind; drop them so they are not counted as a word.
    $words = array_filter($words,'strlen');
    return tagcloud_wordarray($words,$minsize,$maxsize);
}
?>
The functions tagcloud_wordarray and tagcloud_string accept the data in different formats and generate the tag cloud by calling the tagcloud function. Both still take the minsize and maxsize font settings and simply pass them along to the final function.
In tagcloud_string, we first lowercase the text and strip the punctuation from it, since “lorem.” and “lorem” should be counted as the same word. The punctuation is removed by running str_replace. After that, assuming some users may have HTML in their strings, we strip that out using the PHP function strip_tags. Now that the paragraph or string of words is cleaned up, we can split it into an array using explode with a space as the delimiter. Lastly, we return the value of tagcloud_wordarray, called with the new word array we just generated.
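To see what the cleanup leaves behind, the steps described above can be traced on a short string. This is only a sketch: clean_words is a hypothetical helper name, not part of the code above, and the sample sentence is made up.

```php
<?php
// Hypothetical helper: the cleanup steps described above, returning
// the word array instead of building the cloud.
function clean_words($text) {
    $text = strtolower($text);                             // "Lorem" and "lorem" count as one word
    $text = str_replace(array('.', ',', '"'), '', $text);  // drop the punctuation
    $text = strip_tags($text);                             // drop any HTML tags
    return explode(' ', $text);                            // split on single spaces
}

$words = clean_words('Lorem ipsum, <b>lorem</b> dolor.');
print_r($words);
```

The resulting array holds lorem, ipsum, lorem and dolor, so “lorem” will later be counted twice.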
The function tagcloud_wordarray takes an array of words, counts the instances of each word and passes those values on to tagcloud for the final calculations. It doesn’t clean up punctuation, tags or case: if the words are already in an array, we assume they are already properly formatted. The first function call is array_count_values, which counts the occurrences of each word. After that, we have to turn the associative array into a numeric array. A foreach loop grabs each key and value and inserts them as a pair into the final array, tagarray. Once that’s done, we return the result of tagcloud, called with our new array of data, and just pass through the minsize and maxsize for the fonts.
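The conversion from counts to pairs is easier to follow with real values. Here is a small sketch of just that step, using a made-up word list; the variable names $counts and $pairs are mine, chosen for the illustration.

```php
<?php
// Made-up sample input: a plain list of words, repeats included.
$words = array('lorem', 'ipsum', 'lorem', 'dolor', 'lorem');

// array_count_values gives an associative array: word => count.
$counts = array_count_values($words);

// Rebuild it as a numeric array of array(word, count) pairs.
$pairs = array();
foreach ($counts as $word => $count) {
    $pairs[] = array($word, $count);
}

print_r($pairs);
```

Here $pairs ends up with three entries: lorem with a count of 3, then ipsum and dolor with a count of 1 each.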
The final and most important function is tagcloud. This was the original function I coded for my needs. The data argument is a numeric array in which each value is another array holding the word in value 0 and its appearance rate (or just word count) in value 1. Below is an example of the data layout required.
Array(
[0] => Array
(
[0] => lorem
[1] => 3
)
[1] => Array
(
[0] => ipsum
[1] => 8
)
[2] => Array
(
[0] => foo
[1] => 2
)
[3] => Array
(
[0] => bar
[1] => 4
)
)
We loop through each word, getting the numeric value assigned to it. We check if it’s higher than our highest value or lower than our lowest value. If either is true, that value is updated. Both values are used in the equation below.
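The same two values can also be read off with PHP’s built-in functions. This is an alternative sketch, not the code used in my function; it assumes PHP 5.5 or later for array_column, and the data is the sample layout shown earlier.

```php
<?php
// Sample data in the layout shown earlier: array(word, count) pairs.
$data = array(
    array('lorem', 3),
    array('ipsum', 8),
    array('foo', 2),
    array('bar', 4),
);

// Pull out value 1 of every pair (the counts), then take the extremes.
$counts = array_column($data, 1);
$highestval = max($counts);
$lowestval = min($counts);

echo $lowestval . ' ' . $highestval . "\n";
```

For this data the lowest value is 2 and the highest is 8.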
Next we calculate the difference between the highest and lowest occurrence, along with the difference between the font sizes we passed in the function arguments. We loop through the array once again, this time calculating the font size for each word and appending its string to the output variable. The size is calculated using the equation below.
((((x - j) / k) * d) + m)
x = occurrence of word
j = lowest value
k = difference between highest and lowest value
d = difference between font size values
m = minimum font size
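As a worked example, here is the equation applied to the sample data from earlier with the default font sizes of 12 and 32. The helper name font_size_for is hypothetical, and the check for k being zero is an assumption of mine about what should happen when every word has the same count.

```php
<?php
// Hypothetical helper: the equation above for a single word.
// $x = occurrences, $j = lowest count, $k = highest minus lowest,
// $d = max font size minus min font size, $m = min font size.
function font_size_for($x, $j, $k, $d, $m) {
    if ($k == 0) {
        return $m; // assumption: when every count is equal, use the minimum size
    }
    return ((($x - $j) / $k) * $d) + $m;
}

// Sample data: lowest count 2, highest 8, font sizes 12 to 32.
$j = 2;
$k = 8 - 2;
$d = 32 - 12;
$m = 12;

foreach (array('lorem' => 3, 'ipsum' => 8, 'foo' => 2, 'bar' => 4) as $word => $x) {
    echo $word . ': ' . round(font_size_for($x, $j, $k, $d, $m), 1) . "px\n";
}
```

The most used word, ipsum, comes out at 32px and the least used, foo, at 12px, while lorem and bar land at roughly 15.3px and 18.7px.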
We then finish up by returning the output variable with all the strings appended to it.