Thursday, April 16, 2009

Advanced text parsing

Ok, here's my text parsing function as it exists right now:


Code:
---------
function bbc2html($data) {
// Emoticons
$data=str_replace(":)","",$data);
$data=str_replace(":d","",$data);
$data=str_replace(":D","",$data);
$data=str_replace(":->:","",$data);
$data=str_replace(":arrow:","",$data);
$data=str_replace(":s","",$data);
$data=str_replace(":S","",$data);
$data=str_replace("8)","",$data);
$data=str_replace("8-)","",$data);
$data=str_replace(":cool:","",$data);
$data=str_replace(":'(","",$data);
$data=str_replace("8|","",$data);
$data=str_replace("(6)","",$data);
$data=str_replace(":evil:","",$data);
$data=str_replace(":!:","",$data);
$data=str_replace(":lol:","",$data);
$data=str_replace(":@","",$data);
$data=str_replace(":x","",$data);
$data=str_replace(":X","",$data);
$data=str_replace(":mad:","",$data);
$data=str_replace(":mrgreen:","",$data);
$data=str_replace(":|","",$data);
$data=str_replace(":?:","",$data);
$data=str_replace(":$","",$data);
$data=str_replace(":oops:","",$data);
$data=str_replace(":redface:","",$data);
$data=str_replace(":rolleyes:","",$data);
$data=str_replace(":roll:","",$data);
$data=str_replace(":(","",$data);
$data=str_replace(":-(","",$data);
$data=str_replace(":-)","",$data);
$data=str_replace(":o","",$data);
$data=str_replace(":O","",$data);
$data=str_replace(":twisted:","",$data);
$data=str_replace(";)","",$data);
$data=str_replace(";-)","",$data);

// HTML Conversion
$data=str_replace("[s]","",$data);
$data=str_replace("[/s]","
",$data);
$data=str_replace("*","",$data);
$data=str_replace("*","
",$data);
$data=str_replace("","",$data);
$data=str_replace("","
",$data);
$data=str_replace("_","",$data);
$data=str_replace("_","
",$data);
$data=str_replace("","
",$data);
$data=str_replace("
","
",$data);
$data=str_replace("1/2","½",$data);
$data=str_replace("3/4","¾",$data);
$data=str_replace("1/4","¼",$data);
$data=str_replace("[sup]","",$data);
$data=str_replace("[/sup]","
",$data);
$data=str_replace("[sub]","",$data);
$data=str_replace("[/sub]","
",$data);
$data=str_replace("
* ","
    ",$data);
    $data=str_replace("
    * ","
  • ",$data);
    $data=str_replace("

    ","
",$data);
$data=str_replace("[#list]","
    ",$data);
    $data=str_replace("[/#list]","
",$data);
// $data=str_replace("
---Quote---
","
Quote...
",$data);
// $data=str_replace("
---End Quote---
","
",$data);
$data=str_replace("
Code:
---------
","",$data);
$data=str_replace("
---------
","
",$data);
$data=str_replace("[slide]","",$data);
$data=str_replace("[/slide]","",$data);
$data=str_replace("[hr]","
",$data);
$data=str_replace("[hr=","
$data=str_replace("[l]","<",$data);
$data=str_replace("[r]",">",$data);
$data=str_replace("file://","",$data);
$data=str_replace("ftp://","",$data);

return $data;
}
---------
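As an aside on the function above: the emoticon section could probably be driven by a single lookup table instead of one str_replace() call per code. The following is only a minimal sketch under assumptions. The function name replace_emoticons() and the image file names are hypothetical placeholders, since the original replacement markup is not shown here. It uses strtr() with an array, which tries the longest keys first and does not re-scan text it has already replaced.

```php
<?php
// Sketch only: the image file names below are placeholders, not the original markup.
function replace_emoticons($data) {
    $map = array(
        ':)'     => '<img src="smile.gif" alt="smile">',
        ':-)'    => '<img src="smile.gif" alt="smile">',
        ':('     => '<img src="sad.gif" alt="sad">',
        ':-('    => '<img src="sad.gif" alt="sad">',
        ':cool:' => '<img src="cool.gif" alt="cool">',
    );
    // strtr() with an array matches the longest key first and never
    // re-scans text it has already replaced.
    return strtr($data, $map);
}

echo replace_emoticons("Nice :) but :-( too");
```

Because replaced text is not scanned again, a code that happens to appear inside an inserted tag is left alone, which a chain of separate str_replace() calls does not guarantee.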
The quoting rules are commented out because they were acting weird, but that's a subject for another day. The website I'm working on needs internal links between several sections. I can get basic results by doing something similar to the following:


Code:
---------
$data = "[[sample]]";

$data = str_replace("[[sample]]", "<a href=\"./wordinfo.php?id=sample\">hello!</a>", $data);
---------
And it would produce a result akin to a link reading "hello!" that points to ./wordinfo.php?id=sample. The trouble is, I'd like to do more advanced string manipulation so that I can use the text inside the brackets as the title of the link. I experimented last night with using strstr() to check whether there were any links and explode() to split the text into chunks, but it didn't work and got scrapped. The code for that was akin to the following:


Code:
---------
if (strstr($data, "[[")) {
$tempdata = explode('[[', $data, 2); // limiting it to two chunks: before the link and the stuff after it

if ($tempdata[0] != '[[') { // finding the end of the link
$tempdata2 = explode (']]', $tempdata[0], 2);
$link = "$tempdata2[0]";

$data=str_replace("[[$tempdata[0]]]","$link",$data);
} else {
$tempdata2 = explode ("]]", $tempdata[1], 3);
$link = "$tempdata2[0]";
$data=str_replace("[[$tempdata[1]]]","$link",$data);
}
}
---------
Seems like it should work, but all it's returning is the plain text of the link. Any ideas on better approaches or how to fix what I've got going right now?
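
One likely reason for the plain-text result: explode('[[', $data, 2) puts the text before the first "[[" into $tempdata[0], so the test $tempdata[0] != '[[' is almost always true, and the str_replace() call then searches for "[[" plus that leading text plus "]]", which never occurs in $data. A regular expression may be a simpler route. The sketch below is only one possible approach, not a drop-in fix. The function name link_internal() is hypothetical, the optional [[target|Title]] form is an assumed extension, and the ./wordinfo.php?id= URL is taken from the example above. It uses preg_replace_callback() with an anonymous function, which needs PHP 5.3 or later.

```php
<?php
// Hypothetical helper: turns [[target]] into a link titled "target".
// The optional [[target|Title]] form is an assumption, not part of the original post.
function link_internal($data) {
    return preg_replace_callback(
        '/\[\[([^\]|]+)(?:\|([^\]]+))?\]\]/',
        function ($m) {
            $target = trim($m[1]);
            // Use the explicit title when one is given, otherwise the target itself.
            $title = (isset($m[2]) && $m[2] !== '') ? trim($m[2]) : $target;
            return '<a href="./wordinfo.php?id=' . rawurlencode($target) . '">'
                 . htmlspecialchars($title) . '</a>';
        },
        $data
    );
}

echo link_internal("See [[sample]] and [[sample|the sample page]].");
```

Unlike the strstr()/explode() attempt, this handles every [[...]] in the text in one pass, and text without any brackets comes back unchanged.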