Escape HTML Specials
From CodeCodex
In HTML, “&” is special because it starts entity references, and “<” is special because it starts tags. An unpaired “>” is not special, but is escaped to be safe. The HTML 4 specification allows attribute values to be quoted with either “'” or “"”. The functions below escape only “"”, so their output is safe inside double-quoted attribute values but not inside single-quoted ones.
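Because either quote character may delimit an attribute value, it can help to escape the single quote as well. The following is a minimal JavaScript sketch of that idea; the helper name escapeForAttribute and the sample string are hypothetical and do not come from the original page.

```javascript
// Hypothetical helper (not part of the original page): escapes the four
// characters discussed above plus the single quote, so the result can be
// placed inside an attribute value quoted with either ' or ".
function escapeForAttribute(str) {
  const entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;", // numeric reference; HTML 4 does not define &apos;
  };
  // One pass over the input, so an "&" produced by a replacement
  // is never escaped a second time.
  return String(str).replace(/[&<>"']/g, (ch) => entities[ch]);
}

console.log("<a title='" + escapeForAttribute("Tom's \"R&D\" <notes>") + "'>");
// prints: <a title='Tom&#39;s &quot;R&amp;D&quot; &lt;notes&gt;'>
```

The single quote is written as a numeric character reference because the named form is only guaranteed in XML-based dialects.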
C++
A quick and dirty port of the JavaScript version:
#include <string>

std::string EscapeHTML(const std::string & Str)
/* returns Str with all characters with special HTML meanings converted to
entity references. */
{
    std::string Escaped;
    for (std::string::size_type i = 0; i < Str.size(); ++i)
    {
        std::string ThisCh = Str.substr(i, 1);
        if (ThisCh == "&")
            ThisCh = "&amp;";
        else if (ThisCh == "<")
            ThisCh = "&lt;";
        else if (ThisCh == "\"")
            ThisCh = "&quot;";
        else if (ThisCh == ">")
            ThisCh = "&gt;";
        Escaped += ThisCh;
    } /*for*/
    return Escaped;
} /*EscapeHTML*/
JavaScript
Surprisingly, there is no built-in JavaScript function for doing this.
function EscapeHTML(Str)
/* returns Str with all characters with special HTML meanings converted to
entity references. */
{
    var Escaped = ""
    for (var i = 0; i < Str.length; ++i)
    {
        var ThisCh = Str.charAt(i)
        if (ThisCh == "&")
        {
            ThisCh = "&amp;"
        }
        else if (ThisCh == "<")
        {
            ThisCh = "&lt;"
        }
        else if (ThisCh == "\"")
        {
            ThisCh = "&quot;"
        }
        else if (ThisCh == ">")
        {
            ThisCh = "&gt;"
        } /*if*/
        Escaped += ThisCh
    } /*for*/
    return Escaped
} /*EscapeHTML*/
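The same four substitutions can also be written as chained replace calls, in which case the order matters: “&” has to be handled first, or the ampersands introduced by the other replacements get escaped again. A small sketch, where the function name escapeHTMLChained is an illustrative assumption rather than anything from the original page:

```javascript
// Hypothetical variant of the loop above, written as chained replacements.
function escapeHTMLChained(str) {
  return String(str)
    .replace(/&/g, "&amp;")  // must run first
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

console.log(escapeHTMLChained('"bread" & "butter"'));
// prints: &quot;bread&quot; &amp; &quot;butter&quot;

// Wrong order: the "&" of the freshly inserted "&lt;" is escaped again.
const doubleEscaped = "<".replace(/</g, "&lt;").replace(/&/g, "&amp;");
console.log(doubleEscaped);
// prints: &amp;lt;
```

The character-by-character loop does not have this pitfall, since each input character is examined exactly once.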
Java
In Java, you can use the Apache Commons Lang class org.apache.commons.lang.StringEscapeUtils (in Commons Lang 3 the equivalent method is named escapeHtml4).
String test = StringEscapeUtils.escapeHtml("\"bread\" & \"butter\"");
System.out.println(test);
In the above example, the output will be:
&quot;bread&quot; &amp; &quot;butter&quot;
Perl
use HTML::Entities qw(encode_entities);
encode_entities $s, q{<>&"};  # in void context, $s is modified in place

