Difference between revisions of "Remove non-letters from a string"

From CodeCodex

m (Reverted edit of 218.58.136.4, changed back to last version by Nostromo)

Revision as of 02:50, 3 September 2007


Implementations

Java

This code removes digits and common symbols from a string, leaving only the letters. (Note: the character class does not list every symbol, so any character it omits is left in place.)

static String lettersOnly(String s) {
   return s.replaceAll("[0-9]|[ !@#\\$%\\^&\\*\\(\\)_\\+\\-={}\\|:\"<>\\?\\-=\\[\\];',\\./`~'£€¥]","");
}

OCaml

# #load "str.cma";;
# let remove_nonalpha = Str.global_replace (Str.regexp "[^a-zA-Z]+") "";;
val remove_nonalpha : string -> string = <fun>

For example:

# remove_nonalpha "133t H4x0r";;
- : string = "tHxr"

Perl

s{[\W\d_]}{}g; # operates on $_: removes all non-word characters, digits, and underscores

Python

from string import ascii_letters
s = "hello world! how are you? 0"

# Short version
print("".join(c for c in s if c in ascii_letters))

# Faster version for long ASCII strings:
# build a table that deletes every non-letter character below code point 256
tostrip = "".join(chr(i) for i in range(256) if chr(i) not in ascii_letters)
print(s.translate(str.maketrans("", "", tostrip)))
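
The entries on this page differ in what they count as a letter: the OCaml and Tcl versions keep only a-z and A-Z, so an accented letter such as "é" is removed. The following is a minimal sketch of that distinction, assuming Python 3. The function names ascii_letters_only and unicode_letters_only are hypothetical and chosen only for illustration.

```python
import re


def ascii_letters_only(s: str) -> str:
    """Keep only A-Z and a-z; accented letters such as 'é' are dropped."""
    return re.sub(r"[^A-Za-z]", "", s)


def unicode_letters_only(s: str) -> str:
    """Keep every character that str.isalpha() treats as a letter."""
    return "".join(c for c in s if c.isalpha())


# Same test string as the Tcl entry; "\u00e9" is a precomposed 'é'.
sample = "hello world! how are you? 0\u00e9"
print(ascii_letters_only(sample))    # helloworldhowareyou
print(unicode_letters_only(sample))  # helloworldhowareyoué
```

Which variant is appropriate depends on the input: the ASCII version is predictable for identifiers and similar data, while the isalpha version is likely the better fit for natural-language text.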

Tcl

proc remove_non_ascii_letters s {regsub -all {[^a-zA-Z]} $s ""}
remove_non_ascii_letters "hello world! how are you? 0é" ;# -> helloworldhowareyou