Retrieve the text of a URL

From CodeCodex

Latest revision as of 18:43, 15 February 2011

Implementations

Java

This simple class downloads the HTML of a page from a URL and keeps it in memory.

import java.net.URL;
import java.net.HttpURLConnection;
import java.io.IOException;
import java.io.BufferedInputStream;
import java.io.InputStreamReader;
import java.io.Reader;

/**
 * Represents a webpage
 * @author Julius Schorzman
 * (c)2005 - provided as GPL
 */
public class Webpage {
	
	private StringBuilder html = new StringBuilder();
	
	/**
	 * Downloads the HTML of a webpage.
	 * @param url the address of the page to download
	 * @throws IOException if the connection cannot be opened or the read fails
	 */
	public Webpage(URL url) throws IOException {
	    
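	    HttpURLConnection c = (HttpURLConnection)url.openConnection();
	    // try-with-resources closes the stream even if a read fails.
	    // UTF-8 is an assumption; the original relied on the platform default charset.
	    try (Reader r = new InputStreamReader(
	            new BufferedInputStream(c.getInputStream()), "UTF-8")) {
	        int i;
	        while ((i = r.read()) != -1) {
	            html.append((char) i);
	        }
	    } finally {
	        c.disconnect();
	    }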
	    
	    html.trimToSize();
	}
	
	/**
	 * Returns the html of this page as a String.
	 * @return The html
	 */
	public String getHtml() {
		return html.toString();
	}
}

OCaml

Http_client.Convenience.http_get "http://..."

Perl

use LWP::Simple qw(get);
my $content = get 'http://example.com';

Python

# Python 3; in Python 2 the import was: from urllib2 import urlopen
from urllib.request import urlopen
print(urlopen('http://example.com').read().decode('utf-8'))
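The one-liner above leaves the connection to be closed by the garbage collector and does not choose a text encoding from the response. A slightly more defensive variant might look like the sketch below. The helper name `fetch_text`, its `timeout` and `default_charset` parameters, and the UTF-8 fallback are assumptions made for illustration and are not part of the original snippet.

```python
from urllib.request import urlopen


def fetch_text(url, timeout=10, default_charset="utf-8"):
    """Return the body of `url` as a str.

    Hypothetical helper for illustration; not part of the original page.
    """
    # The context manager closes the connection even if read() raises.
    with urlopen(url, timeout=timeout) as response:
        raw = response.read()
        # Prefer the charset named in the Content-Type header, if any;
        # otherwise fall back to the assumed default.
        charset = response.headers.get_content_charset() or default_charset
    # Undecodable bytes are replaced rather than raising an exception.
    return raw.decode(charset, errors="replace")
```

Calling `fetch_text('http://example.com')` returns the page as a string. Network failures still surface as `urllib.error.URLError`, which the caller can catch.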

PHP

$text = file_get_contents('http://example.com');

Tcl

package require http
puts [http::data [http::geturl http://www.codecodex.com/wiki/Main_Page]]