How do I read / download webpage content?

You want to create a program that read a webpage content of a website page. The example below use the URL class to create a connection to the website. You create a new URL object and pass the url information of a page. After the object created you can open a stream connection using the openStream() method of the URL object.

Next, you can read the stream using the BufferedReader object. This reader allows you to read line by line from the stream. To write it to a file create a writer using the BufferedWriter object and specify the file name where the downloaded page will be stored.

When all the content are read from the stream and stored in a file close the BufferedReader object and the BufferedWriter object at the end of your program.

package org.kodejava.example.net;

import java.io.*;
import java.net.MalformedURLException;
import java.net.URL;

public class UrlReadPageDemo {
    public static void main(String[] args) {
        try {
            URL url = new URL("http://www.kodejava.org");

            BufferedReader reader = new BufferedReader(new InputStreamReader(url.openStream()));
            BufferedWriter writer = new BufferedWriter(new FileWriter("data.html"));

            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
                writer.write(line);
                writer.newLine();
            }

            reader.close();
            writer.close();
        } catch (MalformedURLException e) {
            e.printStackTrace();
        }  catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Wayan

Programmer, runner, recreational diver, live in the island of Bali, Indonesia. Mostly programming in Java, creating web based application with Spring Framework, Hibernate / JPA. Support me by donating >> here <<.

Leave a Reply