The following example demonstrates how to fetch a web page using the URL() constructor of the java.net.URL class:
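import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.InputStreamReader;
import java.net.URL;

public class Main {
    public static void main(String[] args) throws Exception {
        // Placeholder address (the source omits the URL); replace it with the page to fetch.
        URL url = new URL("http://www.example.com");
        // Read the response body line by line.
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(url.openStream()));
        // Save a copy of the page to data.html in the current directory.
        BufferedWriter writer = new BufferedWriter(
                new FileWriter("data.html"));
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line);   // echo to the console
            writer.write(line);         // and append to the file
            writer.newLine();
        }
        reader.close();
        writer.close();
    }
}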
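Running the code above prints the source code of the web page and also saves it to the file data.html in the current directory. The output begins as follows: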
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=11,IE=10,IE=9,IE=8" />
……
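The example leaves the streams open if an exception is thrown mid-read, and it relies on the platform's default character encoding. The variant below is a minimal sketch of one way to handle both, not the only approach. It assumes the page is encoded in UTF-8 and takes the address from the first command-line argument. The class name PageSaver and the helper copyLines are hypothetical names introduced for illustration. Unlike the example above, it does not echo the page to the console.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.Writer;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class PageSaver {

    // Hypothetical helper: copies text line by line from in to out
    // and returns the number of lines copied.
    static int copyLines(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            out.write(line);
            out.write(System.lineSeparator());
            count++;
        }
        out.flush();
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java PageSaver <url>");
            return;
        }
        URL url = new URL(args[0]);
        // try-with-resources closes both streams even if reading fails.
        // Assumption: the page is UTF-8 encoded.
        try (Reader in = new InputStreamReader(url.openStream(), StandardCharsets.UTF_8);
             Writer out = Files.newBufferedWriter(Paths.get("data.html"), StandardCharsets.UTF_8)) {
            int lines = copyLines(in, out);
            System.out.println("Saved " + lines + " lines to data.html");
        }
    }
}
```

Keeping the copy loop in its own method means it can be exercised with in-memory readers and writers, without a network connection.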