PHP: How to download a webpage (aka web scraping) with PHP
Posted on 03. Oct, 2009 by Dragos in Coding, PHP
There are many ways of downloading web pages or other web content. Personally I like to use cURL for my web scraping needs, but sometimes I also use fsockopen and file_get_contents.
Here are 3 different functions that will allow you to download web content.
cURL:
    function getData($url) {
        // Refuse to fetch the local machine.
        if ($url != 'localhost' && $url != 'http://localhost') {
            $ch = curl_init();
            curl_setopt($ch, CURLOPT_URL, $url);
            // Return the response as a string instead of printing it.
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/6.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.3");
            // CURLOPT_FOLLOWLOCATION is an on/off switch; the limit of 3
            // redirects belongs in CURLOPT_MAXREDIRS.
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
            curl_setopt($ch, CURLOPT_MAXREDIRS, 3);
            $result = array();
            $result['data'] = curl_exec($ch);
            $result['error'] = curl_error($ch);
            curl_close($ch);
            return $result;
        }
        // Return the same array shape as the success path.
        return array('data' => false, 'error' => 'err');
    }
fsockopen:
    function getData($url) {
        $arr = parse_url($url);
        $fp = fsockopen($arr['host'], 80, $errno, $errstr, 30);
        if (!$fp) {
            return false;
        } else {
            // Request path: everything after the scheme and host, "/" if empty.
            $path = str_replace('http://'.$arr['host'], '', $url);
            if ($path == '') {
                $path = '/';
            }
            // send headers
            $out  = "GET ".$path." HTTP/1.1\r\n";
            $out .= "Host: ".$arr['host']."\r\n";
            $out .= "User-Agent: FSOCKOPEN\r\n";
            $out .= "Connection: Close\r\n\r\n";
            fwrite($fp, $out);
            // Read the raw response (status line, headers and body).
            $contents = '';
            while (!feof($fp)) {
                $contents .= fgets($fp, 4096);
            }
            fclose($fp);
            return $contents;
        }
    }
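The fsockopen version returns the raw response, so the status line and headers come back in the same string as the page itself. A minimal sketch of separating the two is shown here; the helper name splitResponse is an assumption and not part of the original post, and the sample response is hand-written so that no network access is needed.

```php
<?php
// Hypothetical helper (not from the original post): split a raw HTTP
// response into its header block and its body.
function splitResponse($raw) {
    // The headers end at the first blank line (CRLF CRLF).
    $pos = strpos($raw, "\r\n\r\n");
    if ($pos === false) {
        // No blank line found: treat everything as headers.
        return array('headers' => $raw, 'body' => '');
    }
    return array(
        'headers' => substr($raw, 0, $pos),
        // The cast keeps the body a string even when nothing follows the headers.
        'body'    => (string) substr($raw, $pos + 4),
    );
}

// Hand-written sample response, used only for illustration.
$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>Hello</html>";
$parts = splitResponse($raw);
echo $parts['body'];
```

Note that with HTTP/1.1 a server may answer with chunked transfer encoding, in which case the body still contains the chunk size markers and needs further decoding.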
file_get_contents:
    function getData($url) {
        return file_get_contents($url);
    }
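file_get_contents also accepts a stream context as its third argument, which gives a little more control without switching to cURL. The following is only a sketch: the function name getDataWithContext and its default timeout and user agent are assumptions, not something from the original post.

```php
<?php
// Hypothetical variant of getData(); the name and the defaults are assumptions.
function getDataWithContext($url, $timeout = 10, $userAgent = 'PHP') {
    // Options under the 'http' key apply to http:// URLs.
    $context = stream_context_create(array(
        'http' => array(
            'method'     => 'GET',
            'timeout'    => $timeout,   // read timeout in seconds
            'user_agent' => $userAgent,
        ),
    ));
    // @ hides the warning, so a failure is reported only by the
    // return value (false).
    return @file_get_contents($url, false, $context);
}
```

Calling getDataWithContext('http://example.com/') (a placeholder URL) should return the page as a string, or false on failure. Fetching URLs this way requires allow_url_fopen to be enabled in php.ini.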
As you can see, the easiest way of downloading web content is the file_get_contents function, but if you need more options, especially if you are working with headers, then cURL is the best way to go.
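As an illustration of working with headers in cURL, here is a rough sketch that sends custom request headers and reads back the status code and content type. The names buildHeaderLines and getDataWithHeaders, and the extra 'status' and 'type' keys, are assumptions made for this example.

```php
<?php
// Hypothetical helper: turn array('Accept' => 'text/html') into the
// "Name: value" lines that CURLOPT_HTTPHEADER expects.
function buildHeaderLines($headers) {
    $lines = array();
    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

// Hypothetical variant of the cURL getData(): sends extra request
// headers and reports the response status and content type.
function getDataWithHeaders($url, $headers = array()) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, buildHeaderLines($headers));
    $result = array();
    $result['data']   = curl_exec($ch);
    $result['error']  = curl_error($ch);
    // curl_getinfo() must be called after curl_exec() and before curl_close().
    $result['status'] = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $result['type']   = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    curl_close($ch);
    return $result;
}
```

For example, getDataWithHeaders('http://example.com/', array('Accept' => 'text/html')) (a placeholder URL) should return an array whose 'status' entry holds the HTTP status code of the response.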