PHP: How to download a webpage (aka web scraping) with PHP
Posted on 03 Oct, 2009 by Dragos in Coding, PHP
There are many ways of downloading web pages, or web content. Personally I like to use cURL for my web scraping needs, but sometimes I also use fsockopen and file_get_contents.
Here are 3 different functions that will allow you to download web content.
cURL:
function getData($url) {
    $result = array('data' => false, 'error' => '');
    if ($url != 'localhost' && $url != 'http://localhost') {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        // return the page as a string instead of printing it
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
        curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/6.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.3");
        // follow redirects, but no more than 3 of them
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
        curl_setopt($ch, CURLOPT_MAXREDIRS, 3);
        $result['data'] = curl_exec($ch);
        $result['error'] = curl_error($ch);
        curl_close($ch);
    } else {
        // always return the same array shape, even on error
        $result['error'] = 'err';
    }
    return $result;
}
fsockopen:
function getData($url) {
    $arr = parse_url($url);
    $fp = fsockopen($arr['host'], 80, $errno, $errstr, 30);
    if (!$fp) {
        return false;
    } else {
        // request path: everything after the host, "/" if there is none
        $path = str_replace('http://'.$arr['host'], '', $url);
        if ($path == '') {
            $path = '/';
        }
        // send headers
        $out = "GET ".$path." HTTP/1.1\r\n";
        $out .= "Host: ".$arr['host']."\r\n";
        $out .= "User-Agent: FSOCKOPEN\r\n";
        $out .= "Connection: Close\r\n\r\n";
        fwrite($fp, $out);
        // read the raw response (status line and headers included)
        $contents = '';
        while (!feof($fp)) {
            $contents .= fgets($fp, 4096);
        }
        fclose($fp);
        return $contents;
    }
}
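The fsockopen version returns the raw HTTP response, so the status line and the headers come back in front of the page itself. Here is a minimal sketch of how that string could be split up. The splitResponse name and the shape of the returned array are my own assumptions, not part of the function above, and it does not decode chunked transfer encoding.

```php
<?php
// Hypothetical helper: splits a raw HTTP response into status, headers and body.
function splitResponse($raw) {
    // Headers and body are separated by the first empty line.
    $parts = explode("\r\n\r\n", $raw, 2);
    $headerLines = explode("\r\n", $parts[0]);
    $body = isset($parts[1]) ? $parts[1] : '';

    // The first line is the status line, e.g. "HTTP/1.1 200 OK".
    $statusLine = array_shift($headerLines);
    $statusParts = explode(' ', $statusLine, 3);
    $status = isset($statusParts[1]) ? (int) $statusParts[1] : 0;

    // Remaining lines look like "Name: value"; store names in lower case.
    $headers = array();
    foreach ($headerLines as $line) {
        $pair = explode(':', $line, 2);
        if (count($pair) == 2) {
            $headers[strtolower(trim($pair[0]))] = trim($pair[1]);
        }
    }
    return array('status' => $status, 'headers' => $headers, 'body' => $body);
}

// Made-up sample response, only for illustration.
$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>hi</html>";
$response = splitResponse($raw);
echo $response['status'], "\n";
echo $response['headers']['content-type'], "\n";
echo $response['body'], "\n";
```

If the server answers with "Transfer-Encoding: chunked", the body will still contain the chunk sizes, which is one more reason to prefer cURL for anything serious.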
file_get_contents:
function getData($url) {
    return file_get_contents($url);
}
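file_get_contents can take a few options too, through a stream context passed as its third argument. The sketch below sets a timeout and a user agent that way. The function names, the default values and the "ExampleBot/1.0" string are placeholders I made up, and fetching a URL like this only works when allow_url_fopen is enabled.

```php
<?php
// Hypothetical helper: builds a stream context with HTTP options.
function buildHttpContext($timeoutSeconds, $userAgent) {
    return stream_context_create(array(
        'http' => array(
            'method'     => 'GET',
            'timeout'    => $timeoutSeconds, // read timeout in seconds
            'user_agent' => $userAgent,      // sent as the User-Agent header
        ),
    ));
}

// Hypothetical variant of getData() that applies those options.
function getDataWithOptions($url, $timeoutSeconds = 10, $userAgent = 'ExampleBot/1.0') {
    $context = buildHttpContext($timeoutSeconds, $userAgent);
    // Second argument false: do not search the include_path.
    return file_get_contents($url, false, $context);
}
```

You would call it as getDataWithOptions('http://example.com/', 5) and get the page back as a string, or false on failure.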
As you can see, the easiest way of downloading web content is the file_get_contents function, but if you need more options, especially if you are working with headers, then cURL is the best way to go.
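To give an idea of what working with the headers looks like in cURL, here is a rough sketch that sends custom request headers and collects the response headers through a callback. The names getDataWithHeaders and storeHeaderLine and the keys of the result array are my own choices for illustration; the CURLOPT_HTTPHEADER and CURLOPT_HEADERFUNCTION options are the standard cURL ones. It uses an anonymous function, so it needs PHP 5.3 or later.

```php
<?php
// Hypothetical helper: stores one raw "Name: value" header line in $headers.
function storeHeaderLine(array &$headers, $line) {
    $parts = explode(':', $line, 2);
    if (count($parts) == 2) {
        $headers[strtolower(trim($parts[0]))] = trim($parts[1]);
    }
    // cURL expects the callback to return the number of bytes it handled.
    return strlen($line);
}

// Hypothetical variant of the cURL getData() that also deals with headers.
function getDataWithHeaders($url, array $requestHeaders = array()) {
    $responseHeaders = array();
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Extra request headers, e.g. array('Accept-Language: en').
    curl_setopt($ch, CURLOPT_HTTPHEADER, $requestHeaders);
    // Called once for every header line of the response.
    curl_setopt($ch, CURLOPT_HEADERFUNCTION,
        function ($ch, $line) use (&$responseHeaders) {
            return storeHeaderLine($responseHeaders, $line);
        });
    $result = array();
    $result['data'] = curl_exec($ch);
    $result['error'] = curl_error($ch);
    $result['status'] = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $result['headers'] = $responseHeaders;
    curl_close($ch);
    return $result;
}
```

After a call such as getDataWithHeaders('http://example.com/', array('Accept-Language: en')), the content type would be available as $result['headers']['content-type'], assuming the server sent one.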