PHP: How to download a webpage (aka web scraping) with PHP
Posted on 03. Oct, 2009 by Dragos in Coding, PHP
There are many ways to download web pages or other web content. Personally, I like to use cURL for my web scraping needs, but sometimes I also use fsockopen and file_get_contents.
Here are 3 different functions that will let you download web content.
cURL:
function getData($url) {
    // Refuse to fetch the local machine.
    if ($url != 'localhost' && $url != 'http://localhost') {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        // Return the page as a string instead of printing it.
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/6.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.3");
        // Follow redirects, but give up after 3 of them.
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_MAXREDIRS, 3);
        $result = array();
        $result['data'] = curl_exec($ch);
        $result['error'] = curl_error($ch);
        curl_close($ch);
        return $result;
    }
    // Same array shape as the success path, so callers can always check 'error'.
    return array('data' => false, 'error' => 'err');
}
fsockopen:
function getData($url) {
    $arr = parse_url($url);
    $fp = fsockopen($arr['host'], 80, $errno, $errstr, 30);
    if (!$fp) {
        return false;
    } else {
        // The request path is everything after the host in the URL.
        $path = str_replace('http://'.$arr['host'], '', $url);
        if ($path == '') {
            $path = '/';
        }
        // send headers
        $out = "GET ".$path." HTTP/1.1\r\n";
        $out .= "Host: ".$arr['host']."\r\n";
        $out .= "User-Agent: FSOCKOPEN\r\n";
        $out .= "Connection: Close\r\n\r\n";
        fwrite($fp, $out);
        // The response is read raw, so $contents also holds the response headers.
        $contents = '';
        while (!feof($fp)) {
            $contents .= fgets($fp, 4096);
        }
        fclose($fp);
        return $contents;
    }
}
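The fsockopen version returns the raw HTTP response, so the status line and the headers come back in the same string as the page. Here is a minimal sketch of how that string could be split apart. splitResponse is a hypothetical helper, not part of the original functions, and it assumes the body is not chunk-encoded (an HTTP/1.1 server may send chunked data, which this sketch does not decode).

```php
<?php
// Hypothetical helper: split a raw HTTP response into status line, headers and body.
function splitResponse($raw) {
    // The headers end at the first blank line (CRLF CRLF).
    $parts = explode("\r\n\r\n", $raw, 2);
    $headerLines = explode("\r\n", $parts[0]);
    // The first line is the status line, e.g. "HTTP/1.1 200 OK".
    $statusLine = array_shift($headerLines);
    $headers = array();
    foreach ($headerLines as $line) {
        $pair = explode(':', $line, 2);
        if (count($pair) == 2) {
            // Header names are case-insensitive, so store them in lower case.
            $headers[strtolower(trim($pair[0]))] = trim($pair[1]);
        }
    }
    return array(
        'status'  => $statusLine,
        'headers' => $headers,
        'body'    => isset($parts[1]) ? $parts[1] : '',
    );
}

// Example with a made-up response string, so no network is needed.
$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>";
$response = splitResponse($raw);
echo $response['status'], "\n";
```

In real use you would pass the string returned by the fsockopen getData() to splitResponse() and work with the 'body' entry.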
file_get_contents:
function getData($url) {
    return file_get_contents($url);
}
As you can see, the easiest way to download web content is the file_get_contents function, but if you need more options, especially if you are working with headers, then cURL is the best way to go.
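To give an idea of what working with headers in cURL can look like, here is a rough sketch that sends an extra request header and collects the response headers and status code. The names getDataWithHeaders and collectHeader are made up for this example, and the Accept header is only a sample value. Treat it as a starting point, not a finished implementation.

```php
<?php
// Hypothetical helper: add one "Name: value" header line to the header array.
// Lines without a colon (the status line, the final blank line) are skipped.
function collectHeader($headers, $line) {
    $parts = explode(':', $line, 2);
    if (count($parts) == 2) {
        $headers[strtolower(trim($parts[0]))] = trim($parts[1]);
    }
    return $headers;
}

// Hypothetical helper: fetch a URL and return status, response headers and body.
function getDataWithHeaders($url) {
    $headers = array();
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // Extra request headers; this value is only an example.
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Accept: text/html'));
    // cURL calls this function once for every response header line.
    curl_setopt($ch, CURLOPT_HEADERFUNCTION, function ($ch, $line) use (&$headers) {
        $headers = collectHeader($headers, $line);
        return strlen($line); // cURL expects the number of bytes handled
    });
    $body = curl_exec($ch);
    $result = array(
        'status'  => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'headers' => $headers,
        'data'    => $body,
        'error'   => curl_error($ch),
    );
    return $result;
}
```

A call such as getDataWithHeaders('http://example.com/') would then give you $result['headers']['content-type'] alongside the page itself. Note that if redirects are followed, headers from every response in the chain pass through the callback, so later values overwrite earlier ones.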