This is a translated page. The original can be found here: http://iwebdevel.com/2009/10/03/php-how-to-download-a-webpage-aka-web-scrapping-with-php-fsockopen-file_get_contents-curl-function-download-web-page/

PHP: How to download a webpage (aka web scraping) with PHP

Posted on 03 Oct, 2009 by Dragos in Coding, PHP

There are many ways to download web pages or other web content. Personally, I like to use cURL for my web scraping needs, but sometimes I also use fsockopen and file_get_contents.

Here are 3 different functions that will allow you to download web content.

cURL:

 function getData($url) {
     // Refuse to fetch the local machine
     if ($url != 'localhost' && $url != 'http://localhost') {
         $ch = curl_init();
         curl_setopt($ch, CURLOPT_URL, $url);
         // Return the page as a string instead of printing it
         curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
         curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/6.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.3");
         // Follow redirects, but no more than 3 of them
         curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
         curl_setopt($ch, CURLOPT_MAXREDIRS, 3);
         $result = array();
         $result['data'] = curl_exec($ch);    // false on failure
         $result['error'] = curl_error($ch);  // empty string on success
         curl_close($ch);
         return $result;
     }
     // Keep the same array shape as the success case
     return array('data' => false, 'error' => 'err');
 }
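
The cURL version returns an array rather than a plain string, so the caller has to look at both keys. A minimal sketch of one way to do that follows; the helper name `pageOrNull` is hypothetical and not part of the original post.

```php
<?php
// Hypothetical helper: reduce the value returned by the cURL getData()
// to either the page body (string) or null when the download failed.
function pageOrNull($result) {
    // Anything that is not the expected array counts as a failure
    if (!is_array($result) || !isset($result['data'])) {
        return null;
    }
    // curl_exec() gives false on failure
    if ($result['data'] === false) {
        return null;
    }
    // curl_error() gives an empty string when nothing went wrong
    if (!empty($result['error'])) {
        return null;
    }
    return $result['data'];
}

// Usage, assuming the cURL getData() above is defined and the URL is reachable:
// $html = pageOrNull(getData('http://example.com/'));
// if ($html === null) { echo "download failed\n"; }
```

Returning null on any failure keeps the calling code to a single check, at the cost of hiding the error message; read `$result['error']` directly if you need it.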

fsockopen:

 function getData($url) {
     $arr = parse_url($url);
     $fp = fsockopen($arr['host'], 80, $errno, $errstr, 30);
     if (!$fp) {
         return false;
     } else {
         // Request path: whatever follows the host in the URL, "/" if nothing does
         $path = str_replace('http://'.$arr['host'], '', $url);
         if ($path == '') {
             $path = '/';
         }
         // send headers
         $out = "GET ".$path." HTTP/1.1\r\n";
         $out .= "Host: ".$arr['host']."\r\n";
         $out .= "User-Agent: FSOCKOPEN\r\n";
         $out .= "Connection: Close\r\n\r\n";
         fwrite($fp, $out);
         // Read the raw response: status line, headers, and body
         $contents = '';
         while (!feof($fp)) {
             $contents .= fgets($fp, 4096);
         }
         fclose($fp);
         return $contents;
     }
 }
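
Because the fsockopen version reads straight from the socket, the string it returns contains the HTTP status line and headers in front of the page itself. The sketch below shows one way to separate them; `splitHttpResponse` is a hypothetical helper name, and it assumes a well-formed response whose body is not sent with chunked transfer encoding.

```php
<?php
// Hypothetical helper: split a raw HTTP response into status line, headers, and body.
function splitHttpResponse($raw) {
    // The header section ends at the first blank line (CRLF CRLF)
    $parts = explode("\r\n\r\n", $raw, 2);
    $head = $parts[0];
    $body = isset($parts[1]) ? $parts[1] : '';

    $lines = explode("\r\n", $head);
    $status = array_shift($lines); // first line, e.g. "HTTP/1.1 200 OK"

    $headers = array();
    foreach ($lines as $line) {
        $pos = strpos($line, ':');
        if ($pos === false) {
            continue; // skip lines that are not "Name: value"
        }
        // Header names are case-insensitive, so store them lowercased.
        // A repeated header overwrites the earlier value in this sketch.
        $name = strtolower(trim(substr($line, 0, $pos)));
        $headers[$name] = trim(substr($line, $pos + 1));
    }
    return array('status' => $status, 'headers' => $headers, 'body' => $body);
}

// Usage, assuming the fsockopen getData() above is defined:
// $response = splitHttpResponse(getData('http://example.com/'));
// echo $response['body'];
```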

file_get_contents:

 function getData($url) {
     return file_get_contents($url);
 }
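
file_get_contents can also take a few options through a stream context, which covers simple needs such as a timeout or a custom user agent. The following is a sketch only: the function name `getDataWithContext` and the user agent string are placeholders, not something from the original post. Fetching http:// URLs this way requires `allow_url_fopen` to be enabled in php.ini.

```php
<?php
// Hypothetical variant of getData(): file_get_contents() with an HTTP stream context.
function getDataWithContext($url, $timeout = 10) {
    $context = stream_context_create(array(
        'http' => array(
            'method'     => 'GET',
            'user_agent' => 'FILE_GET_CONTENTS', // placeholder value
            'timeout'    => $timeout,            // read timeout in seconds
        ),
    ));
    // Returns the content as a string, or false on failure
    return file_get_contents($url, false, $context);
}

// Usage, assuming the URL is reachable:
// $html = getDataWithContext('http://example.com/', 5);
// if ($html === false) { echo "download failed\n"; }
```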

As you can see, the easiest way to download web content is the file_get_contents function, but if you need more options, especially if you are working with headers, then cURL is the best way to go.
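
To illustrate what working with headers looks like in cURL, here is a rough sketch that sends custom request headers and collects the response headers. The function names `getDataWithHeaders` and `collectHeader` are hypothetical; the cURL options used (`CURLOPT_HTTPHEADER`, `CURLOPT_HEADERFUNCTION`) are standard ones. It needs PHP 5.3 or later because of the closure.

```php
<?php
// Hypothetical helper: store one "Name: value" header line in $headers.
// Lines without a colon (such as the status line) are ignored.
function collectHeader($line, array &$headers) {
    $pos = strpos($line, ':');
    if ($pos !== false) {
        $name = strtolower(trim(substr($line, 0, $pos)));
        $headers[$name] = trim(substr($line, $pos + 1));
    }
}

// Hypothetical helper: download $url, sending $requestHeaders
// (an array of "Name: value" strings) and gathering the response headers.
function getDataWithHeaders($url, array $requestHeaders = array()) {
    $responseHeaders = array();
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $requestHeaders);
    // cURL calls this once per response header line;
    // the callback must return the number of bytes it handled
    curl_setopt($ch, CURLOPT_HEADERFUNCTION, function ($ch, $line) use (&$responseHeaders) {
        collectHeader($line, $responseHeaders);
        return strlen($line);
    });
    $body = curl_exec($ch);
    $error = curl_error($ch);
    curl_close($ch);
    return array('data' => $body, 'headers' => $responseHeaders, 'error' => $error);
}

// Usage, assuming the URL is reachable:
// $r = getDataWithHeaders('http://example.com/', array('Accept-Language: en'));
// echo isset($r['headers']['content-type']) ? $r['headers']['content-type'] : 'unknown';
```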


  • Yesterday I was also working on a site that grabs data from another website...

    Too bad I didn't get to read this article first, so I ended up using file_get_contents(), then cutting the result up with preg_replace() to pull out the data I needed...

    By the way, nice to meet you...