This is a translated page. The original can be found here: http://iwebdevel.com/2009/10/03/php-how-to-download-a-webpage-aka-web-scrapping-with-php-fsockopen-file_get_contents-curl-function-download-web-page/

PHP: How to download a webpage (aka web scraping) with PHP

Posted on 03 Oct, 2009 by Dragos in Coding, PHP

There are many ways of downloading web pages, or web content. Personally I like to use cURL for my web scraping needs, but sometimes I also use fsockopen and file_get_contents.

Here are 3 different functions that will allow you to download web content.

cURL:

 function getData($url) {
     if ($url != 'localhost' && $url != 'http://localhost') {
         $ch = curl_init();
         curl_setopt($ch, CURLOPT_URL, $url);
         // return the page as a string instead of printing it
         curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
         curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/6.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.3");
         // follow redirects, but no more than 3 of them
         curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
         curl_setopt($ch, CURLOPT_MAXREDIRS, 3);
         $result = array();
         $result['data'] = curl_exec($ch);   // false on failure
         $result['error'] = curl_error($ch); // empty string on success
         curl_close($ch);
         return $result;
     }
     // same array shape as above, so callers can always check $result['error']
     else return array('data' => false, 'error' => 'err');
 }

fsockopen:

 function getData($url) {
     $arr = parse_url($url);
     $fp = fsockopen($arr['host'], 80, $errno, $errstr, 30);
     if (!$fp) {
         return false;
     } else {
         // request target: path plus query string, "/" when the URL has no path
         $path = isset($arr['path']) ? $arr['path'] : '/';
         if (isset($arr['query'])) $path .= '?'.$arr['query'];
         // send headers
         $out = "GET ".$path." HTTP/1.1\r\n";
         $out .= "Host: ".$arr['host']."\r\n";
         $out .= "User-Agent: FSOCKOPEN\r\n";
         $out .= "Connection: Close\r\n\r\n";
         fwrite($fp, $out);
         // read the raw response (status line and headers included)
         $contents = '';
         while (!feof($fp)) {
             $contents .= fgets($fp, 4096);
         }
         fclose($fp);
         return $contents;
     }
 }
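
Because the fsockopen version reads straight from the socket, what it returns is the raw HTTP response, with the status line and headers in front of the page itself. Below is a minimal sketch of how that string could be split apart. The helper name splitResponse and the sample response are made up for illustration, and the sketch assumes the body is not sent with chunked transfer encoding, which an HTTP/1.1 server is allowed to use.

```php
<?php
// Hypothetical helper (not part of the original post): split a raw HTTP
// response into its status line, headers and body.
function splitResponse($raw) {
    // The headers end at the first blank line (CRLF CRLF).
    $parts = explode("\r\n\r\n", $raw, 2);
    $headerLines = explode("\r\n", $parts[0]);
    $statusLine = array_shift($headerLines);

    $headers = array();
    foreach ($headerLines as $line) {
        $pair = explode(':', $line, 2);
        if (count($pair) == 2) {
            // Header names are case-insensitive, so store them lowercased.
            $headers[strtolower(trim($pair[0]))] = trim($pair[1]);
        }
    }

    return array(
        'status'  => $statusLine,
        'headers' => $headers,
        'body'    => isset($parts[1]) ? $parts[1] : '',
    );
}

// Usage with a hard-coded sample response (invented for this example).
$sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>Hello</html>";
$response = splitResponse($sample);
echo $response['status'], "\n";                  // HTTP/1.1 200 OK
echo $response['headers']['content-type'], "\n"; // text/html
echo $response['body'], "\n";                    // <html>Hello</html>
```

In a real script you would pass the return value of the fsockopen-based getData() to this helper instead of the sample string.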

file_get_contents:

 function getData($url) {
     return file_get_contents($url);
 }
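
file_get_contents can also take a stream context, which is one way to set a user agent or a timeout without switching to cURL. The following is only a sketch: the helper names buildHttpContext and getDataWithContext, the user agent string and the example URL are assumptions made for illustration, and fetching URLs this way requires allow_url_fopen to be enabled in your PHP configuration.

```php
<?php
// Hypothetical helper: build a stream context carrying HTTP options.
function buildHttpContext($userAgent, $timeoutSeconds) {
    return stream_context_create(array(
        'http' => array(
            'method'     => 'GET',
            'user_agent' => $userAgent,
            'timeout'    => $timeoutSeconds, // read timeout in seconds
        ),
    ));
}

// Hypothetical variant of getData() that passes the context along.
function getDataWithContext($url, $context) {
    // file_get_contents() returns false on failure.
    return file_get_contents($url, false, $context);
}

$context = buildHttpContext('MyScraper/1.0', 10);
// Uncomment to perform a real request (example URL, replace with your own):
// $html = getDataWithContext('http://example.com/', $context);
```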

As you can see, the easiest way of downloading web content is the file_get_contents function, but if you need more options, especially if you are working with headers, then cURL is the best way to go.
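
To give an idea of what working with headers in cURL can look like, here is a rough sketch that sends custom request headers and collects the response headers. The function names getDataWithHeaders and parseHeaderLine are hypothetical, as are the example URL and header; the cURL options used (CURLOPT_HTTPHEADER, CURLOPT_HEADERFUNCTION, CURLINFO_HTTP_CODE) are standard ones. It needs PHP 5.3 or later because of the closure.

```php
<?php
// Hypothetical helper: turn one raw header line into array(name, value).
// Returns null for lines without a colon (status line, blank terminator).
function parseHeaderLine($line) {
    $pair = explode(':', $line, 2);
    if (count($pair) < 2) {
        return null;
    }
    return array(strtolower(trim($pair[0])), trim($pair[1]));
}

// Sketch: send custom request headers and collect the response headers.
function getDataWithHeaders($url, $requestHeaders) {
    $responseHeaders = array();
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Request headers are given as "Name: value" strings.
    curl_setopt($ch, CURLOPT_HTTPHEADER, $requestHeaders);
    // cURL calls this once per response header line; the callback must
    // return the number of bytes it handled.
    curl_setopt($ch, CURLOPT_HEADERFUNCTION,
        function ($ch, $line) use (&$responseHeaders) {
            $parsed = parseHeaderLine($line);
            if ($parsed !== null) {
                $responseHeaders[$parsed[0]] = $parsed[1];
            }
            return strlen($line);
        });
    $body = curl_exec($ch);
    return array(
        'status'  => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'headers' => $responseHeaders,
        'data'    => $body,
        'error'   => curl_error($ch),
    );
}

// Uncomment to perform a real request (example URL and header):
// $result = getDataWithHeaders('http://example.com/', array('Accept-Language: en'));
// echo $result['status'], ' ', $result['headers']['content-type'], "\n";
```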


  • Yesterday I was also working on a site that grabs data from another website...

    Too bad I didn't get to read this article in time, so I ended up using file_get_contents() and cutting the result up with preg_replace() to pull out the data I needed...

    By the way, nice to meet you...