This is a translated page. The original can be found here: http://iwebdevel.com/2009/10/03/php-how-to-download-a-webpage-aka-web-scrapping-with-php-fsockopen-file_get_contents-curl-function-download-web-page/

PHP: How to download a webpage (aka web scraping) with PHP

Posted on 03 Oct, 2009 by Dragos in Coding, PHP

There are many ways of downloading web pages or other web content. Personally, I like to use cURL for my web scraping needs, but sometimes I also use fsockopen and file_get_contents.

Here are three different functions that will allow you to download web content.

cURL:

 function getData($url) {
     $result = array('data' => false, 'error' => '');
     // Refuse to fetch the local machine.
     if ($url != 'localhost' && $url != 'http://localhost') {
         $ch = curl_init();
         curl_setopt($ch, CURLOPT_URL, $url);
         // Return the response as a string instead of printing it.
         curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
         curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/6.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.3");
         // Follow redirects, but no more than 3 of them.
         curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
         curl_setopt($ch, CURLOPT_MAXREDIRS, 3);
         $result['data'] = curl_exec($ch);
         $result['error'] = curl_error($ch);
         curl_close($ch);
     } else {
         $result['error'] = 'err';
     }
     // Always an array with 'data' and 'error' keys.
     return $result;
 }

fsockopen:

 function getData($url) {
     $arr = parse_url($url);
     $fp = fsockopen($arr['host'], 80, $errno, $errstr, 30);
     if (!$fp) {
         return false;
     } else {
         // request path: everything after the host, "/" if there is none
         $path = str_replace('http://'.$arr['host'], '', $url);
         if ($path == '') {
             $path = '/';
         }
         // send headers
         $out = "GET ".$path." HTTP/1.1\r\n";
         $out .= "Host: ".$arr['host']."\r\n";
         $out .= "User-Agent: FSOCKOPEN\r\n";
         $out .= "Connection: Close\r\n\r\n";
         fwrite($fp, $out);
         // read the raw response (status line and headers included)
         $contents = '';
         while (!feof($fp)) {
             $contents .= fgets($fp, 4096);
         }
         fclose($fp);
         return $contents;
     }
 }
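
The fsockopen version returns everything the server sends, so the status line and the response headers come back in front of the page itself. A minimal sketch of separating the two could look like the following. The splitHttpResponse name and the sample response are made up for illustration, and the sketch assumes the body is not sent with chunked transfer encoding, which an HTTP/1.1 server may choose to use.

```php
<?php
// Hypothetical helper (not from the original post): separates the header
// block from the body in a raw HTTP response string.
function splitHttpResponse($raw) {
    // The headers end at the first blank line (CRLF CRLF).
    $parts = explode("\r\n\r\n", $raw, 2);
    $headerLines = explode("\r\n", $parts[0]);
    $statusLine = array_shift($headerLines);   // first line, e.g. "HTTP/1.1 200 OK"

    $headers = array();
    foreach ($headerLines as $line) {
        $pair = explode(':', $line, 2);
        if (count($pair) == 2) {
            // Header names are case-insensitive, so normalise to lower case.
            $headers[strtolower(trim($pair[0]))] = trim($pair[1]);
        }
    }
    return array(
        'status'  => $statusLine,
        'headers' => $headers,
        'body'    => isset($parts[1]) ? $parts[1] : '',
    );
}

// Usage with a hand-written sample response, so no network is needed.
$sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>hi</html>";
$response = splitHttpResponse($sample);
echo $response['body'];
```

If the same header appears more than once, this sketch keeps only the last value, which is fine for a quick look at Content-Type but not for something like Set-Cookie.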

file_get_contents:

 function getData($url) {
     return file_get_contents($url);
 }
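
file_get_contents also accepts a stream context as its third argument, which gives some control over the request without switching to cURL. The sketch below sets a timeout and a user agent. The function names, the default values, and the example.com URL are placeholders chosen for illustration, and it assumes allow_url_fopen is enabled in php.ini.

```php
<?php
// Hypothetical helper: builds stream-context options for an HTTP GET.
function buildHttpOptions($timeout, $userAgent) {
    return array(
        'http' => array(
            'method'     => 'GET',
            'timeout'    => $timeout,     // seconds to wait while reading
            'user_agent' => $userAgent,
        ),
    );
}

// Hypothetical wrapper around file_get_contents(); returns the page body
// as a string, or false on failure.
function getDataWithContext($url, $timeout = 10, $userAgent = 'ExampleFetcher/1.0') {
    $context = stream_context_create(buildHttpOptions($timeout, $userAgent));
    // @ hides the warning file_get_contents() raises on failure;
    // the caller is expected to check for false instead.
    return @file_get_contents($url, false, $context);
}

// Usage (placeholder URL):
// $html = getDataWithContext('http://example.com/');
// if ($html === false) { echo "Download failed\n"; }
```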

As you can see, the easiest way of downloading web content is the file_get_contents function, but if you need more options, especially if you are working with headers, then cURL is the best way to go.
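
To illustrate the point about headers, here is a rough sketch of a cURL call that sends custom request headers and returns the status code, the response headers, and the body separately. The getDataWithHeaders name, the returned array layout, and the 30 second timeout are assumptions made for this example, not part of the functions above.

```php
<?php
// Hypothetical function: fetches $url with extra request headers and
// returns the status code, raw response headers, and body separately.
function getDataWithHeaders($url, array $requestHeaders = array()) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Include the response headers at the start of the returned string.
    curl_setopt($ch, CURLOPT_HEADER, true);
    // Each entry is one full header line, e.g. 'Accept-Language: en'.
    curl_setopt($ch, CURLOPT_HTTPHEADER, $requestHeaders);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);

    $raw = curl_exec($ch);
    if ($raw === false) {
        $error = curl_error($ch);
        curl_close($ch);
        return array('error' => $error, 'status' => 0, 'headers' => '', 'body' => '');
    }
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // Length in bytes of the header block, used to cut the string in two.
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
    curl_close($ch);

    return array(
        'error'   => '',
        'status'  => $status,
        'headers' => substr($raw, 0, $headerSize),
        'body'    => substr($raw, $headerSize),
    );
}

// Usage (placeholder URL):
// $page = getDataWithHeaders('http://example.com/', array('Accept-Language: en'));
// if ($page['error'] === '' && $page['status'] == 200) { echo $page['body']; }
```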
