January 29th, 2007
How to set the cURL user agent string with PHP
I just found out that my free link checking tool is being blocked by some websites. My guess is that it's because it's sending a blank User-Agent string. I'm going to have to spoof it and say it's Firefox or something. Here's how to do that with cURL and PHP:
// spoofing Firefox 2.0
$useragent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";
$ch = curl_init();
// set user agent
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
// set the rest of your cURL options here
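
For anyone who wants the whole request in one place, here is a minimal sketch of how the snippet above might be wrapped up. The function names build_curl_options() and fetch_with_user_agent() are hypothetical helpers of my own, and the extra options (returning the body as a string, following redirects, a 10 second timeout) are assumptions about what a link checker would want, not something the snippet above requires.

```php
<?php
// Hypothetical helper: collect the cURL options for one request.
// Only CURLOPT_USERAGENT comes from the snippet above; the other
// options are assumed defaults for a simple link checker.
function build_curl_options(string $url, string $userAgent): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => $userAgent, // sent as the User-Agent header
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow HTTP redirects
        CURLOPT_TIMEOUT        => 10,         // give up after 10 seconds
    ];
}

// Hypothetical helper: fetch a URL with the given User-Agent string.
// Returns the response body as a string, or false on failure.
function fetch_with_user_agent(string $url, string $userAgent)
{
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url, $userAgent));

    $body = curl_exec($ch);
    if ($body === false) {
        // Log the reason cURL gave for the failure.
        error_log('cURL error: ' . curl_error($ch));
    }

    return $body;
}
```

Calling fetch_with_user_agent('https://example.com/', $useragent) with the $useragent string from above should then send the request with the spoofed header. Keeping the options in an array makes it easy to check or change them without touching the request code.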

on November 1st, 2007 at 5:02 pm
I can confirm this is the case when trying to scrape Google. With a user agent, the organic results are between and . Without one, they are not included.
on November 1st, 2007 at 5:04 pm
Between HTML comments!