PHP cURL Methods Time Out on Some URLs, Command Line Always Works

Oct 21, 2015, 10:59

Rossy Guide

PHP is a server-side scripting language designed for web development but also used as a general-purpose programming language. While PHP originally stood for Personal Home Page, it now stands for PHP: Hypertext Preprocessor, which is a recursive backronym. PHP code can be mixed directly with HTML, or it can be used in combination with various templating engines and web frameworks.

cURL (Client for URLs) is a software project that provides a library and a command-line tool for transferring data using various protocols. The project produces two products: libcurl, the library, and curl, the command-line tool. PHP's cURL extension is built on libcurl and lets you make HTTP requests from PHP.

When PHP's cURL functions are used to fetch some URLs, the request times out. When the command-line tool is used for the same URL, it works just fine. The setup is an AWS t2.medium instance running the php-55 Apache packages from yum.

The PHP code is:

function curl($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36');
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_VERBOSE, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 2);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array(
        'Accept-Language: en-us'
    ));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4);

    // Send cURL's verbose output to a log file instead of stderr.
    $fh = fopen('/home/ec2-user/curllog', 'w');
    curl_setopt($ch, CURLOPT_STDERR, $fh);

    $a = curl_exec($ch);
    curl_close($ch);
    fclose($fh);

    // Debug output: dump the response split into lines, then as one string.
    $headers = explode("\n", $a);
    var_dump($headers);
    var_dump($a);
    exit;

    // Not reached while the debug exit above is in place.
    return $a;
}
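Because the function above sets CURLOPT_HEADER, the string returned by curl_exec() contains the response headers followed by the body, and with CURLOPT_FOLLOWLOCATION it contains one header block per hop. The following is a minimal sketch, not part of the original code, of one way to separate the two. The helper name split_response is hypothetical, and it assumes the caller has read the header size with curl_getinfo($ch, CURLINFO_HEADER_SIZE) before calling curl_close().

```php
<?php
// Hypothetical helper: split a raw cURL response (headers + body) using the
// header size reported by curl_getinfo($ch, CURLINFO_HEADER_SIZE).
function split_response($raw, $headerSize)
{
    $headerBlock = substr($raw, 0, $headerSize);
    $body = (string) substr($raw, $headerSize);

    // Header lines normally end in CRLF; accept a bare LF as well.
    $lines = preg_split('/\r\n|\n/', trim($headerBlock));

    // Drop the empty lines that separate the header blocks of each hop.
    $lines = array_values(array_filter($lines, 'strlen'));

    return array('headers' => $lines, 'body' => $body);
}
```

Inside the original function this could be used as $parts = split_response($a, $size), where $size was read from curl_getinfo() before the handle was closed.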

Here is a call that works just fine:

curl('http://www.google.com');

This returns the data for the Google homepage.

However, try another URL:

curl('http://www.trulia.com/profile/agent-1391347/overview');

And you get this in the curllog:

[ec2-user@central Node]$ cat ../curllog
* Hostname was NOT found in DNS cache
*   Trying 23.0.160.99...
* Connected to www.trulia.com (23.0.160.99) port 80 (#0)
> GET /profile/agent-1391347/overview HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36
Host: www.trulia.com
Accept: */*
Accept-Language: en-us

* Operation timed out after 10002 milliseconds with 0 bytes received
* Closing connection 0

If you run this from the command line:

curl -s www.trulia.com/profile/agent-1391347/overview

It returns immediately (within 1 second) with no output. This is expected, presumably because without -L curl does not follow the redirect the server sends. However, when you run this:

curl -sL www.trulia.com/profile/agent-1391347/overview

It returns the page properly, just as you would want.
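The difference between the two command-line runs suggests the server answers the first request with a redirect. As a rough sketch, assuming that is the case, the following shows how a single hop could be inspected from PHP without following it, using the array returned by curl_getinfo(). The helper names probe_hop and describe_hop are hypothetical and are not part of the original code.

```php
<?php
// Hypothetical helper: request one URL without following redirects and
// return the information array from curl_getinfo().
function probe_hop($url, $timeout = 10)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    curl_exec($ch);
    $info = curl_getinfo($ch);
    curl_close($ch);
    return $info;
}

// Hypothetical helper: summarise a curl_getinfo() array in one line.
function describe_hop(array $info)
{
    $code = isset($info['http_code']) ? (int) $info['http_code'] : 0;
    $next = isset($info['redirect_url']) ? $info['redirect_url'] : '';

    if ($code >= 300 && $code < 400 && $next !== '') {
        return "redirect ($code) to $next";
    }
    if ($code === 0) {
        return 'no HTTP response received';
    }
    return "final response ($code)";
}
```

Calling describe_hop(probe_hop($url)) for the failing URL would show whether PHP receives the same redirect as the command line or no response at all.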

So, what is wrong with the PHP cURL call?

PHP 5.5.20

Here is the cURL bit from my phpinfo():

curl

cURL support => enabled
cURL Information => 7.38.0
Age => 3
Features
AsynchDNS => Yes
CharConv => No
Debug => No
GSS-Negotiate => No
IDN => Yes
IPv6 => Yes
krb4 => No
Largefile => Yes
libz => Yes
NTLM => Yes
NTLMWB => Yes
SPNEGO => Yes
SSL => Yes
SSPI => No
TLS-SRP => No
Protocols => dict, file, ftp, ftps, gopher, http, https, imap, imaps, ldap, ldaps, pop3, pop3s, rtsp, scp, sftp, smtp, smtps, telnet, tftp
Host => x86_64-redhat-linux-gnu
SSL Version => NSS/3.16.2 Basic ECC
ZLib Version => 1.2.7
libSSH Version => libssh2/1.4.2

Try increasing the timeout values in the following lines:

curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);

curl_setopt($ch, CURLOPT_TIMEOUT, 10);

Those are fairly short timeout values. CURLOPT_TIMEOUT in particular limits the entire transfer, including any redirects that are followed, so try larger values:

curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 15);

curl_setopt($ch, CURLOPT_TIMEOUT, 30);
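To check whether larger values actually help, it is useful to know whether a failed call was a timeout or some other error. The sketch below is one possible way to do that and is not part of the original code. The helper names fetch_with_timeouts and timed_out are hypothetical, and the defaults simply repeat the values suggested above.

```php
<?php
// Hypothetical helper: fetch a URL with caller-supplied timeouts and return
// the body together with cURL's error and timing information.
function fetch_with_timeouts($url, $connectTimeout = 15, $timeout = 30)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 2);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $connectTimeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);

    $body = curl_exec($ch);
    $result = array(
        'body'  => $body,               // false on failure
        'errno' => curl_errno($ch),     // 0 on success
        'error' => curl_error($ch),     // human-readable message
        'info'  => curl_getinfo($ch),   // includes total_time, connect_time
    );
    curl_close($ch);
    return $result;
}

// True when the result records a timeout.
// 28 is libcurl's error code CURLE_OPERATION_TIMEDOUT.
function timed_out(array $result)
{
    return isset($result['errno']) && $result['errno'] === 28;
}
```

If timed_out() still returns true with the larger values and the log again shows 0 bytes received, the server is probably not answering this particular request at all, and a longer timeout alone will not change the outcome.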