Note: the libcurl tutorial is at http://curl.haxx.se/libcurl/c/libcurl-tutorial.html

0 Give the URL you pass to curl an explicit protocol prefix

From the tutorial:

If you specify URL without protocol:// prefix, curl will attempt to guess what protocol you might want. It will then default to HTTP but try other protocols based on often-used host name prefixes. For example, for host names starting with “ftp.” curl will assume you want to speak FTP.
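To avoid relying on that guess, a program can check a URL for a protocol prefix before handing it to CURLOPT_URL. The following is only a minimal sketch: has_explicit_scheme is a hypothetical helper, not a libcurl function, and it merely checks for a leading scheme name (a letter, then letters, digits, '+', '-' or '.') followed by "://".

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper (not part of libcurl): returns 1 if url starts
   with "<scheme>://", 0 otherwise. */
static int has_explicit_scheme(const char *url)
{
    const char *sep;
    const char *p;

    if (url == NULL)
        return 0;

    sep = strstr(url, "://");
    if (sep == NULL || sep == url)
        return 0;               /* no separator, or nothing before it */

    /* A scheme name must start with a letter. */
    if (!isalpha((unsigned char)url[0]))
        return 0;

    /* The rest may be letters, digits, '+', '-' or '.'. */
    for (p = url + 1; p < sep; ++p) {
        if (!isalnum((unsigned char)*p) && *p != '+' && *p != '-' && *p != '.')
            return 0;
    }
    return 1;
}
```

A caller could refuse a URL for which this returns 0, or prepend the protocol it actually wants, instead of letting curl guess from the host name.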

1 Write the data received by curl_easy_perform() straight to a file (FILE *)

From the tutorial:

libcurl offers its own default internal callback that will take care of the data if you don’t set the callback with CURLOPT_WRITEFUNCTION. It will then simply output the received data to stdout. You can have the default callback write the data to a different file handle by passing a ‘FILE *’ to a file opened for writing with the CURLOPT_WRITEDATA option.

Implementation in the source:

This way you can get away without writing a callback at all (hey, just how lazy are you?). Example:

FILE *fp;
fp = fopen("/root/test.bmp", "wb");
...
curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
...
fclose(fp);

2 Handling the return value of curl_easy_perform()

Use CURLOPT_ERRORBUFFER to store the error message; the buffer must be at least CURL_ERROR_SIZE bytes.

Or use curl_easy_strerror(res), which feels simpler.

Example:

/* Perform the request, res will get the return code */
res = curl_easy_perform(curl);
/* Check for errors */
if(res != CURLE_OK)
{
  /* __func__ is the standard C99 spelling of __FUNCTION__ */
  printf("%s curl_easy_perform() error!\n", __func__);
  printf("error msg = %s\n", curl_easy_strerror(res));
  curl_easy_cleanup(curl);
  return -1;
}

3 Set CURLOPT_NOSIGNAL in multi-threaded environments

From the tutorial:

When using multiple threads you should set the CURLOPT_NOSIGNAL option to 1 for all handles. Everything will or might work fine except that timeouts are not honored during the DNS lookup - which you can work around by building libcurl with c-ares support. c-ares is a library that provides asynchronous name resolves. On some platforms, libcurl simply will not function properly multi-threaded unless this option is set.

For the CURLOPT_TIMEOUT (default 0, meaning no timeout) and CURLOPT_CONNECTTIMEOUT (default 300 seconds) options:

In unix-like systems, this might cause signals to be used unless CURLOPT_NOSIGNAL is set.

4 Why set CURLOPT_VERBOSE and CURLOPT_HEADER

From the tutorial:

There’s one golden rule when these things occur: set the CURLOPT_VERBOSE option to 1. It’ll cause the library to spew out the entire protocol details it sends, some internal info and some received protocol data as well (especially when using FTP). If you’re using HTTP, adding the headers in the received output to study is also a clever way to get a better understanding why the server behaves the way it does. Include headers in the normal body output with CURLOPT_HEADER set 1.

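From experiment:

After calling curl_easy_setopt(curl, CURLOPT_HEADER, 1L), the write callback also receives the HTTP header data (which previously was just printed to stdout). Since that data would then have to be filtered out again, it is better not to set this option.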

5 Notes on POST with curl

From the tutorial:

Using POST with HTTP 1.1 implies the use of a “Expect: 100-continue” header. You can disable this header with CURLOPT_HTTPHEADER as usual.

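Explanation:

When POSTing with libcurl, if the POST data is larger than 1024 bytes, libcurl does not send the request in one go. Instead it proceeds in two steps:

<1> It sends a request whose headers include an Expect: 100-continue field, asking whether the server is willing to accept the data.

<2> Only after it receives the 100-continue response from the server does it actually carry out the POST and send the data to the server.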

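The RFC (http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.2.3) explains the "100-continue" field as follows:

It lets the client find out whether the server is willing to accept the request data before sending it; the client sends the data only if the server agrees.

The reason is that sending the data straight away, only to have the server reject the request, wastes a lot of resources.

To avoid that, libcurl takes this approach for POST requests larger than 1024 bytes. The trade-off is higher request latency.

Also, not every server handles and answers "100-continue" correctly. lighttpd, for example, returns 417 "Expectation Failed", which breaks the request logic.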

Workaround:

/* POST data larger than 1024 bytes */
struct curl_slist *headerlist = NULL;
static const char buf[] = "Expect:";
headerlist = curl_slist_append(headerlist, buf); /* initialize custom header list */
curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headerlist); /* set header */
...
/* libcurl does not copy the list, so free it only after curl_easy_perform() is done */
curl_slist_free_all(headerlist); /* free slist */

6 Returning the correct value from the callback

return (size * nmemb);

Reason:

Your callback function should return the number of bytes it “took care of”. If that is not the exact same amount of bytes that was passed to it, libcurl will abort the operation and return with an error code.

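If the data received in the callback is bad, my feeling is that you can return 0, or the number of bytes you have already handled, because the source deals with the return value like this: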

/* If the previous block of data ended with CR and this block of data is
just a NL, then the length might be zero */
/* len is the length of the data handed to the callback */
if(len) {
  wrote = data->set.fwrite_func(ptr, 1, len, data->set.out);
}
else {
  wrote = len;
}

if(wrote != len) {
  failf(data, "Failed writing body (%zu != %zu)", wrote, len);
  return CURLE_WRITE_ERROR;
}
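Given that check, a write callback that collects the body in memory might look like the sketch below. The struct mem_buf type and the name write_to_mem are hypothetical, and the code assumes the pointer passed to CURLOPT_WRITEDATA is a struct mem_buf * that starts out as { NULL, 0 }. It returns size * nmemb on success and 0 when it cannot store the data, which (for a non-empty chunk) differs from len and so triggers the CURLE_WRITE_ERROR path shown above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; not a libcurl type. */
struct mem_buf {
    char  *data;   /* NUL-terminated copy of everything received so far */
    size_t len;    /* number of bytes stored, excluding the terminator */
};

/* Has the CURLOPT_WRITEFUNCTION signature. userdata is assumed to be the
   struct mem_buf * that was given to CURLOPT_WRITEDATA. */
static size_t write_to_mem(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    struct mem_buf *buf = (struct mem_buf *)userdata;
    size_t chunk = size * nmemb;
    char *grown;

    if (chunk == 0)
        return 0;   /* nothing was passed in, so nothing was handled */

    grown = realloc(buf->data, buf->len + chunk + 1);
    if (grown == NULL)
        return 0;   /* 0 != chunk, so the transfer is aborted */

    memcpy(grown + buf->len, ptr, chunk);
    buf->data = grown;
    buf->len += chunk;
    buf->data[buf->len] = '\0';   /* keep it usable as a C string */

    return chunk;   /* report every byte as handled */
}
```

It would be registered with curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_to_mem) and curl_easy_setopt(curl, CURLOPT_WRITEDATA, &buf), and the caller frees buf.data once it is done with the result.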