Using hpple

#import "TFHpple.h"

NSData  *data      = [NSData dataWithContentsOfFile:@"index.html"];
TFHpple *doc       = [[TFHpple alloc] initWithHTMLData:data];
NSArray *elements  = [doc search:@"//a[@class='sponsor']"];

TFHppleElement *element = [elements objectAtIndex:0];
[element raw];                        // The raw, unprocessed HTML of the element
[element text];                       // The text inside the HTML element (the content of the first text node)
[element tagName];                    // "a"
[element attributes];                 // NSDictionary of href, class, id, etc.
[element objectForKey:@"href"];       // Easy access to single attribute
[element firstChildWithTagName:@"b"]; // The first "b" child node

Some problems encountered

Non-UTF-8 pages

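1. Convert the encoding: re-encode the page data from its original encoding (for example, gb2312) to UTF-8 before handing it to hpple.

2. In the converted page, replace the charset declaration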

<meta http-equiv="Content-Type" content="text/html; charset=gb2312">

with

<meta http-equiv="Content-Type" content="text/html; charset=utf8">
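
The two steps above can be sketched roughly as follows. This is a minimal sketch, not part of hpple: the helper name `UTF8HTMLDataFromGB2312Data` is hypothetical, and it assumes the page really is GB2312/GBK encoded and declares its charset with the literal text `charset=gb2312`. It decodes with GB 18030, which is a superset of GB2312.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of hpple): re-encode GB2312 HTML as UTF-8
// and patch the charset declaration to match.
static NSData *UTF8HTMLDataFromGB2312Data(NSData *gbData) {
    // Assumption: the input is GB2312/GBK. GB 18030 is a superset of both.
    NSStringEncoding gbEncoding =
        CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingGB_18030_2000);

    // Step 1: decode the bytes using the page's original encoding.
    NSString *html = [[NSString alloc] initWithData:gbData encoding:gbEncoding];
    if (html == nil) {
        return nil; // the bytes were not valid in the assumed encoding
    }

    // Step 2: replace the declared charset, matching case-insensitively.
    html = [html stringByReplacingOccurrencesOfString:@"charset=gb2312"
                                           withString:@"charset=utf8"
                                              options:NSCaseInsensitiveSearch
                                                range:NSMakeRange(0, html.length)];

    // Re-encode as UTF-8 for the parser.
    return [html dataUsingEncoding:NSUTF8StringEncoding];
}
```

The returned data can then be passed to `[[TFHpple alloc] initWithHTMLData:]` in place of the original page data. If the helper returns nil, the page was probably not in the assumed encoding and should be handled separately.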