Fetching Web Pages in .NET Core

admin, November 23, 2020

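When building a website, it is common to split some logic out into a separate Web API, and calling that API inevitably involves HttpClient. It is not hard to use, as the following example shows: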

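// Requires: using System; using System.Net.Http;
HttpClient client = new HttpClient();
client.BaseAddress = new Uri(@"https://example.com");
// The relative path is resolved against BaseAddress.
var response = await client.GetAsync(@"/");
// Read the whole response body as a string.
string result = await response.Content.ReadAsStringAsync();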
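The snippet above does not check whether the request succeeded. Below is a minimal sketch of the same call wrapped in a helper that does, assuming you want a non-2xx status to surface as an exception. `ApiCaller`, `GetTextAsync`, and `StubHandler` are hypothetical names introduced for illustration; the stub stands in for a real server so the example runs without network access.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a real server: answers every request with 200 OK
// and echoes the requested URI in the body.
public sealed class StubHandler : HttpMessageHandler
{
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent("hello from " + request.RequestUri)
        };
        return Task.FromResult(response);
    }
}

// Hypothetical helper, not part of the original post.
public static class ApiCaller
{
    public static async Task<string> GetTextAsync(HttpClient client, string path)
    {
        using (HttpResponseMessage response = await client.GetAsync(path))
        {
            // Throws HttpRequestException when the status code is not 2xx.
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

With `new HttpClient(new StubHandler()) { BaseAddress = new Uri("https://example.com") }`, calling `await ApiCaller.GetTextAsync(client, "/")` returns `hello from https://example.com/`. Against a real endpoint, drop the stub and construct the client as in the example above.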

Fetching a page's HTML source works the same way, but the content may be encoded, so it needs some extra decoding:

HttpClient client = new HttpClient();
client.BaseAddress = new Uri(@"https://www.google.com");

var response = await client.GetAsync(@"/");
string htmlStr = await response.Content.ReadAsStringAsync();
// Turn HTML entities such as &lt; and &amp; back into characters.
string htmlDecodeStr = System.Web.HttpUtility.HtmlDecode(htmlStr);
// Regex.Unescape resolves backslash escapes such as \u00e9; it is not URL decoding.
string unescapedStr = System.Text.RegularExpressions.Regex.Unescape(htmlDecodeStr);
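To see what the two decoding calls do without hitting the network, here is a small sketch that applies them to a hard-coded string. `DecodeDemo` and `Decode` are hypothetical names, and the sample input is made up for illustration. Note that `Regex.Unescape` can throw an `ArgumentException` when the text contains a backslash sequence it does not recognize, so on arbitrary pages you may want to guard that call.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Web;

// Hypothetical helper, not part of the original post.
public static class DecodeDemo
{
    public static string Decode(string raw)
    {
        // Step 1: HTML entities (&lt; &gt; &amp; ...) become literal characters.
        string htmlDecoded = HttpUtility.HtmlDecode(raw);
        // Step 2: backslash escapes (\uXXXX, \n, ...) become the characters they denote.
        return Regex.Unescape(htmlDecoded);
    }
}
```

For example, `DecodeDemo.Decode("&lt;p&gt;caf\\u00e9 &amp; tea&lt;/p&gt;")` returns `<p>café & tea</p>`: the entities are resolved first, then the literal `\u00e9` sequence becomes `é`.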

Once the string has been decoded, you can apply your own logic to it.