Parsing HTML in Go: extracting the src attribute of an img tag
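Using the Go project's HTML parsing package (golang.org/x/net/html, maintained by the Go team but not part of the standard library), parse the <img> tag located under the path /html/body/div[1]/div/div/div[1]/main/div[1]/div[1]/article/div[3]/div/div/table[1]/tbody/tr[1]/td/a and print the link stored in its src attribute.
The following code parses the document and prints the src of every <img> element it finds (the sample document contains only the one under that path):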
package main

import (
	"fmt"
	"strings"

	"golang.org/x/net/html"
)

func main() {
	htmlStr := `
<html>
<body>
<div>
<div>
<div>
<main>
<div>
<div>
<article>
<div>
<div>
<table>
<tbody>
<tr>
<td>
<a>
<img alt='雪狐桑看板.png' class='infobox-image lazyload' data-lazy-src='https://img.moegirl.org.cn/common/thumb/5/5d/%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png/280px-%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png' data-lazy-srcset='https://img.moegirl.org.cn/common/thumb/5/5d/%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png/420px-%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png 1.5x, https://img.moegirl.org.cn/common/thumb/5/5d/%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png/560px-%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png 2x' data-lazy-state='done' style='' src='https://img.moegirl.org.cn/common/thumb/5/5d/%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png/280px-%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png' srcset='https://img.moegirl.org.cn/common/thumb/5/5d/%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png/420px-%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png 1.5x, https://img.moegirl.org.cn/common/thumb/5/5d/%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png/560px-%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png 2x' width='280' height='389'>
</a>
</td>
</tr>
</tbody>
</table>
</div>
</div>
</article>
</div>
</div>
</main>
</div>
</div>
</div>
</body>
</html>
`
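	// Parse the sample document into a node tree.
	reader := strings.NewReader(htmlStr)
	doc, err := html.Parse(reader)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}

	// Walk the tree depth-first and print the src of every <img> element.
	var f func(*html.Node)
	f = func(n *html.Node) {
		if n.Type == html.ElementNode && n.Data == "img" {
			for _, attr := range n.Attr {
				if attr.Key == "src" {
					fmt.Println(attr.Val)
				}
			}
		}
		for c := n.FirstChild; c != nil; c = c.NextSibling {
			f(c)
		}
	}
	f(doc)
}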
Output:
https://img.moegirl.org.cn/common/thumb/5/5d/%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png/280px-%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png
Original source: https://www.cveoy.top/t/topic/jx0K. Copyright belongs to the author. Do not repost or scrape.