Using the Go team's HTML library (golang.org/x/net/html), parse the <img> tag that sits under the path /html/body/div[1]/div/div/div[1]/main/div[1]/div[1]/article/div[3]/div/div/table[1]/tbody/tr[1]/td/a and print the link held in its src attribute.

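The following code parses the HTML and prints the src link of the img tag. It walks the whole tree and prints the src of every <img> it finds, so it does not check the path itself: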

package main

import (
    "fmt"
    "strings"

    "golang.org/x/net/html"
)

func main() {
    htmlStr := `
        <html>
            <body>
                <div>
                    <div>
                        <div>
                            <main>
                                <div>
                                    <div>
                                        <article>
                                            <div>
                                                <div>
                                                    <table>
                                                        <tbody>
                                                            <tr>
                                                                <td>
                                                                    <a>
                                                                        <img alt='雪狐桑看板.png' class='infobox-image lazyload' data-lazy-src='https://img.moegirl.org.cn/common/thumb/5/5d/%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png/280px-%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png' data-lazy-srcset='https://img.moegirl.org.cn/common/thumb/5/5d/%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png/420px-%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png 1.5x, https://img.moegirl.org.cn/common/thumb/5/5d/%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png/560px-%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png 2x' data-lazy-state='done' style='' src='https://img.moegirl.org.cn/common/thumb/5/5d/%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png/280px-%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png' srcset='https://img.moegirl.org.cn/common/thumb/5/5d/%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png/420px-%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png 1.5x, https://img.moegirl.org.cn/common/thumb/5/5d/%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png/560px-%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png 2x' width='280' height='389'>
                                                                    </a>
                                                                </td>
                                                            </tr>
                                                        </tbody>
                                                    </table>
                                                </div>
                                            </div>
                                        </article>
                                    </div>
                                </div>
                            </main>
                        </div>
                    </div>
                </div>
            </body>
        </html>
    `
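    // Parse the markup into a node tree and check the error instead of discarding it.
    doc, err := html.Parse(strings.NewReader(htmlStr))
    if err != nil {
        fmt.Println("parse error:", err)
        return
    }

    // f visits every node depth-first and prints the src of each <img> element.
    var f func(*html.Node)
    f = func(n *html.Node) {
        if n.Type == html.ElementNode && n.Data == "img" {
            for _, attr := range n.Attr {
                if attr.Key == "src" {
                    fmt.Println(attr.Val)
                }
            }
        }
        // Recurse into the children, left to right.
        for c := n.FirstChild; c != nil; c = c.NextSibling {
            f(c)
        }
    }
    f(doc)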
}

The output is:

https://img.moegirl.org.cn/common/thumb/5/5d/%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png/280px-%E9%9B%AA%E7%8B%90%E6%A1%91%E7%9C%8B%E6%9D%BF.png

Original source: https://www.cveoy.top/t/topic/jx0K. Copyright belongs to the author. Do not repost or scrape.
