  • Installing and Using the BeautifulSoup Library on macOS

    Introduction to BeautifulSoup


    BeautifulSoup is a powerful third-party Python library that parses HTML and extracts information from it.

    Installing BeautifulSoup


    • Open a terminal and run:
    pip3 install beautifulsoup4
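
To confirm that the install succeeded, you can check from Python whether the `bs4` module is importable. This is a minimal sketch; the helper name `is_installed` is made up for this example and is not part of any library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("bs4"):
    import bs4
    # bs4 exposes its version string as bs4.__version__
    print("beautifulsoup4 version:", bs4.__version__)
else:
    print("bs4 not found; run: pip3 install beautifulsoup4")
```

Note that the package is installed as `beautifulsoup4` but imported as `bs4`.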
    

    A Quick Test of BeautifulSoup


    • Fetch the demo page's source code with the requests library and store it in the variable demo:
    >>> import requests
    >>> r = requests.get("http://python123.io/ws/demo.html")
    >>> r.text
    '<html><head><title>This is a python demo page</title></head>
    <body>
    <p class="title"><b>The demo python introduces several python courses.</b></p>
    <p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:
    <a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and <a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>
    </body></html>'
    >>> demo = r.text
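
Outside the interactive prompt, the same fetch is usually wrapped with a timeout and a status check so that a failed request does not silently hand an error page to the parser. The following is a sketch under that assumption; the function name `fetch_html` and the 10-second default are illustrative choices, not something the session above uses.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the decoded body of url; raise requests.RequestException on failure."""
    response = requests.get(url, timeout=timeout)
    # Turn 4xx/5xx responses into an exception instead of returning their body.
    response.raise_for_status()
    # Use the encoding guessed from the content rather than only the headers.
    response.encoding = response.apparent_encoding
    return response.text


# Usage (requires network access):
# demo = fetch_html("http://python123.io/ws/demo.html")
```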
    
    • Import the BeautifulSoup library:
    >>> from bs4 import BeautifulSoup
    >>> 
    
    • Parse the HTML with BeautifulSoup:
    >>> demo = r.text
    >>> soup = BeautifulSoup(demo,'html.parser')
    >>> print(soup.prettify())  # prettify is a method, so it must be called with ()
    <html>
     <head>
      <title>
       This is a python demo page
      </title>
     </head>
    (remaining output omitted)
    >>> 
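
Once the page is parsed, information can be pulled out of the `soup` object. The sketch below embeds the demo page's markup, copied from the `r.text` output earlier, so that it runs without a network connection. The variable names `page_title`, `links`, and `first_paragraph_class` are chosen just for this example.

```python
from bs4 import BeautifulSoup

# The demo page's markup, embedded so the example works offline.
demo = """<html><head><title>This is a python demo page</title></head>
<body>
<p class="title"><b>The demo python introduces several python courses.</b></p>
<p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:
<a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and <a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>
</body></html>"""

soup = BeautifulSoup(demo, "html.parser")

# Text inside the <title> tag.
page_title = soup.title.string

# (id, href) pairs for every <a> tag in the document.
links = [(a.get("id"), a.get("href")) for a in soup.find_all("a")]

# soup.p is the first <p> tag; "class" is multi-valued, so this is a list.
first_paragraph_class = soup.p["class"]

print(page_title)
for link_id, href in links:
    print(link_id, href)
print(first_paragraph_class)
```

`soup.title` and `soup.p` return only the first matching tag, while `find_all` returns every match.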
    

    How to Use the BeautifulSoup Library

    • Code skeleton:
    from bs4 import BeautifulSoup
    soup = BeautifulSoup('<p>data</p>','html.parser')
    
    • BeautifulSoup takes two arguments here:
      • The first is the HTML markup to parse.
      • The second is the name of the parser to use.
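
The second argument can be illustrated with a short sketch. `html.parser` ships with Python, so it is always available. Other parsers such as `lxml` must be installed separately, and bs4 raises `FeatureNotFound` when the requested parser is missing. The helper `parse_with` is a hypothetical name used only for this example.

```python
from bs4 import BeautifulSoup, FeatureNotFound

markup = "<p>data</p>"


def parse_with(parser_name):
    """Return a soup built with the named parser, or None if it is unavailable."""
    try:
        return BeautifulSoup(markup, parser_name)
    except FeatureNotFound:
        # Raised when no installed parser matches the requested name.
        return None


soup = parse_with("html.parser")
print(soup.p.string)

# "lxml" works only if the lxml package is installed; otherwise this is None.
lxml_soup = parse_with("lxml")
print("lxml available:", lxml_soup is not None)
```

Naming the parser explicitly keeps results consistent across machines, since different parsers can build different trees from malformed HTML.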
  • Original article: https://www.cnblogs.com/031602523liu/p/9824907.html