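Python Web Crawler Notes (1)

Posted on 2018-01-04

These notes mainly use the requests library and the BeautifulSoup class from the bs4 library.

    import requests
    from bs4 import BeautifulSoup

    # Fetch the demo page and parse the HTML text of the response
    demo = requests.get('http://python123.io/ws/demo.html')
    soup = BeautifulSoup(demo.text, 'html.parser')
    print(soup.a)  # print the first <a> tag

    # Downward traversal of the tag tree
    # soup.a.contents     returns the children as a list
    # soup.a.children     returns an iterator over the children
    # soup.a.descendants  returns an iterator over all descendants
    for child in soup.body.children:
        print(child)

    # Upward traversal of the tag tree
    soup.a.parent   # returns the parent tag
    soup.a.parents  # returns an iterator over the ancestors

    # Sibling traversal (siblings must share the same parent tag)
    soup.a.next_sibling
    soup.a.previous_sibling
    soup.a.next_siblings      # iterator
    soup.a.previous_siblings  # iterator

    # prettify() method
    print(soup.prettify())
    print(soup.a.prettify())

XML JSON YAML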
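The traversal attributes above can also be tried without a network connection. The following is a minimal sketch that parses a small inline HTML string instead of the demo page; the HTML snippet and all variable names are made up for illustration and are not part of the original notes. It shows that sibling traversal returns text nodes as well as tags, which is easy to overlook.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical inline HTML standing in for the downloaded demo page
html = (
    "<html><body><p>Intro "
    "<a href='/one' id='link1'>One</a> and "
    "<a href='/two' id='link2'>Two</a>."
    "</p></body></html>"
)

soup = BeautifulSoup(html, "html.parser")
first = soup.a  # the first <a> tag in the document

# Downward: children of <p> include text nodes as well as tags
child_count = len(list(soup.p.children))

# Upward: walk from the first <a> to the top of the tree
parent_name = first.parent.name
ancestor_names = [tag.name for tag in first.parents]

# Sideways: the immediate siblings here are plain text, not tags
next_text = str(first.next_sibling)
prev_text = str(first.previous_sibling)

# Keep only real tags among the following siblings
following_tags = [s["id"] for s in first.next_siblings if isinstance(s, Tag)]

print(child_count, parent_name, ancestor_names)
print(repr(prev_text), repr(next_text), following_tags)
```

Filtering with `isinstance(s, Tag)` is one way to skip the text nodes when only tags are of interest.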