How to get the full source of a web page with Python

Date: 2022-07-15 13:20:58

1. Fetching the source of an entire page in Python:

    import requests

    # Download the page
    res = requests.get('https://blog.csdn.net/yirexiao/article/details/79092355')
    # Decode the response body as UTF-8
    res.encoding = 'utf-8'
    # Print the full HTML source
    print(res.text)

2. Result: the script prints the page's HTML source to the console.
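The snippet in step 1 hard-codes utf-8 and sets no timeout, which may not suit every site. Below is a minimal sketch of a slightly more defensive fetch. It uses the standard library's urllib.request rather than requests, and the names pick_charset and fetch_html, the 10-second timeout, and the User-Agent value are all assumptions made for illustration, not part of the original article.

```python
import urllib.request


def pick_charset(content_type, default="utf-8"):
    """Return the charset named in a Content-Type header value, or default."""
    if content_type:
        # Parameters follow the media type, separated by semicolons.
        for part in content_type.split(";")[1:]:
            name, _, value = part.strip().partition("=")
            if name.lower() == "charset" and value:
                return value.strip("\"'")
    return default


def fetch_html(url, timeout=10):
    """Download url and return its body decoded to text (hypothetical helper)."""
    # Assumption: a browser-like User-Agent; some sites reject the default one.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
        charset = pick_charset(response.headers.get("Content-Type"))
    # errors="replace" keeps decoding from failing on a few bad bytes.
    return raw.decode(charset, errors="replace")
```

Calling fetch_html('https://blog.csdn.net/yirexiao/article/details/79092355') should then return the page text, decoded with whatever charset the server declares and falling back to utf-8 when none is given.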

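Extended example: the following script fetches a page, collects the links on it whose address contains the page's URL, then requests each one and prints its HTTP status code and response time.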

    from bs4 import BeautifulSoup
    import time
    import urllib.request  # Python 3 replacement for the original's urllib2


    def scanpage(url):
        """Fetch url, then check every link on the page that contains url."""
        start = time.time()
        html = urllib.request.urlopen(url).read()
        soup = BeautifulSoup(html, 'html.parser')

        # Collect unique links whose href contains the start URL.
        page_urls = {}
        for link in soup.find_all('a', href=True):
            href = link.get('href')
            if url in href and href not in page_urls:
                page_urls[href] = 0

        n = 0
        for href in page_urls:
            t0 = time.time()
            try:
                # Open each link once and record its HTTP status code.
                page_urls[href] = urllib.request.urlopen(href).getcode()
            except Exception:
                print('connect failed')
            else:
                # Index, link, status code, then seconds taken by the request.
                print(n, href, page_urls[href])
                print(time.time() - t0)
            n += 1
        print('total is ' + repr(n) + ' links')
        print(time.time() - start)


    scanpage('http://news.163.com/')
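The extended example keeps a link only when its href contains the start URL, so relative links such as /sports/ are skipped. One possible way to handle them is sketched below. It swaps BeautifulSoup for the standard library's html.parser so that it runs without extra packages, and the names LinkCollector and same_site_links, as well as the sample HTML, are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(html, base_url):
    """Return unique absolute links in html that share base_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(base_url).netloc
    links = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)  # resolves relative paths
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links


# Hypothetical sample markup: one relative link, one absolute same-site link,
# one external link, and one duplicate.
sample = (
    '<a href="/sports/">Sports</a> '
    '<a href="http://news.163.com/tech/">Tech</a> '
    '<a href="http://example.com/">Other</a> '
    '<a href="/sports/">Again</a>'
)
print(same_site_links(sample, "http://news.163.com/"))
# prints ['http://news.163.com/sports/', 'http://news.163.com/tech/']
```

Comparing hosts instead of testing for a substring also avoids matching an external URL that merely contains the site address somewhere in its query string.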

