Python

[DCInside] Web Crawling Comments on the Kim So-hye Board

by 퍼포먼스마케팅코더 2017. 1. 6.

from bs4 import BeautifulSoup
import urllib.request

if __name__ == "__main__":
    # Fetch the board's list page
    req = urllib.request.Request("http://gall.dcinside.com/board/lists/?id=kimsohye")
    data = urllib.request.urlopen(req).read()

    # Parse the HTML and collect every <a> tag
    bs = BeautifulSoup(data, 'html.parser')
    links = bs.find_all('a')

    for idx, s in enumerate(links):
        try:
            prop = s.get('class')  # list of class names, or None if the tag has no class
            # Keep only links whose first class is "icon_pic_n"
            if prop and prop[0] == "icon_pic_n":
                print("%s : %s" % (s.get('href'), s.get_text()))
        except UnicodeEncodeError:
            # The console encoding could not represent the link text
            print("Error : %d" % idx)
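The filtering step can also be pulled out into a small function that takes an HTML string, which makes it easy to try without hitting the site. This is only a sketch: the name `extract_links`, the `SAMPLE_HTML` markup, and the `other_icon` class are made up for illustration, and the real page's markup may differ.

```python
from bs4 import BeautifulSoup


def extract_links(html, css_class="icon_pic_n"):
    """Return (href, text) pairs for <a> tags whose first class equals css_class."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for tag in soup.find_all("a"):
        classes = tag.get("class")  # list of class names, or None
        if classes and classes[0] == css_class:
            results.append((tag.get("href"), tag.get_text(strip=True)))
    return results


# Hypothetical sample markup, not copied from the real board page.
SAMPLE_HTML = """
<a class="icon_pic_n" href="/post/1">first post</a>
<a class="other_icon" href="/post/2">second post</a>
<a href="/post/3">no class</a>
"""

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_HTML):
        print("%s : %s" % (href, text))
```

To use it on the live page, pass the bytes returned by `urllib.request.urlopen(req).read()` as the `html` argument.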
