[Naver] Crawling the Web Document Section

import urllib.request
import urllib.parse
from bs4 import BeautifulSoup

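# Naver Open API endpoint for web document search (XML response)
defaultURL = 'https://openapi.naver.com/v1/search/webkr.xml?'
# The first parameter follows the '?' directly, so it takes no leading '&'
start = 'start=1'
display = '&display=100'
# input() already returns a str; quote_plus percent-encodes the keyword for the URL
query = '&query=' + urllib.parse.quote_plus(input("Keyword: "))
fullURL = defaultURL + start + display + query
print(fullURL)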

# Output file for the results (adjust the path for your system)
file = open("C:\\Python34\\naver_web_4.txt", "w", encoding='utf-8')

# Replace the two X-Naver-* placeholder values with your own API credentials
headers = {
    'Host': 'openapi.naver.com',
    'User-Agent': 'curl/7.49.1',
    'Accept': '*/*',
    'Content-Type': 'application/xml',
    'X-Naver-Client-Id': 'Naver Client Id',
    'X-Naver-Client-Secret': 'Naver Client Secret'
}

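req = urllib.request.Request(fullURL, headers=headers)
# Close the HTTP response automatically once the body has been read
with urllib.request.urlopen(req) as f:
    resultXML = f.read()
# Use the 'xml' parser (requires the lxml package). 'html.parser' treats <link>
# as an empty HTML tag, so item.link.get_text() would come back empty.
xmlsoup = BeautifulSoup(resultXML, 'xml')
items = xmlsoup.find_all('item')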

for item in items:
    # One record per line; fields are separated by a literal backslash
    file.write('Web document title : ' + item.title.get_text(strip=True))
    file.write('\\Web document description : ' + item.description.get_text(strip=True))
    file.write('\\Web document link : ' + item.link.get_text(strip=True) + '\n')

file.close()
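
If installing an extra XML parser is not an option, the same extraction can be sketched with only the standard library. The block below is a minimal sketch, not the method used in the script above: `parse_items` and `SAMPLE_XML` are hypothetical names, and the sample is hand-written on the assumption that the response is RSS-style XML with `<item>` elements containing `<title>`, `<link>`, and `<description>`. It runs offline, so no API credentials are needed to try it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like an RSS-style search response; not real API output.
SAMPLE_XML = """<rss version="2.0">
  <channel>
    <item>
      <title>First result</title>
      <link>https://example.com/a</link>
      <description>Summary A</description>
    </item>
    <item>
      <title>Second result</title>
      <link>https://example.com/b</link>
      <description></description>
    </item>
  </channel>
</rss>"""


def parse_items(xml_text):
    """Return a list of dicts with title, description, and link for each <item>."""
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter('item'):
        results.append({
            # findtext returns None for a missing tag, so fall back to ''
            'title': (item.findtext('title') or '').strip(),
            'description': (item.findtext('description') or '').strip(),
            'link': (item.findtext('link') or '').strip(),
        })
    return results


for row in parse_items(SAMPLE_XML):
    print(row['title'], row['link'])
```

In the script above, `resultXML` could be passed to `parse_items` in place of `SAMPLE_XML`, since `ET.fromstring` also accepts bytes.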

Blog: 퍼포먼스 마케팅 데이터 분석 (Performance Marketing Data Analysis)
