<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:media="http://search.yahoo.com/mrss/"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Backup &#8211; 투데이즈.kr</title>
	<atom:link href="https://2days.kr/tag/%EB%B0%B1%EC%97%85/feed/" rel="self" type="application/rss+xml" />
	<link>https://2days.kr</link>
	<description>투데이즈</description>
	<lastBuildDate>Sun, 16 Nov 2025 13:17:31 +0000</lastBuildDate>
	<language>ko-KR</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.8</generator>

<image>
	<url>https://2days.kr/wp-content/uploads/2025/10/cropped-simbol-1-32x32.png</url>
	<title>Backup &#8211; 투데이즈.kr</title>
	<link>https://2days.kr</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Backing Up Tistory Posts and Images with Python</title>
		<link>https://2days.kr/27/09/22/6738/it/program/</link>
		
		<dc:creator><![CDATA[urjent]]></dc:creator>
		<pubDate>Wed, 27 Sep 2023 13:37:43 +0000</pubDate>
				<category><![CDATA[program]]></category>
		<category><![CDATA[backup]]></category>
		<category><![CDATA[save]]></category>
		<category><![CDATA[crawling]]></category>
		<category><![CDATA[Tistory]]></category>
		<category><![CDATA[Python]]></category>
		<guid isPermaLink="false">https://2days.kr/?p=6738</guid>

					<description><![CDATA[If you need to back up a Tistory blog, you may want to use Python (scraping) to save your posts and images to your PC.

This post walks through backing up Tistory with Python.]]></description>
										<content:encoded><![CDATA[<p data-ke-size="size16">If you need to back up a Tistory blog, you may want to use Python (scraping) to save your posts and images to your PC.</p>
<p data-ke-size="size16">This post walks through backing up Tistory with Python.</p>
<h4 data-ke-size="size20"><b>The environment the code runs in, and what it does</b></h4>
<p data-ke-size="size16">-The blog uses the Book Club skin, with posts spread over 7 to 8 categories.</p>
<p data-ke-size="size16">-Post addresses are set to the numeric format.</p>
<p data-ke-size="size16">After inspecting the HTML in the browser developer tools (F12):</p>
<p data-ke-size="size16">-Scraping is done with requests and BeautifulSoup.</p>
<p data-ke-size="size16">-<a href="https://ko.wikipedia.org/wiki/Python_Imaging_Library" target="_blank" rel="noopener">PIL Image</a> is used to work around image URLs that carry no visible file extension when downloading.</p>
<p data-ke-size="size16">Some images had a src that ended in the exact extension (.jpg, .png), but others had URLs of the following form, with no extension at all.</p>
<pre id="code_1630939194313" class="python" data-ke-language="python" data-ke-type="codeblock"><code class="hljs">&lt;img srcset=<span class="hljs-string">"https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=http%3A%2F%2Fcfile1.uf.tistory.com%2Fimage%2F993693465F20BB0F1FAFB6"</span> src=<span class="hljs-string">"https://t1.daumcdn.net/cfile/tistory/993693465F20BB0F1F"</span> 

//i1.daumcdn.net/thumb/C176x120/?fname=https://t1.daumcdn.net/cfile/tistory/<span class="hljs-number">99400</span>A3F5F21057413</code></pre>
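<p data-ke-size="size16">The thumbnail URLs above appear to carry the original image address in an fname query parameter. That reading is inferred only from the examples shown here, not from any documented behavior, and the helper name original_image_url is hypothetical. A minimal sketch, using only the standard library, of pulling that address out:</p>

```python
from urllib.parse import urlsplit, parse_qs


def original_image_url(thumb_url):
    """Return the URL carried in the fname query parameter, or None if absent."""
    query = urlsplit(thumb_url).query
    # parse_qs percent-decodes values, so %3A%2F%2F becomes ://
    values = parse_qs(query).get("fname")
    return values[0] if values else None


# Percent-encoded form (as seen in srcset) and plain form (as seen in '//' thumbnails)
encoded = ("https://img1.daumcdn.net/thumb/R1280x0/"
           "?fname=http%3A%2F%2Fcfile1.uf.tistory.com%2Fimage%2F993693465F20BB0F1FAFB6")
plain = ("//i1.daumcdn.net/thumb/C176x120/"
         "?fname=https://t1.daumcdn.net/cfile/tistory/99400A3F5F21057413")

print(original_image_url(encoded))
print(original_image_url(plain))
```

<p data-ke-size="size16">A URL with no fname parameter returns None, so the caller can fall back to the src value as-is.</p>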
<p data-ke-size="size16">Inspecting such an image with PIL Image gives the following information:</p>
<pre class="python" data-ke-language="python" data-ke-type="codeblock"><code class="hljs">img_url: https://t1.daumcdn.net/cfile/tistory/<span class="hljs-number">992895395</span>F2040A804
img_format: PNG
imge_size: (<span class="hljs-number">830</span>, <span class="hljs-number">1019</span>)
<span class="hljs-built_in">len</span>(이미지): <span class="hljs-number">41568</span></code></pre>
<h4 data-ke-size="size20"><b>Source code</b></h4>
<pre id="code_1630939263129" class="python" data-ke-language="python" data-ke-type="codeblock"><code class="hljs"><span class="hljs-keyword">from</span> bs4 <span class="hljs-keyword">import</span> BeautifulSoup
<span class="hljs-keyword">import</span> requests
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> PIL <span class="hljs-keyword">import</span> Image


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">tistory_backup</span><span class="hljs-params">(post_num)</span>:</span>

    <span class="hljs-keyword">for</span> num <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(<span class="hljs-number">1</span>, post_num + <span class="hljs-number">1</span>):
        url = <span class="hljs-string">'https://your Tistory URL/'</span> + <span class="hljs-built_in">str</span>(num)
        response = requests.get(url)
        soup = BeautifulSoup(response.text, <span class="hljs-string">'lxml'</span>)
        
        <span class="hljs-comment">### Post title</span>
        titles = soup.select_one(<span class="hljs-string">'#content &gt; div.inner &gt; div.post-cover &gt; div &gt; h1'</span>)
        
        <span class="hljs-comment">### Publication date</span>
        date = soup.select_one(<span class="hljs-string">'#content &gt; div.inner &gt; div.post-cover &gt; div &gt; span.meta &gt; span.date'</span>)
        
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> titles <span class="hljs-keyword">or</span> <span class="hljs-keyword">not</span> date:
            <span class="hljs-keyword">continue</span>
        
        <span class="hljs-built_in">print</span>(titles.text)    
        <span class="hljs-built_in">print</span>(date.text)
        
        <span class="hljs-comment">### Post body</span>
        entry_content = soup.find(<span class="hljs-string">'div'</span>, {<span class="hljs-string">'class'</span>:<span class="hljs-string">'entry-content'</span>})
        <span class="hljs-built_in">print</span>(entry_content.get_text())
        
        <span class="hljs-comment"># Reuse the page parsed above instead of requesting the same URL again.</span>
        <span class="hljs-comment"># Select only src values that start with https; src values starting with '//' are excluded.</span>
        imgs = soup.select(<span class="hljs-string">'img[src^=https]'</span>)
        <span class="hljs-built_in">print</span>(<span class="hljs-string">f'Number of images: <span class="hljs-subst">{<span class="hljs-built_in">len</span>(imgs)}</span>'</span>)
        <span class="hljs-comment"># print(imgs)</span>
        
        <span class="hljs-comment"># Create the output directory for this post (parent directory included)</span>
        os.makedirs(<span class="hljs-string">'tistoryBackup/post_'</span> + <span class="hljs-built_in">str</span>(num), exist_ok=<span class="hljs-literal">True</span>)
        
        cnt = <span class="hljs-number">1</span>
        <span class="hljs-keyword">for</span> img <span class="hljs-keyword">in</span> imgs:
            img_url = img[<span class="hljs-string">'src'</span>]
            
            <span class="hljs-comment">## Detect the image format with Pillow's Image</span>
            imageObj = Image.<span class="hljs-built_in">open</span>(requests.get(img_url, stream=<span class="hljs-keyword"><span class="hljs-literal">True</span></span>).raw)
            img_format = imageObj.<span class="hljs-built_in">format</span>
            imge_size = imageObj.size
            <span class="hljs-built_in">print</span>(<span class="hljs-string">f'img_url: <span class="hljs-subst">{img_url}</span>'</span>)
            <span class="hljs-built_in">print</span>(<span class="hljs-string">f'img_format: <span class="hljs-subst">{img_format}</span>'</span>)
            <span class="hljs-built_in">print</span>(<span class="hljs-string">f'imge_size: <span class="hljs-subst">{imge_size}</span>'</span>)
            <span class="hljs-built_in">print</span>(<span class="hljs-string">f'os.path.basename(img_url): <span class="hljs-subst">{os.path.basename(img_url)}</span>'</span>)
            
            res_img = requests.get(img_url).content
            <span class="hljs-built_in">print</span>(<span class="hljs-string">f'len(이미지): <span class="hljs-subst">{<span class="hljs-built_in">len</span>(res_img)}</span>'</span>)  <span class="hljs-comment"># .content in requests returns bytes</span>
            
            <span class="hljs-keyword">if</span> img_url.split(<span class="hljs-string">'.'</span>)[<span class="hljs-number">-1</span>] <span class="hljs-keyword">in</span> [<span class="hljs-string">'png'</span>, <span class="hljs-string">'jpg'</span>]:
                img_name = <span class="hljs-built_in">str</span>(num) + <span class="hljs-string">'_'</span> + <span class="hljs-built_in">str</span>(cnt) + <span class="hljs-string">'_'</span> + os.path.basename(img_url)
            <span class="hljs-keyword">else</span>:
                img_name = <span class="hljs-built_in">str</span>(num) + <span class="hljs-string">'_'</span> + <span class="hljs-built_in">str</span>(cnt) + <span class="hljs-string">'_'</span> + <span class="hljs-string">'no_filename_img.'</span> + img_format
            
            <span class="hljs-built_in">print</span>(img_name)
            
            <span class="hljs-keyword">if</span> <span class="hljs-built_in">len</span>(res_img) &gt; <span class="hljs-number">100</span>:  <span class="hljs-comment"># Save only images larger than 100 bytes</span>
                <span class="hljs-keyword">with</span> <span class="hljs-built_in">open</span>(<span class="hljs-string">'./tistoryBackup/post_'</span> + <span class="hljs-built_in">str</span>(num) + <span class="hljs-string">'/'</span> + img_name, <span class="hljs-string">'wb'</span>) <span class="hljs-keyword">as</span> f:
                    f.write(res_img)
                cnt += <span class="hljs-number">1</span>
        
        title_content = titles.text + <span class="hljs-string">'\n'</span> + date.text +  <span class="hljs-string">'\n'</span> + entry_content.get_text()
        filename = <span class="hljs-built_in">str</span>(num) + <span class="hljs-string">'_tistory_title_content.txt'</span>
        <span class="hljs-keyword">with</span> <span class="hljs-built_in">open</span>(<span class="hljs-string">'./tistoryBackup/post_'</span> + <span class="hljs-built_in">str</span>(num) + <span class="hljs-string">'/'</span> + filename, <span class="hljs-string">'w'</span>, encoding=<span class="hljs-string">'utf-8'</span>) <span class="hljs-keyword">as</span> f:
            f.write(title_content)
        
tistory_backup(<span class="hljs-number">20</span>)</code></pre>
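<p data-ke-size="size16">In tistory_backup(20), the 20 is the number in the post address.<br />
That is, the posts from https://abc4u.tistory.com/1 through https://abc4u.tistory.com/20 are scraped. Pass the number of your most recent post and everything from post 1 up to that number is extracted. Numbers where no title or date is found are skipped.</p>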
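<p data-ke-size="size16">The number-to-URL mapping described above can be sketched on its own, without any network access. The helper name post_urls is hypothetical and is not part of the script above; it only illustrates which addresses a given argument covers.</p>

```python
def post_urls(blog_url, last_post_num):
    """Build the numeric post URLs from 1 up to last_post_num, inclusive."""
    base = blog_url.rstrip("/")  # tolerate a trailing slash in the blog address
    return [f"{base}/{num}" for num in range(1, last_post_num + 1)]


urls = post_urls("https://abc4u.tistory.com", 20)
print(urls[0])    # first post address
print(urls[-1])   # last post address
print(len(urls))  # number of addresses that will be requested
```

<p data-ke-size="size16">Listing the addresses first is a cheap way to check the range before starting a long backup run.</p>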
<figure class="imageblock alignLeft" data-ke-mobilestyle="widthOrigin" data-filename="tistory_backup2.JPG" data-origin-width="1095" data-origin-height="480"><span data-url="https://blog.kakaocdn.net/dn/XAcpk/btrec5uIPJw/6pa3ixI8zMWmcNKcRnVtf1/img.jpg" data-lightbox="lightbox" data-alt="Tistory backup"><img decoding="async" src="https://blog.kakaocdn.net/dn/XAcpk/btrec5uIPJw/6pa3ixI8zMWmcNKcRnVtf1/img.jpg" srcset="https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FXAcpk%2Fbtrec5uIPJw%2F6pa3ixI8zMWmcNKcRnVtf1%2Fimg.jpg" data-filename="tistory_backup2.JPG" data-origin-width="1095" data-origin-height="480" alt="img" title="Backing Up Tistory Posts and Images with Python 1"></span><figcaption>Tistory backup</figcaption></figure>

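<p data-ke-size="size16">Running the source code creates the folders shown in the Explorer screenshot above, saves each post's text as a .txt file, and saves every image in that post under a newly generated name.</p>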
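<p data-ke-size="size16">The renaming step can be isolated as a small pure function, which makes it easy to check without downloading anything. This is a sketch under assumptions, not the script's exact behavior: the name backup_image_name is hypothetical, the extension is taken from the URL path rather than the whole URL, and the detected format is lowercased (the script above appends it unchanged, for example .PNG).</p>

```python
import os
from urllib.parse import urlsplit


def backup_image_name(post_num, index, img_url, img_format):
    """Name an image as post_index_basename, adding an extension if the URL has none."""
    basename = os.path.basename(urlsplit(img_url).path)
    ext = os.path.splitext(basename)[1].lower()
    if ext in (".png", ".jpg"):
        # The URL already ends in a usable extension: keep its file name
        return f"{post_num}_{index}_{basename}"
    # No usable extension: fall back to the format detected by Pillow, e.g. "PNG"
    return f"{post_num}_{index}_no_filename_img.{img_format.lower()}"


print(backup_image_name(3, 1, "https://t1.daumcdn.net/cfile/tistory/992895395F2040A804", "PNG"))
print(backup_image_name(3, 2, "https://blog.kakaocdn.net/dn/XAcpk/btrec5uIPJw/6pa3ixI8zMWmcNKcRnVtf1/img.jpg", "JPEG"))
```

<p data-ke-size="size16">Here img_format stands for the value read from Image.format in the script, and the post number and counter play the same role as num and cnt.</p>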
]]></content:encoded>
					
		
		
		<media:content url="https://2days.kr/wp-content/uploads/2023/02/python_web-parsing-e1619745103425-862x575-1.jpg" medium="image"></media:content>
            	</item>
	</channel>
</rss>
