
[scrapyd] Trial and error getting scrapyd to run

# Environment: CentOS 7.3.1611, Python 3.5, virtualenv

# install
$ pip install scrapyd # installs 1.1.1 (as of 2017-04-05)

# start
$ scrapyd

# but it fails with the error below

=> TypeError: a bytes-like object is required, not 'str'
# Cause: scrapyd 1.1.1 does not support Python 3 yet
# so reinstall as below
$ pip install 'scrapyd==1.2.0a1'
ref: https://github.com/scrapy/scrapyd/issues/143
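For context, the message above is the generic Python 3 error for mixing bytes and str. The snippet below is not scrapyd's code, only a minimal sketch of the same class of failure; the sample value is made up for illustration.

```python
# Hypothetical sample data: bytes, as they would arrive from a socket or pipe.
line = b"project=crawlproject"

try:
    # A str separator applied to bytes works on Python 2 but fails on Python 3.
    line.split("=")
except TypeError as exc:
    message = str(exc)

# Decoding to str first (or using a bytes separator, b"=") avoids the error.
key, value = line.decode("utf-8").split("=")
print(message)
print(key, value)
```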

# start
$ scrapyd

# install scrapyd-client: provides scrapyd-deploy, which makes deploying easier
$ pip install scrapyd-client

# deploy
$ scrapyd-deploy
cf. for the configuration file, see https://github.com/scrapy/scrapyd-client

# scheduling
$ curl http://localhost:6800/schedule.json -d project=crawlproject -d spider=crawlspider

# cancel scheduling
$ curl http://localhost:6800/cancel.json -d project=crawlproject -d job=<jobid>
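The same two calls can be made from Python. This is a minimal sketch using only the standard library; the helper names (build_request, call, schedule, cancel) are made up for illustration, and it assumes scrapyd is listening on localhost:6800 as in the curl commands above and answers schedule.json with a JSON body containing a jobid field.

```python
import json
import urllib.parse
import urllib.request

SCRAPYD_URL = "http://localhost:6800"  # same host and port as the curl calls


def build_request(endpoint, **params):
    """Build a POST request for a scrapyd JSON endpoint such as schedule.json."""
    data = urllib.parse.urlencode(params).encode("ascii")
    return urllib.request.Request(SCRAPYD_URL + "/" + endpoint, data=data)


def call(endpoint, **params):
    """Send the request and decode the JSON reply."""
    with urllib.request.urlopen(build_request(endpoint, **params)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def schedule(project, spider):
    """Schedule a run and return the job id reported by scrapyd."""
    return call("schedule.json", project=project, spider=spider)["jobid"]


def cancel(project, job):
    """Cancel a job previously returned by schedule()."""
    return call("cancel.json", project=project, job=job)
```

With a running scrapyd, schedule("crawlproject", "crawlspider") returns the job id that cancel() expects as its job argument.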

 

# Error: raised when issuing the schedule command
TypeError: __init__() got an unexpected keyword argument '_job'
# Fix: change the signature of the spider's __init__ method as follows
Before: def __init__(self):
After:  def __init__(self, **kwargs):

# cf. http://stackoverflow.com/questions/17975472/scrapyd-init-error-when-running-scrapy-spider
# Note: running scrapy crawl spidername directly, without scheduling through scrapyd, works fine, because only scrapyd passes the extra _job argument
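A small sketch of why the signature change helps. To keep it runnable without Scrapy installed, the Spider class below is a stand-in that only mimics how scrapy.Spider stores extra keyword arguments as attributes; the spider classes and the job id value are hypothetical.

```python
class Spider:
    """Stand-in for scrapy.Spider: keeps extra keyword arguments as attributes."""

    name = None

    def __init__(self, name=None, **kwargs):
        if name is not None:
            self.name = name
        self.__dict__.update(kwargs)


class BrokenSpider(Spider):
    name = "crawlspider"

    def __init__(self):  # no room for the _job argument scrapyd passes
        super().__init__()


class FixedSpider(Spider):
    name = "crawlspider"

    def __init__(self, **kwargs):  # accepts _job and any other spider arguments
        super().__init__(**kwargs)


try:
    BrokenSpider(_job="abc123")
except TypeError as exc:
    error = str(exc)

spider = FixedSpider(_job="abc123")
print(error)
print(spider._job)
```

Forwarding **kwargs to the base class, rather than just swallowing them, keeps the job id available on the spider instance.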

# Impressions after using scrapyd: the web UI is poor, so scrapinghub looks like the better choice.

The exception is when the IP address the crawl originates from matters, in which case scrapinghub cannot be used.

[scrapinghub] Adding pymysql

Adding pymysql to scrapinghub

My work server runs CentOS 7.
I use Python 3 because I need Korean text support.
For connecting to MySQL from Python 3, I chose pymysql because it is easy to install.

 

Overview: how to install the pymysql dependency when working with scrapinghub
Summary: list the packages that can be fetched from PyPI in requirements.txt.

Stack configuration

Settings for using Python 3

Edit scrapinghub.yml as follows

[scrapinghub.yml]

projects:
  default: 1111
requirements:
  file: ./requirements.txt
stacks:
  default: scrapy:1.3-py3

 

Adding dependencies

Reference: adding dependency modules
c.f. http://help.scrapinghub.com/scrapy-cloud/migrating-dependencies-to-scrapy-cloud-20

$ shub migrate-eggs # pulls the dependency modules from scrapinghub down to the local machine

Add the following to requirements.txt.

[requirements.txt]

PyMySQL==0.7.10

$ shub deploy # uploads to the cloud

A Few Handy File Transfer Tools (All Written in Python)

Source: http://www.willdonnelly.net/blog/file-transfer/

 

File transfer: the eternal problem.

The file is on your computer, you want to get it onto your friend’s computer, and you’re in a hurry. Luckily, there are tools to help. All of these tools are single files, written in Python, and (probably) work on Windows (although I haven’t checked).

Single-File Tools

Droopy

Droopy is a wonderful little script. It’s wonderfully useful when your friend has a file they need to send to you, and you don’t want to make them install anything. Just run it (possibly with some arguments) and it will start a little miniature web-server to which anyone can upload a file, and then it exits. Command line options make it possible to optionally display a message and/or an icon on the upload page.

Woof

Woof (Web Offer One File) is the opposite of Droopy. Instead of receiving a single file, Woof sends one. It can also optionally send an entire folder as a single tar archive. Another nice feature of woof is its “-s” option, which will cause the client to be served an identical copy of the woof.py script.
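To make the idea concrete, here is a minimal sketch of a "serve one file, then stop" server built on the standard library. It is not Woof's actual code; the function names and the default port are assumptions chosen for illustration.

```python
import http.server
import os
import shutil


def make_one_file_server(path, host="127.0.0.1", port=0):
    """Return an HTTPServer that answers every GET with the file at `path`.

    port=0 lets the operating system pick a free port.
    """
    filename = os.path.basename(path)
    size = os.path.getsize(path)

    class OneFileHandler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "application/octet-stream")
            self.send_header("Content-Length", str(size))
            self.send_header(
                "Content-Disposition", 'attachment; filename="%s"' % filename
            )
            self.end_headers()
            with open(path, "rb") as f:
                shutil.copyfileobj(f, self.wfile)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return http.server.HTTPServer((host, port), OneFileHandler)


def offer_one_file(path, host="127.0.0.1", port=8080):
    """Serve `path` to exactly one client, then shut down."""
    server = make_one_file_server(path, host, port)
    try:
        server.handle_request()  # blocks until a single request is handled
    finally:
        server.server_close()
```

Calling offer_one_file("report.pdf", host="0.0.0.0") would make the file reachable from other machines on the network; the default of 127.0.0.1 keeps it local.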

Heavyweight Contenders

These tools are meant for jobs a little larger than the ones above. The tools already mentioned concern themselves with a single all-important file transfer, which is all well and good. But like the difference between a claw hammer and a sledgehammer, sometimes you just need a tool with a little more heft. That’s where these scripts come in.

Python HTTP Server

This tool is only barely fancy enough to need mentioning, but the operative word here is “barely”. Unknown to almost everyone, a full-featured static HTTP server lurks within the standard python executable under the guise of an “example”. All that you have to do is call “python -m SimpleHTTPServer” in a console window, and it will happily begin serving up the current directory on port 8000. (That command is for Python 2; on Python 3 the module was renamed, so the equivalent is “python3 -m http.server”.)
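The same server can also be started from code. The sketch below assumes Python 3.7 or later (for the directory argument and ThreadingHTTPServer); the function names are made up for illustration.

```python
import functools
import http.server


def make_directory_server(directory, host="127.0.0.1", port=8000):
    """Return a server that exposes `directory` read-only over HTTP."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer((host, port), handler)


def serve_directory(directory=".", port=8000):
    """Serve `directory` until interrupted with Ctrl-C."""
    server = make_directory_server(directory, port=port)
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        server.server_close()
```

serve_directory() with no arguments behaves much like the one-line command: the current directory on port 8000, but bound to 127.0.0.1 only.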

pyftpdlib

All of these scripts are wonderful tools, just perfect for their intended niches. But sometimes your needs are more complex. What if you need to both send and receive multiple files? In that case our old standby, FTP, comes to hand, and the pyftpdlib library should be our tool of choice. While at first glance it looks rather daunting, a tarball filled with numerous files and subdirectories, the library itself consists of just one file, ftpserver.py. With a few modifications, this file can be the perfect portable, install-less FTP server.

I spent about an hour today putting together a modified version and now have about the nicest little FTP server I’ve ever had the joy of playing with. Just download the FTP server library, replace the test() and main functions at the end with my 50-line changes (basically just giving it command-line options), and you’ll have a user-friendly tool for sharing any kind of files you may need. I love collecting little tools like this. One never knows when they might come in handy.

[shared_ptr]

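3) Accessing the referenced object
There are two ways to get at the actual object a shared_ptr refers to: explicit and implicit.
Explicit
shared_ptr::get()
: returns the address of the referenced object.
Implicit
shared_ptr::operator*
: returns the referenced object itself.
: that is, it is equivalent to *(get())
shared_ptr::operator->
: equivalent to get()->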

[VB] Swapping rows and columns


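' Requires: Imports System.Data
Public Function PivotTable(ByVal source As DataTable) As DataTable
    ' Transpose source: its first column supplies the new column headers,
    ' and each remaining column becomes one row of the result.
    Dim dest As New DataTable("Pivoted" + source.TableName)

    ' The first destination column holds the original column names.
    dest.Columns.Add(" ")

    ' One destination column per source row, named after the row's first cell.
    Dim r As DataRow
    For Each r In source.Rows
        dest.Columns.Add(r(0).ToString())
    Next r

    ' One destination row per source column, excluding the first.
    Dim i As Integer
    For i = 0 To (source.Columns.Count - 1) - 1
        dest.Rows.Add(dest.NewRow())
    Next i

    ' Fill in the cells: column 0 gets the name, the rest get the values.
    For i = 0 To dest.Rows.Count - 1
        Dim c As Integer
        For c = 0 To dest.Columns.Count - 1
            If c = 0 Then
                dest.Rows(i)(0) = source.Columns(i + 1).ColumnName
            Else
                dest.Rows(i)(c) = source.Rows(c - 1)(i + 1)
            End If
        Next c
    Next i
    dest.AcceptChanges()
    Return dest
End Function 'PivotTable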

Source: http://aspalliance.com/538_CodeSnip_Pivot_Tables_with_ADONET_and_Display_in_a_DataGrid_Paged_Horizontally
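To see what the pivot does on concrete data, here is the same transformation sketched in Python on plain lists. It is an illustration of the logic only, not a port of the ADO.NET API, and the sample table is made up.

```python
def pivot_table(columns, rows):
    """Pivot a table given as a list of column names plus a list of row lists.

    As in the VB function, the first source column supplies the new column
    headers, and every other source column becomes one output row.
    """
    # New header: a blank label column, then one column per source row.
    new_columns = [" "] + [str(row[0]) for row in rows]
    new_rows = []
    for i in range(1, len(columns)):
        # The row starts with the old column name, followed by its values.
        new_rows.append([columns[i]] + [row[i] for row in rows])
    return new_columns, new_rows


# Hypothetical sample table.
columns = ["Year", "Sales", "Cost"]
rows = [[2015, 10, 7], [2016, 12, 8]]

new_columns, new_rows = pivot_table(columns, rows)
print(new_columns)  # [' ', '2015', '2016']
print(new_rows)     # [['Sales', 10, 12], ['Cost', 7, 8]]
```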