1. JSON

JSON stands for JavaScript Object Notation, a lightweight data format whose syntax was influenced by JavaScript. Thanks to its simplicity and flexibility, JSON is widely used as a data interchange format, especially for exchanging data between web browsers and web servers. The most commonly used JSON form is a collection of key-value pairs.

Python ships with a JSON standard library (json); you make it available with "import json" (note: Python 2.6 and later).
With the json library you can convert Python objects to JSON strings (JSON encoding) and convert JSON strings back into Python types (JSON decoding).

2. JSON Encoding

Converting a Python object (dictionary, list, tuple, etc.) into a JSON string is called JSON encoding. To encode, first import the json library, then pass the Python object to json.dumps(), which returns a string.

For example, the code below encodes a Python dictionary named customer into a JSON string. The result, jsonString, is a string (type str) holding the JSON representation.

import json
 
# sample Python dictionary
customer = {
    'id': 152352,
    'name': '강진수',
    'history': [
        {'date': '2015-03-11', 'item': 'iPhone'},
        {'date': '2016-02-23', 'item': 'Monitor'},
    ]
}
 
# JSON encoding
jsonString = json.dumps(customer)
 
# print the resulting string
print(jsonString)
print(type(jsonString))   # <class 'str'>

Running the code above prints the JSON string as one long line. This compact form is handy for sending data to another machine or over the network, but it is hard to read when you need to display it. To make the JSON string readable, pass the "indent" option to json.dumps() as shown below. Below the code is the indented JSON string.

jsonString = json.dumps(customer, indent=4)
print(jsonString)
{
    "history": [
        {
            "date": "2015-03-11",
            "item": "iPhone"
        },
        {
            "date": "2016-02-23",
            "item": "Monitor"
        }
    ],
    "id": 152352,
    "name": "\uac15\uc9c4\uc218"
}
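
Note that json.dumps() escapes non-ASCII characters by default, which is why the Korean name above appears as \uac15\uc9c4\uc218. If you want the characters kept as-is in the output, pass ensure_ascii=False as well:

jsonString = json.dumps(customer, indent=4, ensure_ascii=False)
print(jsonString)   # "name": "강진수"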

3. JSON Decoding

Converting a JSON string into Python types (dictionary, list, tuple, etc.) is called JSON decoding. Decoding uses json.loads() to turn the string into a Python object.

The example below converts a JSON string into a Python dictionary.

import json
 
# sample JSON string
jsonString = '{"name": "강진수", "id": 152352, "history": [{"date": "2015-03-11", "item": "iPhone"}, {"date": "2016-02-23", "item": "Monitor"}]}'
 
# JSON decoding (named 'customer' rather than 'dict' to avoid shadowing the built-in)
customer = json.loads(jsonString)
 
# check the dictionary data
print(customer['name'])
for h in customer['history']:
    print(h['date'], h['item'])


Classes vs. functions: if functions get the job done, why on earth should I use a class?

https://stackoverflow.com/questions/18202818/classes-vs-functions


Functions are easy to understand even for someone without any programming experience, but with a fair math background. On the other hand, classes seem to be more difficult to grasp.

Let's say I want to make a class/function that calculates the age of a person given his/her birthday year and the current year. Should I create a class for this, or a function? Or is the choice dependent on the scenario?

P.S. I am working on Python, but I guess the question is generic.

  • Classes are for bigger product. In simple terms, thinking of nut and bolt of a car as objects, while even car is an object too. If you are writing sample programs for fun, stick with functions. – Saran-san Aug 13 '13 at 7:17
  • I wrote up my thoughts here – Toby May 1 '14 at 19:15

8 Answers


Create a function. Functions do specific things, classes are specific things.

Classes often have methods, which are functions that are associated with a particular class, and do things associated with the thing that the class is - but if all you want is to do something, a function is all you need.

Essentially, a class is a way of grouping functions (as methods) and data (as properties) into a logical unit revolving around a certain kind of thing. If you don't need that grouping, there's no need to make a class.

  • 4
    This is not exactly true. Classes also allow for dispatching on types. If you have a group or template of collaborating functions it may well be a one shot execution but nevertheless using a class buys you the possibility of selectively implementing different functions in the group in order to get specializations of the template. This is not so easily done by plain functions, because there is no type to act as a link between them and to allow specialization on subtypes... – memeplex Mar 1 '16 at 7:08 
  •  
    ... Forcing your logic, you can include the class functions (for example as a vtable) as part of the state or data of the instance. The functions themselves are not stored within the instance but some sort of indirect reference to them instead. So the bottom line is that you may have no data (besides a pointer to a vtable), you may have no more than an instance, but still a class could be a good idea if you need to implement a family of related algorithms consisting of collaborating functions. – memeplex Mar 1 '16 at 7:09 
  •  
    To summarize: a class groups functions and data. There is no benefit to this under any circumstance. We have functions and data types and modules to group them together. Even if you need adhoc function polymorphism: different function implementations for different data types, there are much better solutions than OOP classes; consider Haskell-style type classes which are ultimately language agnostic. – clay May 23 '16 at 15:46

To sum up what the answers above are saying:

A class is a logical unit that groups multiple functions. Unless you gain something from grouping functions that way, there is no particular need to use a class.

 

So what does a class actually do for you?

1. In languages like Java, a class declares a type.

2. You can create an instance, initialize its values, and reuse its variables.

3. Inheritance makes efficient reuse of functions possible.

4. Abstract classes give developers a required skeleton of functions and force them to implement it, which improves reliability.

 

Today's main question: in Python or JavaScript, is there any reason to use classes?

First, Python and JavaScript have no class-based static type declarations, so you can't get benefit 1.


 

Even so, depending on your requirements, you can still get benefits 2, 3, and 4 from using classes, as the sketch below illustrates.
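
A rough Python sketch of points 2-4 (the class names here are invented for illustration):

from abc import ABC, abstractmethod

class Animal(ABC):
    def __init__(self, name):      # 2. create an instance and initialize its state
        self.name = name

    @abstractmethod
    def speak(self):               # 4. an abstract method forces subclasses to implement it
        ...

    def greet(self):               # 3. shared behavior reused through inheritance
        return f"{self.name} says {self.speak()}"

class Dog(Animal):
    def speak(self):
        return "woof"

print(Dog("Rex").greet())   # Rex says woof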

https://realpython.com/python-requests/#content


Python’s Requests Library (Guide)

by Alex Ronquillo, Jan 23, 2019

The requests library is the de facto standard for making HTTP requests in Python. It abstracts the complexities of making requests behind a beautiful, simple API so that you can focus on interacting with services and consuming data in your application.

Throughout this article, you’ll see some of the most useful features that requests has to offer as well as how to customize and optimize those features for different situations you may come across. You’ll also learn how to use requests in an efficient way as well as how to prevent requests to external services from slowing down your application.

In this tutorial, you’ll learn how to:

  • Make requests using the most common HTTP methods
  • Customize your requests’ headers and data, using the query string and message body
  • Inspect data from your requests and responses
  • Make authenticated requests
  • Configure your requests to help prevent your application from backing up or slowing down

Though I’ve tried to include as much information as you need to understand the features and examples included in this article, I do assume a very basic general knowledge of HTTP. That said, you still may be able to follow along fine anyway.

Now that that is out of the way, let’s dive in and see how you can use requests in your application!

Getting Started With requests

Let’s begin by installing the requests library. To do so, run the following command:

$ pip install requests

If you prefer to use Pipenv for managing Python packages, you can run the following:

$ pipenv install requests

Once requests is installed, you can use it in your application. Importing requests looks like this:

import requests

Now that you’re all set up, it’s time to begin your journey through requests. Your first goal will be learning how to make a GET request.

The GET Request

HTTP methods such as GET and POST determine which action you're trying to perform when making an HTTP request. Besides GET and POST, there are several other common methods that you'll use later in this tutorial.

One of the most common HTTP methods is GET. The GET method indicates that you’re trying to get or retrieve data from a specified resource. To make a GET request, invoke requests.get().

To test this out, you can make a GET request to GitHub’s Root REST API by calling get() with the following URL:

>>>
>>> requests.get('https://api.github.com')
<Response [200]>

Congratulations! You’ve made your first request. Let’s dive a little deeper into the response of that request.

The Response

Response is a powerful object for inspecting the results of the request. Let’s make that same request again, but this time store the return value in a variable so that you can get a closer look at its attributes and behaviors:

>>>
>>> response = requests.get('https://api.github.com')

In this example, you’ve captured the return value of get(), which is an instance of Response, and stored it in a variable called response. You can now use response to see a lot of information about the results of your GET request.

Status Codes

The first bit of information that you can gather from Response is the status code. A status code informs you of the status of the request.

For example, a 200 OK status means that your request was successful, whereas a 404 NOT FOUND status means that the resource you were looking for was not found. There are many other possible status codes as well to give you specific insights into what happened with your request.

By accessing .status_code, you can see the status code that the server returned:

>>>
>>> response.status_code
200

.status_code returned a 200, which means your request was successful and the server responded with the data you were requesting.

Sometimes, you might want to use this information to make decisions in your code:

if response.status_code == 200:
    print('Success!')
elif response.status_code == 404:
    print('Not Found.')

With this logic, if the server returns a 200 status code, your program will print Success!. If the result is a 404, your program will print Not Found.

requests goes one step further in simplifying this process for you. If you use a Response instance in a conditional expression, it will evaluate to True if the status code was between 200 and 400, and False otherwise.

Therefore, you can simplify the last example by rewriting the if statement:

if response:
    print('Success!')
else:
    print('An error has occurred.')

Keep in mind that this method is not verifying that the status code is equal to 200. The reason for this is that other status codes within the 200 to 400 range, such as 204 NO CONTENT and 304 NOT MODIFIED, are also considered successful in the sense that they provide some workable response.

For example, the 204 tells you that the response was successful, but there’s no content to return in the message body.

So, make sure you use this convenient shorthand only if you want to know if the request was generally successful and then, if necessary, handle the response appropriately based on the status code.

Let’s say you don’t want to check the response’s status code in an if statement. Instead, you want to raise an exception if the request was unsuccessful. You can do this using .raise_for_status():

import requests
from requests.exceptions import HTTPError

for url in ['https://api.github.com', 'https://api.github.com/invalid']:
    try:
        response = requests.get(url)

        # If the response was successful, no Exception will be raised
        response.raise_for_status()
    except HTTPError as http_err:
        print(f'HTTP error occurred: {http_err}')  # Python 3.6
    except Exception as err:
        print(f'Other error occurred: {err}')  # Python 3.6
    else:
        print('Success!')

If you invoke .raise_for_status(), an HTTPError will be raised for certain status codes. If the status code indicates a successful request, the program will proceed without that exception being raised.

Now, you know a lot about how to deal with the status code of the response you got back from the server. However, when you make a GET request, you rarely only care about the status code of the response. Usually, you want to see more. Next, you’ll see how to view the actual data that the server sent back in the body of the response.

Content

The response of a GET request often has some valuable information, known as a payload, in the message body. Using the attributes and methods of Response, you can view the payload in a variety of different formats.

To see the response’s content in bytes, you use .content:

>>>
>>> response = requests.get('https://api.github.com')
>>> response.content
b'{"current_user_url":"https://api.github.com/user","current_user_authorizations_html_url":"https://github.com/settings/connections/applications{/client_id}","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}","commit_search_url":"https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}","emails_url":"https://api.github.com/user/emails","emojis_url":"https://api.github.com/emojis","events_url":"https://api.github.com/events","feeds_url":"https://api.github.com/feeds","followers_url":"https://api.github.com/user/followers","following_url":"https://api.github.com/user/following{/target}","gists_url":"https://api.github.com/gists{/gist_id}","hub_url":"https://api.github.com/hub","issue_search_url":"https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}","issues_url":"https://api.github.com/issues","keys_url":"https://api.github.com/user/keys","notifications_url":"https://api.github.com/notifications","organization_repositories_url":"https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}","organization_url":"https://api.github.com/orgs/{org}","public_gists_url":"https://api.github.com/gists/public","rate_limit_url":"https://api.github.com/rate_limit","repository_url":"https://api.github.com/repos/{owner}/{repo}","repository_search_url":"https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}","current_user_repositories_url":"https://api.github.com/user/repos{?type,page,per_page,sort}","starred_url":"https://api.github.com/user/starred{/owner}{/repo}","starred_gists_url":"https://api.github.com/gists/starred","team_url":"https://api.github.com/teams","user_url":"https://api.github.com/users/{user}","user_organizations_url":"https://api.github.com/user/orgs","user_repositories_url":"https://api.github.com/users/{user}/repos{?type,page,per_page,sort}","user_search_url":"https://api.github.com/search/users?q={query}{&page,per_page,sort,order}"}'

While .content gives you access to the raw bytes of the response payload, you will often want to convert them into a string using a character encoding such as UTF-8. response will do that for you when you access .text:

>>>
>>> response.text
'{"current_user_url":"https://api.github.com/user","current_user_authorizations_html_url":"https://github.com/settings/connections/applications{/client_id}","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}","commit_search_url":"https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}","emails_url":"https://api.github.com/user/emails","emojis_url":"https://api.github.com/emojis","events_url":"https://api.github.com/events","feeds_url":"https://api.github.com/feeds","followers_url":"https://api.github.com/user/followers","following_url":"https://api.github.com/user/following{/target}","gists_url":"https://api.github.com/gists{/gist_id}","hub_url":"https://api.github.com/hub","issue_search_url":"https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}","issues_url":"https://api.github.com/issues","keys_url":"https://api.github.com/user/keys","notifications_url":"https://api.github.com/notifications","organization_repositories_url":"https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}","organization_url":"https://api.github.com/orgs/{org}","public_gists_url":"https://api.github.com/gists/public","rate_limit_url":"https://api.github.com/rate_limit","repository_url":"https://api.github.com/repos/{owner}/{repo}","repository_search_url":"https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}","current_user_repositories_url":"https://api.github.com/user/repos{?type,page,per_page,sort}","starred_url":"https://api.github.com/user/starred{/owner}{/repo}","starred_gists_url":"https://api.github.com/gists/starred","team_url":"https://api.github.com/teams","user_url":"https://api.github.com/users/{user}","user_organizations_url":"https://api.github.com/user/orgs","user_repositories_url":"https://api.github.com/users/{user}/repos{?type,page,per_page,sort}","user_search_url":"https://api.github.com/search/users?q={query}{&page,per_page,sort,order}"}'

Because the decoding of bytes to a str requires an encoding scheme, requests will try to guess the encoding based on the response’s headers if you do not specify one. You can provide an explicit encoding by setting .encoding before accessing .text:

>>>
>>> response.encoding = 'utf-8' # Optional: requests infers this internally
>>> response.text
'{"current_user_url":"https://api.github.com/user","current_user_authorizations_html_url":"https://github.com/settings/connections/applications{/client_id}","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}","commit_search_url":"https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}","emails_url":"https://api.github.com/user/emails","emojis_url":"https://api.github.com/emojis","events_url":"https://api.github.com/events","feeds_url":"https://api.github.com/feeds","followers_url":"https://api.github.com/user/followers","following_url":"https://api.github.com/user/following{/target}","gists_url":"https://api.github.com/gists{/gist_id}","hub_url":"https://api.github.com/hub","issue_search_url":"https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}","issues_url":"https://api.github.com/issues","keys_url":"https://api.github.com/user/keys","notifications_url":"https://api.github.com/notifications","organization_repositories_url":"https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}","organization_url":"https://api.github.com/orgs/{org}","public_gists_url":"https://api.github.com/gists/public","rate_limit_url":"https://api.github.com/rate_limit","repository_url":"https://api.github.com/repos/{owner}/{repo}","repository_search_url":"https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}","current_user_repositories_url":"https://api.github.com/user/repos{?type,page,per_page,sort}","starred_url":"https://api.github.com/user/starred{/owner}{/repo}","starred_gists_url":"https://api.github.com/gists/starred","team_url":"https://api.github.com/teams","user_url":"https://api.github.com/users/{user}","user_organizations_url":"https://api.github.com/user/orgs","user_repositories_url":"https://api.github.com/users/{user}/repos{?type,page,per_page,sort}","user_search_url":"https://api.github.com/search/users?q={query}{&page,per_page,sort,order}"}'

If you take a look at the response, you’ll see that it is actually serialized JSON content. To get a dictionary, you could take the str you retrieved from .text and deserialize it using json.loads(). However, a simpler way to accomplish this task is to use .json():

>>>
>>> response.json()
{'current_user_url': 'https://api.github.com/user', 'current_user_authorizations_html_url': 'https://github.com/settings/connections/applications{/client_id}', 'authorizations_url': 'https://api.github.com/authorizations', 'code_search_url': 'https://api.github.com/search/code?q={query}{&page,per_page,sort,order}', 'commit_search_url': 'https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}', 'emails_url': 'https://api.github.com/user/emails', 'emojis_url': 'https://api.github.com/emojis', 'events_url': 'https://api.github.com/events', 'feeds_url': 'https://api.github.com/feeds', 'followers_url': 'https://api.github.com/user/followers', 'following_url': 'https://api.github.com/user/following{/target}', 'gists_url': 'https://api.github.com/gists{/gist_id}', 'hub_url': 'https://api.github.com/hub', 'issue_search_url': 'https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}', 'issues_url': 'https://api.github.com/issues', 'keys_url': 'https://api.github.com/user/keys', 'notifications_url': 'https://api.github.com/notifications', 'organization_repositories_url': 'https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}', 'organization_url': 'https://api.github.com/orgs/{org}', 'public_gists_url': 'https://api.github.com/gists/public', 'rate_limit_url': 'https://api.github.com/rate_limit', 'repository_url': 'https://api.github.com/repos/{owner}/{repo}', 'repository_search_url': 'https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}', 'current_user_repositories_url': 'https://api.github.com/user/repos{?type,page,per_page,sort}', 'starred_url': 'https://api.github.com/user/starred{/owner}{/repo}', 'starred_gists_url': 'https://api.github.com/gists/starred', 'team_url': 'https://api.github.com/teams', 'user_url': 'https://api.github.com/users/{user}', 'user_organizations_url': 'https://api.github.com/user/orgs', 'user_repositories_url': 'https://api.github.com/users/{user}/repos{?type,page,per_page,sort}', 'user_search_url': 'https://api.github.com/search/users?q={query}{&page,per_page,sort,order}'}

The type of the return value of .json() is a dictionary, so you can access values in the object by key.

You can do a lot with status codes and message bodies. But, if you need more information, like metadata about the response itself, you’ll need to look at the response’s headers.

Headers

The response headers can give you useful information, such as the content type of the response payload and a time limit on how long to cache the response. To view these headers, access .headers:

>>>
>>> response.headers
{'Server': 'GitHub.com', 'Date': 'Mon, 10 Dec 2018 17:49:54 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Status': '200 OK', 'X-RateLimit-Limit': '60', 'X-RateLimit-Remaining': '59', 'X-RateLimit-Reset': '1544467794', 'Cache-Control': 'public, max-age=60, s-maxage=60', 'Vary': 'Accept', 'ETag': 'W/"7dc470913f1fe9bb6c7355b50a0737bc"', 'X-GitHub-Media-Type': 'github.v3; format=json', 'Access-Control-Expose-Headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type', 'Access-Control-Allow-Origin': '*', 'Strict-Transport-Security': 'max-age=31536000; includeSubdomains; preload', 'X-Frame-Options': 'deny', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '1; mode=block', 'Referrer-Policy': 'origin-when-cross-origin, strict-origin-when-cross-origin', 'Content-Security-Policy': "default-src 'none'", 'Content-Encoding': 'gzip', 'X-GitHub-Request-Id': 'E439:4581:CF2351:1CA3E06:5C0EA741'}

.headers returns a dictionary-like object, allowing you to access header values by key. For example, to see the content type of the response payload, you can access Content-Type:

>>>
>>> response.headers['Content-Type']
'application/json; charset=utf-8'

There is something special about this dictionary-like headers object, though. The HTTP spec defines headers to be case-insensitive, which means we are able to access these headers without worrying about their capitalization:

>>>
>>> response.headers['content-type']
'application/json; charset=utf-8'

Whether you use the key 'content-type' or 'Content-Type', you’ll get the same value.

Now, you’ve learned the basics about Response. You’ve seen its most useful attributes and methods in action. Let’s take a step back and see how your responses change when you customize your GET requests.

Query String Parameters

One common way to customize a GET request is to pass values through query string parameters in the URL. To do this using get(), you pass data to params. For example, you can use GitHub's Search API to look for the requests library:

import requests

# Search GitHub's repositories for requests
response = requests.get(
    'https://api.github.com/search/repositories',
    params={'q': 'requests+language:python'},
)

# Inspect some attributes of the `requests` repository
json_response = response.json()
repository = json_response['items'][0]
print(f'Repository name: {repository["name"]}')  # Python 3.6+
print(f'Repository description: {repository["description"]}')  # Python 3.6+

By passing the dictionary {'q': 'requests+language:python'} to the params parameter of .get(), you are able to modify the results that come back from the Search API.

You can pass params to get() in the form of a dictionary, as you have just done, or as a list of tuples:

>>>
>>> requests.get(
...     'https://api.github.com/search/repositories',
...     params=[('q', 'requests+language:python')],
... )
<Response [200]>

You can even pass the values as bytes:

>>>
>>> requests.get(
...     'https://api.github.com/search/repositories',
...     params=b'q=requests+language:python',
... )
<Response [200]>

Query strings are useful for parameterizing GET requests. You can also customize your requests by adding or modifying the headers you send.

Request Headers

To customize headers, you pass a dictionary of HTTP headers to get() using the headers parameter. For example, you can change your previous search request to highlight matching search terms in the results by specifying the text-match media type in the Accept header:

import requests

response = requests.get(
    'https://api.github.com/search/repositories',
    params={'q': 'requests+language:python'},
    headers={'Accept': 'application/vnd.github.v3.text-match+json'},
)

# View the new `text-matches` array which provides information
# about your search term within the results
json_response = response.json()
repository = json_response['items'][0]
print(f'Text matches: {repository["text_matches"]}')

The Accept header tells the server what content types your application can handle. In this case, since you’re expecting the matching search terms to be highlighted, you’re using the header value application/vnd.github.v3.text-match+json, which is a proprietary GitHub Accept header where the content is a special JSON format.

Before you learn more ways to customize requests, let’s broaden the horizon by exploring other HTTP methods.

Other HTTP Methods

Aside from GET, other popular HTTP methods include POST, PUT, DELETE, HEAD, PATCH, and OPTIONS. requests provides a method, with a similar signature to get(), for each of these HTTP methods:

>>>
>>> requests.post('https://httpbin.org/post', data={'key':'value'})
>>> requests.put('https://httpbin.org/put', data={'key':'value'})
>>> requests.delete('https://httpbin.org/delete')
>>> requests.head('https://httpbin.org/get')
>>> requests.patch('https://httpbin.org/patch', data={'key':'value'})
>>> requests.options('https://httpbin.org/get')

Each function call makes a request to the httpbin service using the corresponding HTTP method. For each method, you can inspect their responses in the same way you did before:

>>>
>>> response = requests.head('https://httpbin.org/get')
>>> response.headers['Content-Type']
'application/json'

>>> response = requests.delete('https://httpbin.org/delete')
>>> json_response = response.json()
>>> json_response['args']
{}

Headers, response bodies, status codes, and more are returned in the Response for each method. Next you'll take a closer look at the POST, PUT, and PATCH methods and learn how they differ from the other request types.

The Message Body

According to the HTTP specification, POST, PUT, and the less common PATCH requests pass their data through the message body rather than through parameters in the query string. Using requests, you'll pass the payload to the corresponding function's data parameter.

data takes a dictionary, a list of tuples, bytes, or a file-like object. You’ll want to adapt the data you send in the body of your request to the specific needs of the service you’re interacting with.

For example, if your request’s content type is application/x-www-form-urlencoded, you can send the form data as a dictionary:

>>>
>>> requests.post('https://httpbin.org/post', data={'key':'value'})
<Response [200]>

You can also send that same data as a list of tuples:

>>>
>>> requests.post('https://httpbin.org/post', data=[('key', 'value')])
<Response [200]>

If, however, you need to send JSON data, you can use the json parameter. When you pass JSON data via json, requests will serialize your data and add the correct Content-Type header for you.

httpbin.org is a great resource created by the author of requests, Kenneth Reitz. It's a service that accepts test requests and responds with data about the requests. For instance, you can use it to inspect a basic POST request:

>>>
>>> response = requests.post('https://httpbin.org/post', json={'key':'value'})
>>> json_response = response.json()
>>> json_response['data']
'{"key": "value"}'
>>> json_response['headers']['Content-Type']
'application/json'

You can see from the response that the server received your request data and headers as you sent them. requests also provides this information to you in the form of a PreparedRequest.

Inspecting Your Request

When you make a request, the requests library prepares the request before actually sending it to the destination server. Request preparation includes things like validating headers and serializing JSON content.

You can view the PreparedRequest by accessing .request:

>>>
>>> response = requests.post('https://httpbin.org/post', json={'key':'value'})
>>> response.request.headers['Content-Type']
'application/json'
>>> response.request.url
'https://httpbin.org/post'
>>> response.request.body
b'{"key": "value"}'

Inspecting the PreparedRequest gives you access to all kinds of information about the request being made such as payload, URL, headers, authentication, and more.

So far, you’ve made a lot of different kinds of requests, but they’ve all had one thing in common: they’re unauthenticated requests to public APIs. Many services you may come across will want you to authenticate in some way.

Authentication

Authentication helps a service understand who you are. Typically, you provide your credentials to a server by passing data through the Authorization header or a custom header defined by the service. All the request functions you’ve seen to this point provide a parameter called auth, which allows you to pass your credentials.

One example of an API that requires authentication is GitHub’s Authenticated User API. This endpoint provides information about the authenticated user’s profile. To make a request to the Authenticated User API, you can pass your GitHub username and password in a tuple to get():

>>>
>>> from getpass import getpass
>>> requests.get('https://api.github.com/user', auth=('username', getpass()))
<Response [200]>

The request succeeded if the credentials you passed in the tuple to auth are valid. If you try to make this request with no credentials, you’ll see that the status code is 401 Unauthorized:

>>>
>>> requests.get('https://api.github.com/user')
<Response [401]>

When you pass your username and password in a tuple to the auth parameter, requests is applying the credentials using HTTP’s Basic access authentication scheme under the hood.

Therefore, you could make the same request by passing explicit Basic authentication credentials using HTTPBasicAuth:

>>>
>>> from requests.auth import HTTPBasicAuth
>>> from getpass import getpass
>>> requests.get(
...     'https://api.github.com/user',
...     auth=HTTPBasicAuth('username', getpass())
... )
<Response [200]>

Though you don’t need to be explicit for Basic authentication, you may want to authenticate using another method. requests provides other methods of authentication out of the box such as HTTPDigestAuth and HTTPProxyAuth.

You can even supply your own authentication mechanism. To do so, you must first create a subclass of AuthBase. Then, you implement __call__():

import requests
from requests.auth import AuthBase

class TokenAuth(AuthBase):
    """Implements a custom authentication scheme."""

    def __init__(self, token):
        self.token = token

    def __call__(self, r):
        """Attach an API token to a custom auth header."""
        r.headers['X-TokenAuth'] = f'{self.token}'  # Python 3.6+
        return r


requests.get('https://httpbin.org/get', auth=TokenAuth('12345abcde-token'))

Here, your custom TokenAuth mechanism receives a token, then includes that token in the X-TokenAuth header of your request.

Bad authentication mechanisms can lead to security vulnerabilities, so unless a service requires a custom authentication mechanism for some reason, you’ll always want to use a tried-and-true auth scheme like Basic or OAuth.

While you’re thinking about security, let’s consider dealing with SSL Certificates using requests.

SSL Certificate Verification

Any time the data you are trying to send or receive is sensitive, security is important. The way that you communicate with secure sites over HTTP is by establishing an encrypted connection using SSL, which means that verifying the target server’s SSL Certificate is critical.

The good news is that requests does this for you by default. However, there are some cases where you might want to change this behavior.

If you want to disable SSL Certificate verification, you pass False to the verify parameter of the request function:

>>>
>>> requests.get('https://api.github.com', verify=False)
InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
<Response [200]>

requests even warns you when you’re making an insecure request to help you keep your data safe!

Performance

When using requests, especially in a production application environment, it’s important to consider performance implications. Features like timeout control, sessions, and retry limits can help you keep your application running smoothly.

Timeouts

When you make an inline request to an external service, your system will need to wait upon the response before moving on. If your application waits too long for that response, requests to your service could back up, your user experience could suffer, or your background jobs could hang.

By default, requests will wait indefinitely on the response, so you should almost always specify a timeout duration to prevent these things from happening. To set the request’s timeout, use the timeout parameter. timeout can be an integer or float representing the number of seconds to wait on a response before timing out:

>>>
>>> requests.get('https://api.github.com', timeout=1)
<Response [200]>
>>> requests.get('https://api.github.com', timeout=3.05)
<Response [200]>

In the first request, the request will timeout after 1 second. In the second request, the request will timeout after 3.05 seconds.

You can also pass a tuple to timeout with the first element being a connect timeout (the time it allows for the client to establish a connection to the server), and the second being a read timeout (the time it will wait on a response once your client has established a connection):

>>>
>>> requests.get('https://api.github.com', timeout=(2, 5))
<Response [200]>

If the request establishes a connection within 2 seconds and receives data within 5 seconds of the connection being established, then the response will be returned as it was before. If the request times out, then the function will raise a Timeout exception:

import requests
from requests.exceptions import Timeout

try:
    response = requests.get('https://api.github.com', timeout=1)
except Timeout:
    print('The request timed out')
else:
    print('The request did not time out')

Your program can catch the Timeout exception and respond accordingly.

The Session Object

Until now, you’ve been dealing with high level requests APIs such as get() and post(). These functions are abstractions of what’s going on when you make your requests. They hide implementation details such as how connections are managed so that you don’t have to worry about them.

Underneath those abstractions is a class called Session. If you need to fine-tune your control over how requests are being made or improve the performance of your requests, you may need to use a Session instance directly.

Sessions are used to persist parameters across requests. For example, if you want to use the same authentication across multiple requests, you could use a session:

import requests
from getpass import getpass

# By using a context manager, you can ensure the resources used by
# the session will be released after use
with requests.Session() as session:
    session.auth = ('username', getpass())

    # Instead of requests.get(), you'll use session.get()
    response = session.get('https://api.github.com/user')

# You can inspect the response just like you did before
print(response.headers)
print(response.json())

Each time you make a request with session, once it has been initialized with authentication credentials, the credentials will be persisted.

The primary performance optimization of sessions comes in the form of persistent connections. When your app makes a connection to a server using a Session, it keeps that connection around in a connection pool. When your app wants to connect to the same server again, it will reuse a connection from the pool rather than establishing a new one.

Max Retries

When a request fails, you may want your application to retry the same request. However, requests will not do this for you by default. To apply this functionality, you need to implement a custom Transport Adapter.

Transport Adapters let you define a set of configurations per service you’re interacting with. For example, let’s say you want all requests to https://api.github.com to retry three times before finally raising a ConnectionError. You would build a Transport Adapter, set its max_retries parameter, and mount it to an existing Session:

import requests
from requests.adapters import HTTPAdapter
from requests.exceptions import ConnectionError

github_adapter = HTTPAdapter(max_retries=3)

session = requests.Session()

# Use `github_adapter` for all requests to endpoints that start with this URL
session.mount('https://api.github.com', github_adapter)

try:
    session.get('https://api.github.com')
except ConnectionError as ce:
    print(ce)

When you mount the HTTPAdapter, github_adapter, to session, session will adhere to its configuration for each request to https://api.github.com.

Timeouts, Transport Adapters, and sessions are for keeping your code efficient and your application resilient.

Conclusion

You’ve come a long way in learning about Python’s powerful requests library.

You’re now able to:

  • Make requests using a variety of different HTTP methods such as GET, POST, and PUT
  • Customize your requests by modifying headers, authentication, query strings, and message bodies
  • Inspect the data you send to the server and the data the server sends back to you
  • Work with SSL Certificate verification
  • Use requests effectively using max_retries, timeout, Sessions, and Transport Adapters

Because you learned how to use requests, you’re equipped to explore the wide world of web services and build awesome applications using the fascinating data they provide.


▶ Because Python normally uses virtual environments, a project does not carry its modules inside the project directory.
 


▶ Python is a multi-process, multi-threaded language

 JavaScript is a single-process, single-threaded, asynchronous language, whereas Python is multi-process, multi-threaded and synchronous. So to run something on a different schedule (like setTimeout or a timer), JavaScript just does it because it is asynchronous, but in Python you have to split the work off into a thread, as sketched below.
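
For example, a setTimeout-like delayed call can be sketched with threading.Timer, which runs the callback on a separate thread (a minimal illustration, not tied to any particular project):

import threading

def callback():
    print("fired after 3 seconds")

# roughly what setTimeout(callback, 3000) does in JavaScript
timer = threading.Timer(3.0, callback)
timer.start()

print("the main thread keeps running")   # printed immediately, before the callback fires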



▶ To check whether a key exists in a dict: if key in dict

  Using if dict[key] raises an error when the key is missing, but if key in dict returns True or False. (https://stackoverflow.com/questions/1602934/check-if-a-given-key-already-exists-in-a-dictionary)
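
A quick sketch of the difference:

d = {'name': 'foo'}

# d['age'] would raise KeyError because the key does not exist
if 'age' in d:             # evaluates to False instead of raising
    print(d['age'])

print(d.get('age', 0))     # .get() is another option: it returns a default instead of raising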



▶ Unless an error is raised/thrown up to the calling module, it will not be handled by the caller's except block.

 There are a few things to know when writing error-handling logic. For example, if an error raised as a ValueError gets swallowed and re-wrapped as a generic Exception, the error becomes ambiguous, which is poor practice. See the URL below for details.
(https://stackoverflow.com/questions/2052390/manually-raising-throwing-an-exception-in-python)
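
A small sketch of re-raising with a specific exception type so the caller's except block actually sees a meaningful error (the function and field names are made up):

def read_age(data):
    try:
        return int(data['age'])
    except KeyError as err:
        # re-raise so the caller can handle it; 'from err' keeps the original cause in the traceback
        raise ValueError("'age' field is missing") from err

try:
    read_age({})
except ValueError as e:
    print(f'caller handled it: {e}')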



▶ Having a subclass run its (abstract) parent's initializer: super(SubClass, self).__init__(x)

 https://stackoverflow.com/questions/3694371/how-do-i-initialize-the-base-super-class
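
A minimal sketch (the class names are made up):

class Base:
    def __init__(self, x):
        self.x = x

class SubClass(Base):
    def __init__(self, x, y):
        super().__init__(x)                    # Python 3 style
        # super(SubClass, self).__init__(x)    # equivalent, older spelling
        self.y = y

s = SubClass(1, 2)
print(s.x, s.y)   # 1 2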



▶ About the Python requests object

 https://valuefactory.tistory.com/524?category=765138



▶ Selecting the row with the max date using a window function instead of a join

  https://stackoverflow.com/questions/19432913/select-info-from-table-where-row-has-max-date



▶ The psycopg2 module does not commit unless you call connection.commit() explicitly; alternatively, you can enable autocommit.

 https://stackoverflow.com/questions/13715743/psycopg2-not-actually-inserting-data
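
A sketch of both options (the connection string here is a placeholder):

import psycopg2

conn = psycopg2.connect('dbname=mydb user=me')   # placeholder DSN

# option 1: commit explicitly
with conn.cursor() as cur:
    cur.execute('INSERT INTO items (name) VALUES (%s)', ('pen',))
conn.commit()        # without this, the INSERT is never persisted

# option 2: turn on autocommit for the connection
conn.autocommit = True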



▶ How to simply ignore an exception and move on: pass

  https://stackoverflow.com/questions/574730/python-how-to-ignore-an-exception-and-proceed
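
For example (risky_operation is a hypothetical function):

try:
    risky_operation()
except Exception:
    pass                 # swallow the error and keep going

# contextlib.suppress is a tidier spelling when you only want to ignore a specific type
from contextlib import suppress
with suppress(FileNotFoundError):
    open('missing.txt')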



▶ In Python it's if not condition:, in JavaScript it's if (!condition)
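
For example:

items = []

if not items:     # Python: true for an empty list, None, 0, '' ...
    print('empty')

# JavaScript equivalent: if (!items.length) { console.log('empty'); }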



▶ Documenting class and function parameters

https://stackoverflow.com/questions/34331088/how-to-comment-parameters-for-pydoc

"""
This example module shows various types of documentation available for use
with pydoc.  To generate HTML documentation for this module issue the
command:

    pydoc -w foo

"""

class Foo(object):
    """
    Foo encapsulates a name and an age.
    """
    def __init__(self, name, age):
        """
        Construct a new 'Foo' object.

        :param name: The name of foo
        :param age: The age of foo
        :return: returns nothing
        """
        self.name = name
        self.age = age

def bar(baz):
    """
    Prints baz to the display.
    """
    print(baz)

if __name__ == '__main__':
    f = Foo('John Doe', 42)
    bar("hello world")



▶ Ways to import a module from a parent directory

 https://valuefactory.tistory.com/525?category=765138



▶ Python project layout

 Python project structure: https://python-guideja.readthedocs.io/ja/latest/writing/structure.html

https://www.holaxprogramming.com/2017/06/28/python-project-structures/


▶ When you distribute a Python project, the source directory name becomes the module name, so watch out


In Python, the name of the source-code directory you created inside the project becomes the module name. That is why the usual layout is project -> application (package name), with the source code placed underneath.



Anyone who has published a nodejs project will know this, but in nodejs the flow is:


"look at the main key inside package.json -> when the package is imported -> run the file written in main".



In Python, however, the flow is


"find the directory that contains an __init__.py file and treat that directory's name as the module name",


so you also need to be careful about the name of the folder directly under the project that holds the source code. Name it src, nodejs-style, and you end up with imports that look like import src...

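For example, a layout like this (the names are illustrative) keeps imports readable:

myproject/
    setup.py
    myapp/             # this directory name becomes the module name
        __init__.py
        core.py

# users of the package then write:
#   import myapp
#   from myapp import core
# whereas a nodejs-style src/ directory would force the awkward 'import src'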


▶ [django] How to send a form object from a view to a template

https://www.journaldev.com/22424/django-forms


Add the following code in your forms.py file:


from django import forms

class MyForm(forms.Form):
    name = forms.CharField(label='Enter your name', max_length=100)
    email = forms.EmailField(label='Enter your email', max_length=100)
    feedback = forms.CharField(widget=forms.Textarea(attrs={'width': "100%", 'cols': "80", 'rows': "20", }))

We've added three fields: a CharField, an EmailField, and a CharField rendered as a Textarea with its width and height specified.

The code for views.py file is given below:


from django.shortcuts import render
from responseapp.forms import MyForm

def responseform(request):
    form = MyForm()

    return render(request, 'responseform.html', {'form': form})




▶ Debugging with pdb

http://racchai.hatenablog.com/entry/2016/05/30/070000


 import pdb; pdb.set_trace()
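
Drop that line wherever you want execution to pause; a tiny sketch:

def divide(a, b):
    import pdb; pdb.set_trace()   # execution stops here and opens the (Pdb) prompt
    return a / b

divide(10, 2)
# at the (Pdb) prompt: 'p a' prints a, 'n' steps to the next line, 'c' continues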



▶ [django] Passing dynamic values from a template to the URL dispatcher


...
<form action="{% url 'simulation:result' domain %}" method="post">
... <!-- here 'simulation' is the app_name, 'result' is the path's name, and domain fills its parameter -->



app_name = 'simulation'
urlpatterns = [
path('hcheck', views.get_hcheck, name='hcheck'),
path('<str:domain>/', views.get_page, name='index'),
path('result/<str:domain>/', views.get_graphNTable, name='result'),
path('renew/<str:domain>/<str:sId>/', views.renew_page, name='renew'),
]


▶ From Python 3.6 on, dicts preserve insertion order

https://stackoverflow.com/questions/39980323/are-dictionaries-ordered-in-python-3-6/39980744


Are dictionaries ordered in Python 3.6+?

They are insertion ordered[1]. As of Python 3.6, for the CPython implementation of Python, dictionaries remember the order of items inserted. This is considered an implementation detail in Python 3.6; you need to use OrderedDict if you want insertion ordering that's guaranteed across other implementations of Python (and other ordered behavior[1]).

As of Python 3.7, this is no longer an implementation detail and instead becomes a language feature. From a python-dev message by GvR:

Make it so. "Dict keeps insertion order" is the ruling. Thanks!

This simply means that you can depend on it. Other implementations of Python must also offer an insertion ordered dictionary if they wish to be a conforming implementation of Python 3.7.
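
A quick check:

d = {}
d['b'] = 1
d['a'] = 2
d['c'] = 3
print(list(d))   # ['b', 'a', 'c'] -- insertion order, not sorted order (guaranteed from Python 3.7)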




An introduction to logging

Introduction

If you search for Python logging, the Qiita article https://qiita.com/amedama/items/b856b2f30c2f38665701 ranks near the top, but honestly I think it only makes things more confusing. This article is in two parts: the basic usage, and how logging is actually used in three well-known libraries.

Who this is for

  • People who want to use logging in a library they are writing

  • People who want to stop print-debugging at some point, but have half given up after seeing the mixed bag of logging explanations out there, and aren't quite sure which explanation is actually correct

What this article does differently

  • A lot of blog posts and articles explain logging from an opinionated "do it my way" point of view[1], so wherever possible I cite the official sources.

  • To back up the "basic usage" section, I show how well-known external libraries (gensim.word2vec, requests, google-api-python-client) actually use logging, and how the user side should handle those libraries' logging.

Basic usage

To go back to the source, the explanation follows PEP 282, where logging is specified.

First, logging is written differently on the library side and on the side that uses the library (the user side). Mixing these two up always tangles the discussion, so keep the distinction firmly in mind.
On the library side, you define a logger per module and use debug, info, (warning) and so on depending on the information you want to log[2].

Library side

On the library side, you put logger = logging.getLogger(__name__) at the top[3] and then simply write logger.debug(msg) wherever you want to log something. (__name__ is the name of the library's module; see the import-related module attributes for details.)
The detailed output settings are configured on the user side, so no other code is needed[4].

mod.py
import logging
# __name__ is the name of this module
logger = logging.getLogger(__name__)

class Hoge():
    def __init__(self):
        pass
    def do(self):
        # do sth
        # Detailed information, typically of interest only when diagnosing problems.
        logger.debug("test")
        # Confirmation that things are working as expected.
        logger.info("info")

def do():
    # do sth
    logger.debug("testtest")

Nice and concise!

User side

Separately from the library side, let's move on to how logging is written on the user side. Here "user" means someone who uses the library, not the library's developer.
On the user side you can (if needed) specify the logging level, the output file, and which modules to log. If you write nothing at all, logs of WARNING and above are emitted to sys.stderr.
As explained above, this part should not be implemented on the library side.
Below we'll look at the logging patterns you'll use most often[5].

For the explanations, assume a library structured something like this.

Library structure (example)
repo_name
  │ 
  ├── my_pac
  │   ├── __init__.py
  │   ├── mod01.py
  │   ├── mod02.py
  │   └── mod03.py
  ├──  setup.py
  ├──  README.md
  └── .gitignore

Emitting logs to sys.stderr for my_pac.mod01 only

test.py, test.ipynb, etc.
import logging
# emit logs to sys.stderr only for my_pac.mod01
logging.basicConfig() # must be called first (see the notes as well); the default is level=logging.WARNING
logging.getLogger("my_pac.mod01").setLevel(level=logging.INFO)

With this setting, logs at logging.INFO and above, that is logging.INFO, logging.WARNING, logging.ERROR (and logging.CRITICAL), are emitted for my_pac.mod01. (Pass logging.DEBUG to setLevel and logging.DEBUG messages are emitted as well)[6]

Just to be safe: whether or not you did from my_pac import mod01, write the full name my_pac.mod01 without omitting the package name.

Emitting logs for several modules (here, my_pac's mod01 and mod02)

test.py, test.ipynb, etc.
import logging
logging.basicConfig() # the default is level=logging.WARNING
module_levels = {"my_pac.mod01": logging.DEBUG, "my_pac.mod02": logging.INFO}
for module, level in module_levels.items():
    logging.getLogger(module).setLevel(level=level)

In short, after logging.basicConfig(), call logging.getLogger(module).setLevel(level=level) for each module you care about.

Emitting logs for every module under my_pac (mod01, mod02, mod03)

test.py, test.ipynb, etc.
import logging
logging.basicConfig(level=logging.ERROR) # if you don't want WARNING-level output either, just set the level option

# emit DEBUG-level logs for every module under my_pac
# (other modules have their logging level set to ERROR, so they emit nothing)
logging.getLogger("my_pac").setLevel(level=logging.DEBUG)

Note) This has nothing whatsoever to do with the __all__ you specify in __init__.py. (__all__ only determines what gets imported when you write from my_pac import * with an asterisk.)

Logging at logging.DEBUG overall, but suppressing one module only (say, my_pac.mod01) by raising its level to logging.ERROR

test.py, test.ipynb, etc.
import logging
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("my_pac.mod01").setLevel(level=logging.ERROR)

Writing logs to a log file (at DEBUG level) and to sys.stderr (at INFO level) at the same time

The point to note here is that we want every module's logs in the file, so basicConfig needs the filename and format settings[7]. The sys.stderr logger is then configured individually afterwards.

test.py, test.ipynb, etc.
import logging
# For the format syntax, see https://docs.python.jp/3/library/logging.html#logrecord-attributes
_detail_formatting = "%(relativeCreated)08d[ms] - %(name)s - %(levelname)s - %(processName)-10s - %(threadName)s -\n*** %(message)s"

# First configure the file; we want detailed logs there
logging.basicConfig(
    level=logging.DEBUG,
    format=_detail_formatting, # the output format can be changed
    filename="./sample.log", # where the log file lives
)

# Next, configure the sys.stderr logger
# http://docs.python-guide.org/en/latest/writing/logging/#example-configuration-directly-in-code is also helpful

# send log output to sys.stderr
console = logging.StreamHandler()
# the format can be changed per handler
console_formatter = logging.Formatter("%(relativeCreated)07d[ms] : %(name)s : %(message)s")
console.setFormatter(console_formatter)
# rough error information is enough for sys.stderr, so use INFO
console.setLevel(logging.INFO)
# now that the 'console' handler is configured, attach it to the modules you want with addHandler
logging.getLogger("my_pac.mod01").addHandler(console)
# to log for several modules, do the same as in the 'several modules' example above
# logging.getLogger("my_pac.mod02").addHandler(console)

Note) When logging writes to a file, the default mode is a (append). Given how logging works this is only natural[8], but it differs from the default mode of open() and friends, so it's a small thing to watch out for.


Those are probably the patterns you will use most often.

Addendum) When the library side and the user side live in a single file, that is, when you run it as a script under if __name__ == "__main__":, the same thinking applies. For details, see
https://gist.github.com/podhmo/7fcc1024dec64b42074c3d3d5c789f98

Here I'll look at how some well-known libraries use logging and how best to handle their logs.
I'll introduce three: a normal one, one with no logging, and an awkward one.

gensim.word2vec

First, the normal one. Let's see how logging is used in an external library that I use fairly often.
gensim itself is a huge collection of libraries, so here I'll look at the code of word2vec.py[9]
=> https://github.com/RaRe-Technologies/gensim/blob/e92b45d3f83ad670da9cf6ae20ae86a2a1c8558c/gensim/models/word2vec.py [10]

The GitHub links are included, so jump over if you're curious.

For example, line 128 creates logger = logging.getLogger(__name__),
and info and debug calls are inserted where needed (for example, lines 504-507).
As for how the user side (the __main__ part) is implemented, lines 1720-1722 do it the same way as the logging usage described above.

requests

Next up is the requests library, which everyone has used at least once. It is officially a wrapper around urllib3.
requests itself contains no logging[11], so writing logging.getLogger("requests").setLevel(level=logging.DEBUG) does nothing. If you want the logs coming from urllib3, configure the following on the user side.

test.py, test.ipynb, etc.
import logging
logging.basicConfig()
logging.getLogger("urllib3").setLevel(level=logging.DEBUG)

# put in your code!
import requests
r = requests.get('http://google.com/')

Running this emits the following as logs:

sys.stderr
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): google.com
DEBUG:urllib3.connectionpool:http://google.com:80 "GET / HTTP/1.1" 302 268
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): www.google.co.jp
DEBUG:urllib3.connectionpool:http://www.google.co.jp:80 "GET /?gfe_rd=cr&dcr=0&ei=M7PoWZDjG7DEXqujgGA HTTP/1.1" 200 5207

Alternatively, the official docs apparently make it possible to print detailed information to sys.stdout[12].

test.py, test.ipynb, etc.
# Enabling debugging at http.client level (requests->urllib3->http.client)
# you will see the REQUEST, including HEADERS and DATA, and RESPONSE with HEADERS but without DATA.
# the only thing missing will be the response.body which is not logged.
try: # for Python 3
    from http.client import HTTPConnection
except ImportError:
    from httplib import HTTPConnection
HTTPConnection.debuglevel = 1

# put in your code!
import requests
r = requests.get('http://google.com/')

With this, you get output like:

sys.stdout
send: b'GET / HTTP/1.1\r\nHost: google.com\r\nUser-Agent: python-requests/2.18.4\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 302 Found\r\n'
header: Cache-Control header: Content-Type header: Referrer-Policy header: Location header: Content-Length header: Date send: b'GET /?gfe_rd=cr&dcr=0&ei=ubLoWZ3fFbHEXpy5jagC HTTP/1.1\r\nHost: www.google.co.jp\r\nUser-Agent: python-requests/2.18.4\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Date header: Expires header: Cache-Control header: Content-Type header: P3P header: Content-Encoding header: Server header: Content-Length header: X-XSS-Protection header: X-Frame-Options header: Set-Cookie header: Set-Cookie 

(That said, this information is not emitted to sys.stderr and can't be managed with the logging module, so I'm not sure how much I like it.)

google-api-python-client

Next is google-api-python-client. From a logging point of view this one is a nuisance:
even if you set the level to INFO and above, it still spits out a large amount of log output[13].

test.py, test.ipynb, etc.
import apiclient, httplib2
import logging
_detail_format = "%(asctime)s - %(levelname)s - line %(lineno)d - %(name)s - %(filename)s - \n*** %(message)s"
logging.basicConfig(
    format=_detail_format,
    level=logging.DEBUG,
) # the default is level=logging.WARNING

h = httplib2.Http()
refresh_token = "aaaaaaaaaaaaaaaaaa" # set an appropriate value
h.default_headers = {"x-oauth-refreshtoken" : refresh_token}
service = apiclient.discovery.build("analytics", "v3", http=h)
# afterwards, pass a dict as the query and get the response
# response = service.data().ga().get(**query).execute()

For example, when I ran this, my environment has a newer oauth2client version, so a flood of warnings came out saying there was nothing to import[14]:

sys.stderr
2017-10-20 14:35:37,819 - WARNING - line 44 - googleapiclient.discovery_cache - __init__.py - 
*** file_cache is unavailable when using oauth2client >= 4.0.0
Traceback (most recent call last):
  File "/Users/knknkn/Documents/test/venv/lib/python3.6/site-packages/googleapiclient/discovery_cache/__init__.py", line 36, in autodetect
    from google.appengine.api import memcache
ModuleNotFoundError: No module named 'google'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/knknkn/Documents/test/venv/lib/python3.6/site-packages/googleapiclient/discovery_cache/file_cache.py", line 33, in <module>
    from oauth2client.contrib.locked_file import LockedFile
ModuleNotFoundError: No module named 'oauth2client.contrib.locked_file'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/knknkn/Documents/test/venv/lib/python3.6/site-packages/googleapiclient/discovery_cache/file_cache.py", line 37, in <module>
    from oauth2client.locked_file import LockedFile
ModuleNotFoundError: No module named 'oauth2client.locked_file'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/knknkn/Documents/test/venv/lib/python3.6/site-packages/googleapiclient/discovery_cache/__init__.py", line 41, in autodetect
    from . import file_cache
  File "/Users/knknkn/Documents/test/venv/lib/python3.6/site-packages/googleapiclient/discovery_cache/file_cache.py", line 41, in <module>
    'file_cache is unavailable when using oauth2client >= 4.0.0')
ImportError: file_cache is unavailable when using oauth2client >= 4.0.0
2017-10-20 14:35:37,821 - INFO - line 274 - googleapiclient.discovery - discovery.py - 
*** URL being requested: GET https://www.googleapis.com/discovery/v1/apis/analytics/v3/rest

Incidentally, the traceback is printed along with the log because, as you can see in this part of the googleapiclient code, the exc_info option of logging.info/debug is set to True[15].

These import warnings seem to come from googleapiclient.discovery_cache, so if you only want to suppress them, configure the user side like this (see Basic usage > User side > suppressing logging above):

test.py, test.ipynb, etc.
import logging
logging.basicConfig(format="%(asctime)s - %(levelname)s - line %(lineno)d - %(name)s - %(filename)s - \n*** %(message)s") # the default is level=logging.WARNING
logging.getLogger("googleapiclient.discovery_cache").setLevel(level=logging.ERROR) # the flood of ImportErrors is not wanted, so silence this logger
logging.getLogger("googleapiclient.discovery").setLevel(level=logging.DEBUG) # but we do want this log

Summary

As far as logging goes, you will be fine if you stick to the basic patterns described above.


Supplement

Here I jot down the bits I could not cover in the main part.

About the first argument of logging.getLogger

The first argument to getLogger is an identifier attached to the logger, and normally you pass the module name. So the usual pattern is to write logger = logging.getLogger(__name__) in every module (__name__ holds the module's name).
In other words, the point I wanted to make, stated plainly, is that you normally do not identify loggers at the class level.
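As a minimal sketch of that per-module convention (somemodule.py is just a hypothetical file name):

somemodule.py
import logging

# the logger is named after the module, e.g. "somemodule" or "package.somemodule"
logger = logging.getLogger(__name__)

def do_work():
    logger.debug("detailed diagnostic information")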

That said, you can do it if you really want to:

mod.py
import logging

class Hoge():
    @classmethod
    def get_logger(cls):
        return logging.getLogger("{}.{}".format(__name__, cls.__name__))
    def __init__(self):
        pass
    def do(self):
        # do sth
        # detailed information, mainly of interest when diagnosing a problem
        self.get_logger().debug("hoge_test")

You can write it that way, but the usual practice is to emit logs per module.
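For example, assuming mod.py is importable as shown above, calling it emits a record under the logger name mod.Hoge:

import logging
from mod import Hoge

logging.basicConfig(level=logging.DEBUG)
Hoge().do()   # logged under the name "mod.Hoge"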






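How to import modules from parent, child, and sibling folders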
First, assume the project is structured as follows.
project	
	-- test
		+-- sub1
			-- __init__.py
			-- aa.py
			-- bb.py
		+-- sub2
			-- __init__.py
			-- cc.py
			-- dd.py
		-- ee.py
		-- ff.py
		-- __init__.py
	-- gg.py

Referencing other modules from ee.py (files in a subfolder or in the same folder)

This case is simple.

# to reference aa.py
from sub1 import aa
 
# to reference ff.py
import ff
# or
from . import ff # "from ." means the same folder

Referencing a file in a parent folder

For example, there are three ways for aa.py to reference cc.py, which lives in sub2.

1. Add the parent folder's absolute path to the module search path

If you add code like the following at the top of the module, before the imports that need it (ideally at the very start of the application), the problem is solved.

# aa.py
 
import os
import sys
sys.path.append(os.path.dirname(os.path.abspath(os.path.dirname(__file__))))
 
from sub2 import cc
This works by finding the current module's absolute path and appending its parent folder's absolute path to the search path. The code above adds the path one level up.
If aa.py needed to reference gg.py instead, it would have to add the path two levels up, so the code gets longer, as shown below.


# aa.py

import os
import sys

sys.path.append(os.path.dirname(os.path.abspath(os.path.dirname(os.path.abspath(os.path.dirname(__file__))))))

import gg
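As an aside, the same path arithmetic reads a little more clearly with pathlib; a minimal sketch with the same effect as the three nested dirname calls above:

# aa.py
import sys
from pathlib import Path

# parents[0] is sub1, parents[1] is test, parents[2] is the project folder
sys.path.append(str(Path(__file__).resolve().parents[2]))

import gg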


2. Add the project to the PYTHONPATH environment variable

This works on the same principle that lets every Python project import built-in modules without any setup: you add your own project to the set of paths the Python interpreter searches by default.

Windows

Go to Control Panel - System - Advanced - Environment Variables, edit PYTHONPATH, and append your project's home folder at the end.

Linux

If the project's absolute path is /home/user/project, add the following to .bash_profile in your home directory to update the environment variable.

$ vi ~/.bash_profile

========= .bash_profile =========
...
PYTHONPATH=$PYTHONPATH:/home/user/project
export PYTHONPATH
============================
$ source ~/.bash_profile
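To check that the variable is actually picked up, you can print the interpreter's search path; the project folder should appear in the list:

import sys
print(sys.path)  # /home/user/project should be included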


3. Relative imports using dots (.)

This approach reaches modules in parent packages using dot notation, analogous to relative paths like ../../.

In the example below, if life.py wants to use nib.py, from ... import nib does the job.

Reference: https://stackoverflow.com/questions/714063/importing-modules-from-parent-folder


ptdraft/
    nib.py
    simulations/
        life/
            life.py


from ... import nib
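Note that relative imports like this only work when life.py is executed as part of the package (for example with python -m ptdraft.simulations.life.life from the directory above ptdraft, with an __init__.py in each folder), not when it is run directly as a standalone script.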















A quick summary of the Python requests module

This is a quick rundown of requests, the Python module for sending HTTP requests.

0. Basic usage

import requests
URL = 'http://www.tistory.com'
response = requests.get(URL)
response.status_code
response.text

(screenshot: python-requests-get-example)

This is exactly like visiting Tistory in a web browser. We sent a GET request to the address www.tistory.com; the server received the request, did some processing, and returned a response to us, the requester. First, the response came with status code 200, an OK sign meaning the Tistory server handled the request and returned a normal response. And the content of the response? As you can see, HTML code.
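For instance, you can inspect that response directly (the actual HTML will of course change over time):

print(response.status_code)   # 200
print(response.text[:200])    # the beginning of the returned HTML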

1. Passing parameters in a GET request

params = {'param1': 'value1', 'param2': 'value'}
res = requests.get(URL, params=params)

(screenshot: python-requests-post-example)

Through the response object res, I checked what URL was actually requested: the requests module combined the URL I gave with the parameters to build the proper request. Rather than typing that URL out by hand, I think it is much better to organize the parameters as a dictionary and hand them to the requests module.
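You can check the constructed URL yourself; note that the order of the parameters in the query string is not guaranteed:

print(res.url)
# e.g. http://www.tistory.com/?param1=value1&param2=value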

2. Passing data in a POST request

Same as above; just pass it under the name data instead of params.

data = {'param1': 'value1', 'param2': 'value'}
res = requests.post(URL, data=data)

Sometimes you need to send a POST request with a more complex, nested structure. In that case you cannot just pass the dictionary naively as above; you have to serialize it into a string while preserving its structure, and the friend that does that work in Python is the json module.

import requests, json
data = {'outer': {'inner': 'value'}}
res = requests.post(URL, data=json.dumps(data))
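For reference, recent versions of requests can do this serialization themselves through the json keyword argument, which also sets the Content-Type header to application/json:

res = requests.post(URL, json=data)  # roughly equivalent to data=json.dumps(data) plus the JSON header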

3. Adding headers and cookies

To attach extra headers, use the headers option; to send cookies along with the request, use the cookies option.

headers = {'Content-Type': 'application/json; charset=utf-8'}
cookies = {'session_id': 'sorryidontcare'}
res = requests.get(URL, headers=headers, cookies=cookies)

4. The Response object

When you send a request, you get back a response, and naturally that response arrives as a Python object with plenty of information and functionality. You can get a feel for it quickly with <Tab> completion in ipython or jupyter notebook, but I will record a few things here.

(screenshot: python-requests-response-object) In an ipython session, res.<Tab> lets you browse the available attributes and methods.


res.request # the request object that was sent
res.status_code # the response status code
res.text # the body decoded as text; it is a good idea to set the encoding explicitly
res.content # the body as raw bytes
res.raise_for_status() # raises an error for 4xx/5xx responses
res.json() # for a JSON response, parses the body straight into a dictionary
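Putting a couple of these together, a typical pattern looks like this (api.github.com is just an arbitrary JSON-returning endpoint used for illustration):

import requests

res = requests.get('https://api.github.com')
res.raise_for_status()                  # fail fast on 4xx/5xx responses
payload = res.json()                    # parse the JSON body into a dict
print(payload.get('current_user_url'))  # one of the keys this endpoint returns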

If I find the time later, and once I understand it better, I would like to add something about SSL verification as well.



Source: https://dgkim5360.tistory.com/entry/python-requests [개발새발로그]


source : https://stackoverflow.com/questions/2052390/manually-raising-throwing-an-exception-in-python




How do I manually throw/raise an exception in Python?

Use the most specific Exception constructor that semantically fits your issue.

Be specific in your message, e.g.:

raise ValueError('A very specific bad thing happened.')

Don't raise generic exceptions

Avoid raising a generic Exception. To catch it, you'll have to catch all other more specific exceptions that subclass it.

Problem 1: Hiding bugs

raise Exception('I know Python!') # Don't! If you catch, likely to hide bugs.

For example:

def demo_bad_catch():
    try:
        raise ValueError('Represents a hidden bug, do not catch this')
        raise Exception('This is the exception you expect to handle')
    except Exception as error:
        print('Caught this error: ' + repr(error))

>>> demo_bad_catch()
Caught this error: ValueError('Represents a hidden bug, do not catch this',)

Problem 2: Won't catch

and more specific catches won't catch the general exception:

def demo_no_catch():
    try:
        raise Exception('general exceptions not caught by specific handling')
    except ValueError as e:
        print('we will not catch exception: Exception')


>>> demo_no_catch()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in demo_no_catch
Exception: general exceptions not caught by specific handling

Best Practices: raise statement

Instead, use the most specific Exception constructor that semantically fits your issue.

raise ValueError('A very specific bad thing happened')

which also handily allows an arbitrary number of arguments to be passed to the constructor:

raise ValueError('A very specific bad thing happened', 'foo', 'bar', 'baz') 

These arguments are accessed by the args attribute on the Exception object. For example:

try:
    some_code_that_may_raise_our_value_error()
except ValueError as err:
    print(err.args)

prints

('A very specific bad thing happened', 'foo', 'bar', 'baz')

In Python 2.5, a message attribute was added to BaseException to encourage users to subclass exceptions and stop using args, but both the introduction of message and the original deprecation of args have since been retracted.

Best Practices: except clause

When inside an except clause, you might want to, for example, log that a specific type of error happened, and then re-raise. The best way to do this while preserving the stack trace is to use a bare raise statement. For example:

logger = logging.getLogger(__name__)

try:
    do_something_in_app_that_breaks_easily()
except AppError as error:
    logger.error(error)
    raise                 # just this!
    # raise AppError      # Don't do this, you'll lose the stack trace!

Don't modify your errors... but if you insist.

You can preserve the stacktrace (and error value) with sys.exc_info(), but this is far more error prone and has compatibility problems between Python 2 and 3; prefer a bare raise to re-raise.

To explain - the sys.exc_info() returns the type, value, and traceback.

type, value, traceback = sys.exc_info()

This is the syntax in Python 2 - note this is not compatible with Python 3:

    raise AppError, error, sys.exc_info()[2] # avoid this.
    # Equivalently, as error *is* the second object:
    raise sys.exc_info()[0], sys.exc_info()[1], sys.exc_info()[2]

If you want to, you can modify what happens with your new raise - e.g. setting new args for the instance:

def error():
    raise ValueError('oops!')

def catch_error_modify_message():
    try:
        error()
    except ValueError:
        error_type, error_instance, traceback = sys.exc_info()
        error_instance.args = (error_instance.args[0] + ' <modification>',)
        raise error_type, error_instance, traceback

And we have preserved the whole traceback while modifying the args. Note that this is not a best practice, and it is invalid syntax in Python 3 (which makes maintaining compatibility much harder).

>>> catch_error_modify_message()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in catch_error_modify_message
  File "<stdin>", line 2, in error
ValueError: oops! <modification>

In Python 3:

    raise error.with_traceback(sys.exc_info()[2])

Again: avoid manually manipulating tracebacks. It's less efficient and more error prone. And if you're using threading and sys.exc_info you may even get the wrong traceback (especially if you're using exception handling for control flow - which I'd personally tend to avoid.)

Python 3, Exception chaining

In Python 3, you can chain Exceptions, which preserve tracebacks:

    raise RuntimeError('specific message') from error

Be aware:

  • this does allow changing the error type raised, and
  • this is not compatible with Python 2.
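To illustrate the chaining, a minimal sketch; the original exception ends up on __cause__ and both tracebacks are shown when the new one propagates:

def parse_config(text):
    try:
        return int(text)
    except ValueError as error:
        raise RuntimeError('config value is not a number') from error

try:
    parse_config('abc')
except RuntimeError as err:
    print(repr(err))            # RuntimeError('config value is not a number')
    print(repr(err.__cause__))  # the original ValueError is preserved on __cause__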

Deprecated Methods:

These can easily hide and even get into production code. You want to raise an exception, and doing them will raise an exception, but not the one intended!

Valid in Python 2, but not in Python 3 is the following:

raise ValueError, 'message' # Don't do this, it's deprecated!

In much older versions of Python (2.4 and lower) it was valid to raise bare strings, and you may still see people doing it:

raise 'message' # really really wrong. don't do this.

In all modern versions, this will actually raise a TypeError, because you're not raising a BaseException type. If you're not checking for the right exception and don't have a reviewer that's aware of the issue, it could get into production.

Example Usage

I raise Exceptions to warn consumers of my API if they're using it incorrectly:

_ALLOWED_ARGS = ('baz', 'bar')  # the values api_func accepts

def api_func(foo):
    '''foo should be either 'baz' or 'bar'. returns something very useful.'''
    if foo not in _ALLOWED_ARGS:
        raise ValueError('{foo} wrong, use "baz" or "bar"'.format(foo=repr(foo)))

Create your own error types when apropos

"I want to make an error on purpose, so that it would go into the except"

You can create your own error types, if you want to indicate something specific is wrong with your application, just subclass the appropriate point in the exception hierarchy:

class MyAppLookupError(LookupError):
    '''raise this when there's a lookup error for my app'''

and usage:

if important_key not in resource_dict and not ok_to_be_missing:
    raise MyAppLookupError('resource is missing, and that is not ok.')
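Since MyAppLookupError subclasses LookupError, callers can catch either the specific type or the more general one:

try:
    raise MyAppLookupError('resource is missing, and that is not ok.')
except LookupError as err:    # catches MyAppLookupError as well
    print(repr(err))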

