API stands for Application Programming Interface. Simply put, it is code written to interact with something. For example, if you want to write a program that makes your computer perform text to speech, you can either get your PhD in digital signal processing and write it yourself, or you can find someone else's code and figure out how to integrate it into your application. Programmers do the latter all the time, and this is called consuming the API. APIs make programming easier because we re-use other people's code and don't have to write everything ourselves.
You have been consuming APIs all along in this course. Every time we import a Python module we are adding other people's code to our programs and consuming their API. In addition, the built-in Python string functions like startswith() and upper() are part of Python's internal API. If we want to convert a string to upper case, we don't have to write that code ourselves; we can just use the built-in string API!
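As a quick illustration of consuming the built-in string API, here is a minimal sketch. The sample text and the variable names are made up for this example.

```python
# Consuming Python's built-in string API: we call the methods, we don't write them.
greeting = "hello, syracuse"                       # sample text, chosen for illustration

shouted = greeting.upper()                         # convert the string to upper case
starts_with_hello = greeting.startswith("hello")   # test how the string begins

print(shouted)
print(starts_with_hello)
```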
What makes Python so great is that there are many useful APIs built into the language. There is also a website, the Python Package Index (https://pypi.org/), where you can find other APIs and install them into your Python environment using the pip utility.
A web API is an API that gets called over the World Wide Web. In this scenario, the API code is not on your computer. It's on another computer on your network (almost always a web server on the Internet). When your code wants to call the web API, it sends a request over the network to that server and waits for the response.
Web APIs are commonplace in the era of cloud computing, but the question is: why?
Initially the web focused on the direct user consumption of information. We used a browser and search engine to get the news, check sports scores, watch YouTube or get the latest weather forecast. We did not need web APIs because all of this consumption took place in a web browser, and the services we used could just send us HTML, which our browser rendered into nice-looking webpages.
The emergence of smart devices like watches, phones, media players, and intelligent speakers has caused a shift in the web from user-based to device-based. Most of the information we consume nowadays is no longer in a browser, but instead on a variety of devices. We watch movies on our Roku players, check the news and sports scores on our smart phones, or get weather reports from Alexa.
You might think you need to know a lot about networking to call a web API, but actually HTTP (Hypertext Transfer Protocol) handles most of the details for you. HTTP is the protocol that makes the web work. You do need to understand some basics of HTTP in order to correctly consume web APIs.
The Hypertext Transfer Protocol (HTTP) is the means by which hosts on the web communicate. The communication occurs in a request/response pair.
A URL consists of the name of the web server (the host), the path to the resource, and optional query arguments. For example, this URL accesses the iSchool's Spring 2019 Undergraduate class schedule: https://ischool.syr.edu/classes/spring-2019/undergraduate.
In addition, you can include an optional query which affects how the resource returns a response. For example, here's page 2 of the class schedule: https://ischool.syr.edu/classes/spring-2019/undergraduate/?page=2. URLs are controlled by the web developer and/or the application running on the server. The client does not have any say in how the URL looks or behaves, just like you cannot choose another person's phone number or email address. One must simply know the URL to access the resource.
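To make the parts of a URL concrete, here is a small sketch that takes the class schedule URL apart using Python's standard urllib.parse module. Nothing is sent over the network; the variable names are chosen for this example.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://ischool.syr.edu/classes/spring-2019/undergraduate/?page=2"

parts = urlsplit(url)            # split the URL into its components
host = parts.netloc              # the web server (the host)
path = parts.path                # the path to the resource
query = parse_qs(parts.query)    # the optional query arguments, as a dictionary of lists

print("Host: ", host)
print("Path: ", path)
print("Query:", query)
```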
In addition to the URL, an HTTP Verb must be included in the request. The verb specifies how the resource will be accessed.
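When the server returns a response to a request, included in the response is the HTTP status code, which tells the client whether or not the request worked. Status codes are 3-digit numbers, and the first digit indicates the type of response: 1xx is informational, 2xx means success, 3xx is a redirection, 4xx is a client error (something was wrong with the request), and 5xx is a server error.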
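The first digit of a status code can be turned into a category with a little arithmetic. The sketch below uses the standard library's http.HTTPStatus for the reason phrases; the helper name status_class is made up for this example.

```python
from http import HTTPStatus

def status_class(code):
    """Return a short label for the class of an HTTP status code (hypothetical helper)."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")   # integer division keeps only the first digit

for code in (200, 404, 500):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```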
Included in any response, whether successful or not, is a response body. The response body contains the actual content. With most browser responses the content is HTML (Hypertext Markup Language). HTML is a content type for rendering data in a webpage; it has data and also layout information for how the page should look.
HTML is not a suitable format for web APIs because we only want the data, not the layout. As such, most web APIs return a response in XML (Extensible Markup Language) or JSON (JavaScript Object Notation) format. Both formats contain only data.
Because the receiver of the information is now a device as opposed to a web browser, HTML is not a suitable format. HTML includes presentation and layout information with the data, making it difficult for a device to process. Instead we just want the API to return the data in a structured format so that our consumer program can extract the information we need.
Most web APIs return data back to the consumer in JSON (JavaScript Object Notation) format. JSON is a lightweight data format which is easy for people to read and write.
Here's an example JSON response we might get from a weather API:
{
    "location" : "Syracuse, NY",
    "time" : "2018-10-31 9:00+05:00",
    "temperature" : 59.6,
    "humidity" : 95.5
}
If you think that looks a lot like a Python dictionary object, you would be right! It's very easy to convert JSON into a Python object (a process called de-serialization) and to convert a Python object back into JSON (that's called serialization). This makes Python a very capable language for consuming web APIs.
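Here is a minimal sketch of both directions using Python's standard json module. The JSON text is a shortened version of the weather example, and the added "units" key is made up for illustration.

```python
import json

# JSON text, as it might arrive in a response body (note the double quotes)
json_text = '{"location": "Syracuse, NY", "temperature": 59.6, "humidity": 95.5}'

weather = json.loads(json_text)       # de-serialize: JSON text -> Python dictionary
print(type(weather).__name__)
print(weather["temperature"])

weather["units"] = "F"                # hypothetical extra key, to show we hold a normal dict
round_trip = json.dumps(weather)      # serialize: Python dictionary -> JSON text
print(round_trip)
```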
There were a lot of concepts in this section, so let's distill them into a generic algorithm for consuming web APIs in Python.
To call a web API in Python, we:
That's it! The only thing that changes about the process is:
Before we call web APIs in Python, we must first understand how to make HTTP requests. For this, we will use the requests module: http://docs.python-requests.org/en/master/user/quickstart/. This library does the same job as Python's built-in urllib library (it is built on the separate urllib3 package), but is more expressive and easier to understand. requests makes dealing with the HTTP protocol very pleasant by eliminating a lot of the boilerplate code you need to make a request and handle the response.
When we use urllib there is more boilerplate code, which is extra code we must write to make things work. Making the request and converting the response to a Python object are both two-step processes.
import urllib.request, urllib.error, urllib.parse
import json

try:
    data = { 'name' : 'mike', 'age' : 45 }
    # build the request URL, with the data encoded onto the query string
    request = urllib.request.Request('https://httpbin.org/get?' + urllib.parse.urlencode(data))
    response = urllib.request.urlopen(request)  # execute the request
    raw_data = response.read()                  # read the data
    object_data = json.loads(raw_data)          # deserialize the data into a Python object
    print(object_data)
except urllib.error.HTTPError as e:
    print(e)
The requests module requires just one line of code for the request and one for the response. All of the dirty work of encoding the URL and deserializing the response is handled for us!
import requests

data = { 'name' : 'mike', 'age' : 45 }
response = requests.get('https://httpbin.org/get', params = data)  # make and execute the request to URL in one step!
if response.ok:
    object_data = response.json()  # read and de-serialize in one step!
    print(object_data)
else:
    print(response.status_code, response.reason)
To make a request, simply call requests.get() with the url string as the argument. This example gets the contents of the URL https://httpbin.org/html which returns a section of the novel Moby Dick as an HTML page.
response = requests.get('https://httpbin.org/html')
html = response.text
print(html[:296], '...') # just the first 296 characters, please
How do you know if the request worked? You should check the response.ok attribute. This is True when the HTTP status code is less than 400, meaning the response is not a client or server error. The most common successful status code is 200, which has the reason OK.
response = requests.get('https://httpbin.org/html')
print("OK?", response.ok)
print("HTTP Status Code:", response.status_code)
print("HTTP Status Code Reason:", response.reason)
Here's an example of a response which is not OK. I'm requesting the URL https://httpbin.org/mikefudge which should not be found on that web server. This yields a response status code of 404 and a reason of NOT FOUND.
response = requests.get('https://httpbin.org/mikefudge')
print("OK?", response.ok)
print("HTTP Status Code:", response.status_code)
print("HTTP Status Code Reason:", response.reason)
The HTTP response is stored in a Python variable called response. We can get the raw response as a string by asking for response.text. Here is the raw response from the URL https://httpbin.org/get which returns JSON as a Python string:
response = requests.get('https://httpbin.org/get')
if response.ok:
    print(response.text)
else:
    print(response.status_code, response.reason)
If the response is in JSON format, we can easily deserialize the response into a Python object by calling response.json(). For example, we call the same URL https://httpbin.org/get, but this time easily extract the "origin" key from the Python object. It is far easier to extract information from a Python object than it is to search for what you need within the equivalent string!
response = requests.get('https://httpbin.org/get')
if response.ok:
    py_object = response.json()  # de-serialize the string into a Python object!
    print("Python Object: ", py_object, '\n')
    print("Type of Object: ", type(py_object), '\n')
    print("Just the value of the 'origin' key: ", py_object['origin'], '\n')
else:
    print(response.status_code, response.reason)
What happens if you try to deserialize response content which is not in JSON format? You will get an exception of type json.decoder.JSONDecodeError. For example, when we try to request the HTML at 'https://httpbin.org/html', the response is OK but the response.json() function fails to decode the text into a Python object:
try:
    url = 'https://httpbin.org/html'
    response = requests.get(url)
    if response.ok:
        print("HTTP Status Code:", response.status_code, response.reason)
        print(response.json())  # ERROR This is not JSON!!!
except json.decoder.JSONDecodeError:
    print("Sorry the contents of", url, "are not in the JSON format!")
Here's some boilerplate code to make an HTTP request. Just change the URL and go! If the response is JSON, it will deserialize it into the variable pyobj. The code handles invalid HTTP responses and content which is not JSON. Any time you call a web API your code should be similar to this!
try:
    url = 'Any-URL-Here'
    response = requests.get(url)
    if response.ok:
        pyobj = response.json()
        print("YOUR PYTHON OBJECT: ", pyobj)
    else:
        print(response.status_code, response.reason)
except json.decoder.JSONDecodeError:
    print("Sorry the response from", url, "is not in the JSON format!")
As a general rule, you do not want error checking inside the function. Error handling is the responsibility of the caller. For example:
def make_request(url):
    response = requests.get(url)
    response.raise_for_status()  # raises requests.exceptions.HTTPError on a 4xx or 5xx status code
    pyobj = response.json()
    return pyobj

try:
    url = 'Your-Url-Here'
    data = make_request(url)
except requests.exceptions.HTTPError as e:
    print("There was an invalid HTTP response code ", e)
except json.decoder.JSONDecodeError:
    print("Sorry the response could not be deserialized into JSON format!")
For more information please review the Python requests quickstart: https://requests.readthedocs.io/en/master/user/quickstart/
Now that you understand how to initiate HTTP requests and handle responses in Python code, we will conclude this reading by demonstrating several common methods by which web APIs are called. In every example the response will be in JSON format, which is by far the most popular format for web APIs. We will omit any boilerplate code to check the response status code and format so that we can instead focus on how to properly formulate requests.
If you want to understand how to consume a given web API, you must read the docs. Any web API you are going to consider will include instructions for how to consume the API. They won't always have relevant examples in Python, but instead will explain things like whether you need to add an HTTP header or place an argument on the query string. It's up to YOU to figure out how to translate that into Python code. This part of the guide will walk you through most of the common ways this is done, serving as a cookbook of sorts.
Sadly, the days of free access to web APIs are coming to an end. There's simply no money to be made in offering your service to other programmers if you cannot charge for it. For example, if I offer a weather API for free there will be hundreds of weather apps in the app store using my API. Those apps will charge customers and/or show advertisements, and I will not make any money from the consumers while they profit off my service. If I want to make money off my API, I must force the API consumer to register and pay per use.
Most APIs we will use do this. They have a free tier for testing / trying out the API, which is usually sufficient for your demo day project. The web APIs we consume in this reading are all free, which is the exception and not the rule! I chose free APIs so that you can focus on the requests and responses rather than the logistics of acquiring an API key.
The simplest web API call is an unauthenticated HTTP GET request with no arguments. Sadly, these types of APIs are rare nowadays.
This example calls the httpbin web API to help you answer the question: what is the IP address of my computer? The IP address is returned in the origin key:
web_api_url = 'https://httpbin.org/ip'
response = requests.get(web_api_url)
response.json()
Some APIs allow you to include arguments on the query string. These are added to the end of the URL and affect what the URL does with the request.
In this example we will call the OpenStreetMap web API. We will request the search URL but include parameters for what we're searching for (in this case Animal Kingdom Lodge) and the format we would like for the response (JSON).
We build a Python dictionary variable of query parameters required by the web API on line 2, and then pass it to the requests.get() function as the named argument params = options.
To refresh your memory, a named argument is optional in a function call. If you include it, you must assign the argument to the parameter by name. In the example the parameter is params and the argument (what goes into the function) is the variable options.
web_api_url = 'https://nominatim.openstreetmap.org/search'
options = { 'q' : 'Animal Kingdom Lodge', 'format' : 'json'}
response = requests.get(web_api_url, params = options)
print("The URL is:", response.url)
response.json()
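To see roughly what the params dictionary turns into, the sketch below builds a query string with the standard library's urlencode function. This is only an illustration of the idea, not a look inside requests, and the exact URL that requests reports in response.url may differ in small details.

```python
from urllib.parse import urlencode

web_api_url = 'https://nominatim.openstreetmap.org/search'
options = { 'q' : 'Animal Kingdom Lodge', 'format' : 'json' }

query_string = urlencode(options)            # spaces are encoded so the URL stays valid
full_url = f"{web_api_url}?{query_string}"   # the query string follows a ? at the end of the URL

print(query_string)
print(full_url)
```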
Some web APIs like Reddit require you to include values in the HTTP header. For the Reddit API you need a custom User-Agent key with a value which indicates what your application does.
For example, this code requests the top stories from subreddit /r/news in JSON format. We include the required headers by adding a named argument headers = custom_headers to the requests.get() function.
web_api_url = 'https://www.reddit.com/r/news/top.json?count=2'
custom_headers = {'User-Agent' : 'sample-python-application'}
response = requests.get(web_api_url, headers = custom_headers)
response.json()
Too many stories to handle? Well, the API allows you to include a query string parameter for the story limit. This example shows you can combine the headers and params named arguments if your API requires it. This is true for any function call with named arguments, BTW.
NOTE: 'Accept' : 'application/json' in the header asks the API to return the response as JSON instead of XML.
In this example we return just 1 story!
web_api_url = 'https://www.reddit.com/r/news/top.json'
custom_headers = {'User-Agent' : 'sample-python-application', 'Accept' : 'application/json'}
options = { 'limit' : 1 }
response = requests.get(web_api_url, headers = custom_headers, params = options)
response.json()
Some web APIs require you to send a substantial amount of data to them. In this case an HTTP POST is a more appropriate request than a GET. For example, to get sentiment from text through the text-processing web API, you must include the text as part of the data payload. We accomplish this through the requests.post() function and the data = payload named argument:
tweet = "I dislike the Voice. I will not be sad when that show is cancelled. Horrible!"
web_api_url = 'http://text-processing.com/api/sentiment/'
payload = { 'text' : tweet }
response = requests.post(web_api_url, data = payload)
response.json()
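The key difference from a GET is that the payload travels in the body of the request rather than on the URL. The sketch below shows this with the standard library only: it builds a POST request object but never sends it. The short sample text is made up for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request

payload = { 'text' : 'great show' }              # sample text, for illustration only
body = urlencode(payload).encode('utf-8')        # form-encode the payload into bytes

# Supplying data makes this a POST; the request is built here but never sent.
post_request = Request('http://text-processing.com/api/sentiment/', data=body)

print(post_request.get_method())   # the HTTP verb that would be used
print(post_request.full_url)       # the URL carries no query string
print(post_request.data)           # the payload sits in the request body
```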
Most APIs require authentication. At the very minimum, authentication is a means to track who is calling the API so the service provider may place limits on your use of it. In the case of social network APIs, you use the API to act, with permission, on behalf of another user. In all cases of API authentication the service knows who you are, and the credentials allow your computer program to act on your behalf. Therefore it is of utmost importance that you protect your API keys from falling into the wrong hands!
In this section we will explain some of the common ways that API Authentication is implemented.
IMPORTANT NOTE: You will see examples of APIs being called, but the credentials used here will not work. Get your own set of credentials to run the example code.
API keys are the simplest form of authentication. You must first sign up for the API on their website, and in exchange for some personal information you will be issued an API key. This key is required to send requests to the API so they can track how many requests you are making. The key is usually a randomly generated set of characters which is unique to the service.
Where you put the API key depends on the service you are using. There is no one way to do this and you will need to read through the documentation to figure it out. Common places are the URL itself, the query string, and the HTTP request header.
The Darksky API https://darksky.net/dev is an example of a service where the key is in the URL. For these types of services you will need to build the URL before making the request. This is a job best handled by Python's f-strings. Here's an example of the API call getting the current weather conditions at coordinates 37.8267,-122.4233:
darksky_key = 'zyx1234567890abcdefg'
lat = 37.8267
long = -122.4233
url = f"https://api.darksky.net/forecast/{darksky_key}/{lat}/{long}"
response = requests.get(url)
print(response.url)
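Since keys must be protected, one common option is to keep the key out of your source code and read it from an environment variable instead. The sketch below is one way to do that; the helper name load_api_key and the variable name DARKSKY_KEY are assumptions for this example, and the placeholder key is set only so the demo runs.

```python
import os

def load_api_key(variable_name):
    """Read an API key from an environment variable (hypothetical helper)."""
    key = os.environ.get(variable_name)
    if not key:
        raise RuntimeError(f"Set the {variable_name} environment variable first")
    return key

# For demonstration only: normally you set this outside of your program.
os.environ.setdefault("DARKSKY_KEY", "zyx1234567890abcdefg")

darksky_key = load_api_key("DARKSKY_KEY")
lat = 37.8267
long = -122.4233
url = f"https://api.darksky.net/forecast/{darksky_key}/{lat}/{long}"
print("URL built, key length:", len(darksky_key))   # avoid printing the key itself
```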
The Zomato API https://developers.zomato.com/api is an example of a service where the key is placed in the HTTP request header. For this type of authentication, you will need to know the name of the header the key should be placed under. It could be api-key or x-api-key or user-key or key or mike-fudge. Again, you will need to read through the documentation to figure it out.
With this approach, you need to pass a dictionary into the headers= named argument. For example:
zomato_key = 'zyx1234567890abcdefg'
custom_headers = { 'user-key': zomato_key, 'Accept' : 'application/json' }
url = 'https://developers.zomato.com/api/v2.1/categories'
response = requests.get(url, headers = custom_headers )
print(response.url)
Also notice the header includes 'Accept' : 'application/json'. This is a common value placed in the HTTP request header to ask the API to return the results in JSON format. If your response is coming back in XML format, use this option in the header to request a response in JSON format.
Most of the social service APIs like Facebook, Twitter, Yelp, and Spotify use the OAUTH2 protocol for authentication. OAUTH2 (https://www.digitalocean.com/community/tutorials/an-introduction-to-oauth-2) is a complex protocol with a variety of flexible authentication workflows. The easiest of the OAUTH2 workflows is the client credentials flow, and if your API supports this one, USE IT. Basically it's a two-step process: first you exchange your credentials for a token, then you include that token in your API calls.
The details:
You are issued a client_id (think of this as a unique username for your API) and a client_secret (consider this your client_id's password). This means you are not authenticating with your own credentials. The service will provide you with an authorization endpoint, which is the token issuer. In effect you "log in" to the service not with your username and password but with the client_id and client_secret instead.
You make a request to the authorization endpoint with your client_id and client_secret, typically in the query string. This asks the API to authenticate your credentials and issue you a token in the response headers under the key Authorization. The token returned is a very long text string. Tokens expire and cannot be used forever.
You include the token in the HTTP request header of your API calls, like this: { 'Authorization' : 'Bearer ' + token }
As a best practice you should renew the token only when it expires.
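One way to follow that practice is to keep the token together with its expiry time and ask for a new one only when the old one has run out. The sketch below is a hypothetical helper, not part of requests or of any particular API. The class name TokenCache, the one-hour lifetime and the stand-in fake_fetch_token function are all assumptions; in real code the fetch function would call the service's token endpoint.

```python
import time

class TokenCache:
    """Hold a bearer token and fetch a new one only after the old one expires."""

    def __init__(self, fetch_token, lifetime_seconds, clock=time.monotonic):
        self._fetch_token = fetch_token     # function that asks the service for a token
        self._lifetime = lifetime_seconds   # how long a token stays valid (assumed known)
        self._clock = clock                 # replaceable clock, handy for testing
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch_token()          # renew only when needed
            self._expires_at = now + self._lifetime
        return self._token

    def auth_header(self):
        return { 'Authorization' : 'Bearer ' + self.get() }

# Demo with a stand-in for the real token request
calls = []

def fake_fetch_token():
    calls.append(1)
    return f"token-{len(calls)}"

cache = TokenCache(fake_fetch_token, lifetime_seconds=3600)
print(cache.auth_header())
print(cache.auth_header())                 # reuses the cached token
print("Token requests made:", len(calls))
```

The dictionary from auth_header() can be passed as the headers= named argument on each API call.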
Here's an example of what this might look like for a given API. In this example, the API gets the token using the HTTP GET method, which is less secure and not used as much anymore.
# OAUTH2 CLIENT CREDENTIALS FLOW - GET METHOD
client_id='abc123'
client_secret='zyx1234567890abcdefg'
# get the Bearer token, HTTP GET Method
query_string = { 'grant_type' : 'client_credentials', 'client_id' : client_id, 'client_secret' : client_secret }
token_url = 'https://someservice.com/token'
auth_response = requests.get(token_url, params=query_string)
token = auth_response.headers['Key-For-Access-Token']
# now that we have the token we can keep using it in our api calls.
api_url = 'https://someservice.com/api'
api_headers = {'Authorization' : f"Bearer {token}" }
api_response = requests.get(api_url, headers = api_headers)
data = api_response.json()
Here is another example; this time the service uses the HTTP POST method. This is a more secure method than the GET method because the client_id and client_secret are not visible on the URL. Most services will use this pattern today:
# OAUTH2 CLIENT CREDENTIALS FLOW - POST METHOD
client_id='abc123'
client_secret='zyx1234567890abcdefg'
body = { 'grant_type' : 'client_credentials'}
# get the Bearer Token, HTTP POST method
token_url = 'https://someservice.com/token'
response = requests.post(token_url, auth=(client_id, client_secret), data=body)
token = response.headers['Key-For-Access-Token']
print(token)
# now that we have the token we can keep using it in our api calls.
headers = { "Authorization" : f"Bearer {token}" }
query_string = { 'q' : 'search-example' }
endpoint_url = 'https://someservice.com/api/search'
response = requests.get(endpoint_url, headers=headers, params = query_string)
data = response.json()
There is no one way to call every API and thus you will need to read the documentation and play around with your code in order to learn how to correctly authenticate to the API and retrieve a valid response. Take a look at the examples and documentation provided.
Lots of trial and error is required, so be persistent! We expect everyone to suffer through this process as part of what it takes to complete your demo day project. This shows us you are capable of learning new things independently and thus are ready to handle the next great thing Python throws our way!