5 ARCHITECTURE
6.5 G ITHUB REST S ERVICES ☁
In this section we want to explore a more features of REST services and how to access them. Naturally many cloud services provide such REST sinterfaces. This is valid for IaaS, PaaS, and SaaS.
Instead of using a REST service for IaaS, let us here inspect a REST service for the Github.com platform.
Its interfaces are documented nicely at
https://developer.github.com/v3/
We see that Github offers many resources that can be accessed by the users which includes
Activities Checks Gists Git Data
GitHub Apps Issues Migrations Miscellaneous Organizations Projects Pull Requests Reactions Repositories Searches Teams Users
Most likely we forgot the one or the other Resource that we can access via REST. It will be out of scope for us to explore all of these issues, so let us focus on how we for example access Github Issues. In fact we will use the script that we use to create issue tables for this book to showcase how easy the interaction is and to retrieve the information.
6.5.1 Issues
The REST service for issues is described in the following Web page as specification
https://developer.github.com/v3/issues/
We see the following functionality:
List issues
List issues for a repository Get a single issue
Create an issue Edit an issue Lock an issue Unlock an issue Custom media types
obtain information about the issues. Such as
As response we obtain a json object with the information we need to further process it. Unfortunately, the free tier of github has limitations in regards to the frequency we can issue such requests to the service, as well as in the volume in regards to number of pages returned to us.
Let us now explore how to easily query some information. In our example we like to retrive the list of issues for a repository as LaTeX table but also as markdown. This way we can conveniently integrate it in documents of either format. As LaTeX has a more sophisticated table management, let us first create a LaTeX table document and than use a program to convert LaTeX to markdown. For the later we can reuse a program called pandoc that can convert the
table for LaTeX to markdown.
Let us assume we have a program called issues.py that prints the table in markdown
format
An example for such a program is listes at.
https://github.com/cloudmesh-community/book/blob/master/bin/issues.py
Although python provides the very nice module requests which we typically use for
such issues. we have here just wrapped the commandline call to curl into a system
command and redirect its output to a file. However, as we only get limited information back in pages, we need to continue such a request multiple times. To keep things simple we identified that for the project at this time not more that n pages need to be fetched, so we append the output from each page to the file. Your task is it to improve this script and automatize this activity so that no maximum fetches have to be entered.
The reason why this program is so short is that we leverage the build in function for json data structure manipulation, hear a read and a dump. When we look in the issue.json file that is created as intermediary file we see a list of items such as
GET /issues GET /user/issues
[ ... { "url": "https://api.github.com/repos/cloudmesh-community/book/issues/46", "repository_url": "https://api.github.com/repos/cloudmesh-community/book", "labels_url": "https://api.github.com/repos/cloudmesh-community/book/issues/46/labels{/name}", "comments_url": "https://api.github.com/repos/cloudmesh-community/book/issues/46/comments", "events_url": "https://api.github.com/repos/cloudmesh-community/book/issues/46/events", "html_url": "https://github.com/cloudmesh-community/book/issues/46", "id": 360613438, "node_id": "MDU6SXNzdWUzNjA2MTM0Mzg=", "number": 46,
"title": "Taken: Virtualization",
"user":{ "login": "laszewsk", "id": 425045, "node_id": "MDQ6VXNlcjQyNTA0NQ==", "avatar_url": "https://avatars1.githubusercontent.com/u/425045?v=4", "gravatar_id": "", "url": "https://api.github.com/users/laszewsk", "html_url": "https://github.com/laszewsk", "followers_url": "https://api.github.com/users/laszewsk/followers", "following_url": "https://api.github.com/users/laszewsk/following{/other_user}", "gists_url": "https://api.github.com/users/laszewsk/gists{/gist_id}", "starred_url": "https://api.github.com/users/laszewsk/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/laszewsk/subscriptions", "organizations_url": "https://api.github.com/users/laszewsk/orgs", "repos_url": "https://api.github.com/users/laszewsk/repos", "events_url": "https://api.github.com/users/laszewsk/events{/privacy}", "received_events_url": "https://api.github.com/users/laszewsk/received_events", "type": "User", "site_admin":false }, "labels": [], "state": "open", "locked": false, "assignee":{ "login": "laszewsk", "id": 425045, "node_id": "MDQ6VXNlcjQyNTA0NQ==", "avatar_url": "https://avatars1.githubusercontent.com/u/425045?v=4", "gravatar_id": "", "url": "https://api.github.com/users/laszewsk", "html_url": "https://github.com/laszewsk", "followers_url": "https://api.github.com/users/laszewsk/followers", "following_url": "https://api.github.com/users/laszewsk/following{/other_user}", "gists_url": "https://api.github.com/users/laszewsk/gists{/gist_id}", "starred_url": "https://api.github.com/users/laszewsk/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/laszewsk/subscriptions", "organizations_url": "https://api.github.com/users/laszewsk/orgs", "repos_url": "https://api.github.com/users/laszewsk/repos", "events_url": "https://api.github.com/users/laszewsk/events{/privacy}", "received_events_url": "https://api.github.com/users/laszewsk/received_events", "type": "User", "site_admin":false }, "assignees": [ { "login": "laszewsk", "id": 425045, "node_id": "MDQ6VXNlcjQyNTA0NQ==", "avatar_url": "https://avatars1.githubusercontent.com/u/425045?v=4", "gravatar_id": "", "url": "https://api.github.com/users/laszewsk", "html_url": "https://github.com/laszewsk", "followers_url": "https://api.github.com/users/laszewsk/followers", "following_url": "https://api.github.com/users/laszewsk/following{/other_user}", "gists_url": "https://api.github.com/users/laszewsk/gists{/gist_id}", "starred_url": "https://api.github.com/users/laszewsk/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/laszewsk/subscriptions", "organizations_url": "https://api.github.com/users/laszewsk/orgs", "repos_url": "https://api.github.com/users/laszewsk/repos", "events_url": "https://api.github.com/users/laszewsk/events{/privacy}", "received_events_url": "https://api.github.com/users/laszewsk/received_events", "type": "User", "site_admin": false } ], "milestone":null, "comments": 0, "created_at": "2018-09-16T07:35:35Z",
As we can see from this entry there is a lot of information associated that for our purposes we do not need, but certainly could be used to mine github in general. We like to point out that github is actively mined for exploits where passwords are posted in clear text for AWS, Azure and other clouds. This is a common mistake as many sample programs ask the student to place the password directly into their programs instead of using a configuration file that is never part of the code repository.
6.5.2 Exercise
E.github.issues.1:
Develop a new code like the one in this section, but use python requests instead of the os.system call.
E.github.issues.2:
In the simple program we hardcoded the number of page requests. How can we find out exactly how many pages we need to retrieve? Implement your solution
E.github.issues.3:
Be inspired by the many REST interfaces. How can they be used to mine interesting things.
E.github.issues.4:
Can you create a project, author, or technology map based on information that is available in github. For example python projects may include a requirements file, or developers may work on some projects together, but others do other projects with others can you create a network?
"updated_at": "2018-09-16T07:35:35Z",
"closed_at":null,
"author_association": "CONTRIBUTOR",
"body": "Develop a section about Virtualization" },
...
E.github.issues.5:
Use github to develop some cool python programs that show some statistics about github. An example would be: Given a github repository, show the checkins by data and visualize them graphically for one committer and all committers. Use bokeah or matplotlib. E.github.issues.6:
Develop a python program that retrieves a file. Deevlop a python program that uploads a file. Develop a class that does this and use it in your proggram. Use docopt to create a manual page. Please remember this prepares you for your project so this is very useful to do.
6.6 OPENAPI REST SERVICES WITH SWAGGER ☁
Swagger https://swagger.io/ is a tool for developing API specifications based on the OpenAPI Specification (OAS). It allows not only the specification, but the generation of code based on the specification in a variety of languages.
Swagger itself has a number of tools which together build a framework for developing REST services for a variety of languages.
6.6.1 Swagger Tools
The major Swagger tools of interest are: Swagger Core
includes libraries for working with Swagger specifications
https://github.com/swagger-api/swagger-core. Swagger Codegen
allows to generate code from the specifications to develop Client SDKs,
servers, and documentation. https://github.com/swagger-api/swagger-
Swagger UI
is an HTML5 based UI for exploring and interacting with the specified APIs https://github.com/swagger-api/swagger-ui
Swagger Editor
is a Web-browser based editor for composing specifications using YAML
https://github.com/swagger-api/swagger-editor
Swagger Hub
is a Web service to collaborativly develop and host OpenAPI specifications
https://swagger.io/tools/swaggerhub/
The developed APIs can be hosted and further developed on an online repository
named SwaggerHub https://app.swaggerhub.com/home The convenient online
editor is available which also can be installed locally on a variety of operating systems including macOS, Linux, and Windows.
6.6.2 Swagger Community Tools
notify us about other tools that you find and would like us to mention here.
6.6.2.1 Converting Json Examples to OpenAPI YAML Models
Swagger toolbox is a utility that can convert json to swagger compatible yaml models. It is hosted online at
https://swagger-toolbox.firebaseapp.com/
The source code to this tool is available on github at
https://github.com/essuraj/swagger-toolbox
It is important to make sure that the json model is properly configured. As such each datatype must be wrapped in “quotes” and the last element must not have a
In case you have large models, we recommend that you gradually add more and more features so that it is easier to debug in case of an error. This tool is not designed to provide back a full featured OpenAPI, but help you getting started deriving one.
Let us look at a small example. Let us assume we want to create a REST service to execute a command on the remote service. We know this may not be a good idea if it is not properly secured, so be extra careful. A good way to simulate this is to just use a return string instead of executing the command.
Let us assume the json schema looks like:
The output the swagger toolbox creates is
As you can see it is far from complete, but it could be used to get you started. Based on this tool develop a rest service to which you send a schema in JSON format from which you get back the YAML model.