Incrementally creates a Swagger API spec from raw data coming from logs, monitoring agents, etc.

solso/raw2swagger

Raw2Swagger

## Description

Generate Swagger specs from raw data about how an API is actually used, rather than from annotations in the API's source code.

Typically, a Swagger spec is written and published by the API owners via annotations in the source code of their API. The problem is that not all providers bother to use Swagger to describe their API.

Raw2swagger aims to create an accurate Swagger spec using unsupervised learning on raw data that you, an API consumer, can collect, for instance by instrumenting the API plugin or by storing the logs of requests made against the API that you want to build the Swagger spec for.

What is Swagger?

Swagger is a specification (JSON-based) to describe REST APIs.

A Swagger spec can be used to automatically generate documentation for your API that your users can explore interactively.

Many API management solutions, such as 3scale, accept Swagger specs to describe your API and render its documentation.

Swagger offers a framework for building the API description from annotations in your source code for Java-family languages (using JAX-RS). There are also language-agnostic tools such as source2swagger.

Getting Started

Installation

To install from RubyGems:

gem install raw2swagger

To install from source:

git clone https://github.com/solso/raw2swagger.git
cd raw2swagger
gem build raw2swagger.gemspec
gem install raw2swagger

Using Raw2Swagger

To create the feeder object,

require 'raw2swagger'
feeder = Raw2Swagger::Feeder.new()

Add a new raw entry,

feeder.process(entry)

An entry is a Hash that you can build from logs, plugins, or any other source you have. The hash looks like this:

{
  "method" => "GET",
  "path" => "/admin/api/accounts/34333.xml",
  "status" => 200,
  "query_string" => "provider_key=foo&page=30&per_page=10",
  "body" => "",
  "host" => "raw2swagger.3scale.net",
  "port" => 80,
  "headers" => {}
}

Another example of an entry:

{
  "method" => "POST",
  "path" => "/resources.json",
  "status" => 200,
  "query_string" => "",
  "body" => {"id" => 10, "foo" => "bar"},
  "host" => "raw2swagger.3scale.net",
  "port" => 80,
  "headers" => {"Content-Type" => "application/json"}
}

The query_string and body fields can each be either a String or a Hash. The headers field must always be a Hash.

The host field is used to distinguish between APIs, so you can use the same Feeder for multiple APIs.
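As a minimal sketch of how such an entry might be assembled from a logged request line, the helper below splits the line into method, path, and query string. `build_entry` is a hypothetical helper written for illustration and is not part of raw2swagger; it only uses Ruby's standard `uri` library and emits the query string in its Hash form.

```ruby
require 'uri'

# Hypothetical helper (not part of raw2swagger): turn a request line such as
# "GET /accounts/42.xml?key=foo" plus a status code into an entry Hash.
def build_entry(request_line, status, host, port = 80)
  http_method, target = request_line.split(" ", 2)
  path, query = target.split("?", 2)
  {
    "method" => http_method,
    "path" => path,
    "status" => status,
    # Parsed into a Hash here; passing the raw String would be equally valid.
    "query_string" => URI.decode_www_form(query.to_s).to_h,
    "body" => "",
    "host" => host,
    "port" => port,
    "headers" => {}
  }
end

entry = build_entry("GET /accounts/42.xml?key=foo&page=30", 200, "raw2swagger.3scale.net")
```

The resulting Hash has the same keys as the examples above, so it could be passed to `feeder.process(entry)`.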

As you add entries, raw2swagger starts working out the Swagger spec. You can retrieve it at any time with:

swg = feeder.spec("raw2swagger.3scale.net").to_swagger()

Since a Feeder can hold several specs, you must pass the host value to select the spec of the API that you want.

All that remains is to save it to a file:

require 'json'
File.open("my_autogenerated_swagger_spec.json", "w") do |f|
  f.puts swg.to_json
end

The file sample_api_traffic.log contains more than 2,000 entries from the 3scale Account Management API; these entries are used for the tests.

How does it work?

After creating a Feeder object, you can start by processing a single entry

GET /accounts/42.xml?key=foo

At this point raw2swagger will think that the API has only one resource (1 end-point) with one operation (a GET):

/accounts/42.xml
with a GET operation
parameters: [key]

If you feed two more entries like these

GET /accounts/54.xml?key=bar
POST /users.xml

raw2swagger will output an improved spec that has 2 resources,

/accounts/{account_id}.xml 
with a GET operation
parameters: [account_id, key]

and

/users.xml
with a POST operation
parameters: []

Note that it has learned that there is an account_id parameter. The earlier path /accounts/42.xml is therefore subsumed, since /accounts/{account_id}.xml is more general. It also knows that account_id is a required parameter.

The more entries you feed to raw2swagger, the better the spec will become.
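The generalization step described above can be illustrated with a deliberately simplified sketch. This is not raw2swagger's actual algorithm: `split_path` and `generalize` are hypothetical helpers that merge two paths only when they have the same extension and differ in exactly one segment, and the parameter name is assumed to be derived from the preceding segment.

```ruby
# Split "/accounts/42.xml" into its segments ["accounts", "42"] and extension ".xml".
def split_path(path)
  ext = File.extname(path)
  segments = path.chomp(ext).split("/").reject(&:empty?)
  [segments, ext]
end

# Hypothetical merge rule: return a path template if the two paths differ in
# exactly one (non-leading) segment, otherwise nil.
def generalize(path_a, path_b)
  seg_a, ext_a = split_path(path_a)
  seg_b, ext_b = split_path(path_b)
  return nil unless ext_a == ext_b && seg_a.size == seg_b.size

  diff = seg_a.each_index.select { |i| seg_a[i] != seg_b[i] }
  return nil unless diff.size == 1

  i = diff.first
  return nil if i.zero? # a preceding segment is needed to name the parameter

  # Assumed naming convention: "accounts" -> "account_id".
  name = seg_a[i - 1].sub(/s\z/, "") + "_id"
  merged = seg_a.dup
  merged[i] = "{#{name}}"
  "/" + merged.join("/") + ext_a
end

generalize("/accounts/42.xml", "/accounts/54.xml") # => "/accounts/{account_id}.xml"
generalize("/accounts/42.xml", "/users.xml")       # => nil
```

A rule this naive would merge any two sibling paths, constants included, so on its own it is prone to over-generalizing.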

Considerations

Feeder has one optional parameter

feeder = Raw2Swagger::Feeder.new(:occurrences_threshold => 0.20)

The default value of :occurrences_threshold is 1.0, which generalizes paths eagerly, as soon as possible.

If you never want to generalize paths use 0.0.

Typically 1.0 works very well, but for some APIs it can cause over-generalization. For instance:

GET /api/users.xml
GET /api/applications.xml
GET /api/features.xml

might end up as

GET /api/{api_id}.xml

To avoid this, we recommend tuning the :occurrences_threshold value until you are happy with the results. A value of 0.2 works pretty well.

A value of 0.2 means that two segments of a path are generalized only if both are at least 5 times (1/0.2) less frequent than the most frequent path segment. This is a heuristic that helps distinguish path segments that are parameters from those that are constants.
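One way to read this rule is sketched below. The function name `mergeable?`, the sample counts, and the choice of a non-strict comparison are assumptions made for illustration, not raw2swagger's actual implementation. Under this reading, a threshold of 1.0 always allows a merge and 0.0 never does, which matches the behavior described above.

```ruby
# Hypothetical check: two segments may be generalized only if each one occurs
# at most threshold * max_count times, where max_count is the number of
# occurrences of the most frequent path segment.
def mergeable?(count_a, count_b, max_count, threshold)
  limit = threshold * max_count
  count_a <= limit && count_b <= limit
end

# Made-up occurrence counts: constants appear often, IDs appear rarely.
counts = { "users" => 400, "applications" => 350, "42" => 3, "54" => 2 }
max = counts.values.max

mergeable?(counts["42"], counts["54"], max, 0.2)              # => true  (likely parameters)
mergeable?(counts["users"], counts["applications"], max, 0.2) # => false (likely constants)
mergeable?(counts["users"], counts["applications"], max, 1.0) # => true  (eager default)
mergeable?(counts["42"], counts["54"], max, 0.0)              # => false (never generalize)
```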

Contribute

Fork the project and send pull requests.
