Home
This page provides an overview of the purpose and use of a Dataset Site. Its aim is to help anyone in an organisation create a Dataset Site for their data (not just the developers).
The Dataset Site Generator allows providers to create a subsite that contains license info and other metadata. A dataset homepage and documentation are required for providers opening their data, and the generator provides this and more. The Dataset Site Generator includes a template site, and complete step-by-step guides designed for non-technical users.
To publish open data for anyone to freely access, use and share, you must create a webpage that describes the data you are publishing. You must include relevant licensing information and documentation. You must also specify how dataset users (innovators who want to build on top of/use your data) should attribute your data. This is a Dataset Site.
If you are publishing data using the OpenActive specifications, you need a Dataset Site.
Take a look at examples from British Cycling and GoodGym.
The purpose of a Dataset Site is to provide:
- A web page that can be referenced when discussing the dataset.
- A human and machine readable license associated with the data (the Dataset Page contains invisible metadata which allows its details to be read automatically).
- A human and machine readable rights statement to specify how dataset users (innovators who want to build on top of/use your data) should attribute your data.
- An accessible "single point of truth" that explains where the data can be found.
- Details ("documentation") and historical record ("changelog") relating to the format of the data, including the specifications it follows, and the data fields it contains.
- A place where the community can contribute with comments, and raise issues.
- A mailing list to which the data users can subscribe to get updates about changes to the data format, spec and fields.
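To illustrate what "machine readable" means in the list above, here is a minimal sketch in Python of how a program might read a license from a web page's invisible metadata. It assumes the metadata is embedded as JSON-LD inside a `<script type="application/ld+json">` tag and uses a schema.org-style `license` field; the sample page, the dataset name, the license URL and the helper names (`JsonLdExtractor`, `read_dataset_metadata`) are all hypothetical and are not taken from a real Dataset Site.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        # Start buffering when a JSON-LD script tag opens.
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so collect them all.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Parse the buffered text as JSON when the script tag closes.
        if tag == "script" and self._in_jsonld:
            self.documents.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def read_dataset_metadata(html):
    """Return every JSON-LD document found in the given HTML text."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Hypothetical page: the <script> block is invisible to a human visitor,
# but a program can read the license from it.
SAMPLE_PAGE = """
<html><head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "Example Sessions",
  "license": "https://creativecommons.org/licenses/by/4.0/"
}
</script>
</head><body><h1>Example Sessions</h1></body></html>
"""

metadata = read_dataset_metadata(SAMPLE_PAGE)
print(metadata[0]["license"])
```

You do not need to write anything like this to create a Dataset Site. The sketch only shows why the invisible metadata matters: dataset users can discover the license automatically instead of reading the page by hand.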
The Dataset Site Generator and associated guides create a minimal Dataset Site using freely available, open source tools. A generated site contains features sufficient for publishing a single dataset, which in most cases is enough for initial publishing of data relating to OpenActive.
Additional datasets can easily be added later. Please raise an issue on this repository to request a guide for this.
Not at all. There are no risks associated with just having a go at using the guides in the next section. If it all goes wrong, you can just delete the repositories (defined in the next section) you've created and start again.
GitHub is a place where the open source community can collaborate.
To make this easier, here is an explanation of some GitHub terms:
- Repository: A repository in GitHub is the name for a collection of Code, Issues, and a Wiki. The page you are looking at right now is inside a repository. This repository is called "dataset-site-generator" (see the "openactive / dataset-site-generator" title at the top of the page).
- Organisation: This repository is called "dataset-site-generator", and it exists inside the organisation "openactive". See the "openactive / dataset-site-generator" title at the top of the page.
- Wiki: You are currently looking at the Wiki inside this repository (see the "Wiki" tab at the top of the page). A wiki is a collection of pages that can be easily edited. Some wikis are unrestricted (like this one), so they can be edited by anyone on GitHub (and all existing editors are notified of changes). Others are restricted to be editable only by GitHub users who have been granted access.
- Code: The code tab at the top of the page will show you the code in this repository, which can be edited.
- Issues: The issues tab at the top of the page is a place where people can leave comments about the repository.
- Fork: Means to "copy", as in copy-and-paste a repository. A "fork" is a "copy" of a repository, and the forked repository always links back to the original. You can "fork" this repository to make your own Dataset Site by following one of the guides below.
- A GitHub account and GitHub organisation for you and your organisation, respectively.
  - If you don't already have these, follow the guide in this document.
- A repository, containing a Dataset Site, which can be "forked" (copied) from this repository.
  - Follow the guide in this document.
- A Mailchimp mailing list.
  - This allows dataset users (innovators who want to build on top of/use your data) to be kept up-to-date with changes to your data's format, spec and fields. Follow the guide in this document to create one of these.
- A repository containing Documentation, which can be created new, following the examples of others.
  - This repository provides dataset users with documentation of data format, spec and fields, as well as allowing them to comment and raise issues. It also includes a historical record of changes to the data format, spec and fields (a "changelog"). Follow the guide in this document to create one of these.