There are a number of ways to find and extract records from BibServer.
Via the frontend
The first is to search via the web frontend, as demonstrated at http://bibsoup.net. This provides an easy search interface for people who want to find new records, or to create and view collections.
You will probably also want to get the records in a more useful format. When browsing a collection, you will find the “download this collection” option at the bottom left. It is also possible to request JSON directly on these pages.
Requesting JSON in the URL
From any search result page on the frontend, or from any page of a specific record, you can request the content you see as JSON, specifically as BibJSON. To do this, append .json or .bibjson to the path of the URL you are viewing. Note that the extension goes before the query parameters: if the URL has, for example, “?size=50” at the end, add “.json” before the “?”.
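As a minimal sketch of that rule in Python, the helper below places the extension at the end of the path and leaves any query string untouched. The function name add_json_extension is hypothetical and not part of BibServer; only the standard library is used.

```python
from urllib.parse import urlsplit, urlunsplit


def add_json_extension(url, extension=".json"):
    """Append an extension to the URL path, before any query string.

    Hypothetical helper for illustration; not part of BibServer.
    """
    parts = urlsplit(url)
    # The extension belongs on the path, so the "?..." part stays after it.
    path = parts.path + extension
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(add_json_extension("http://bibsoup.net/collection/pitman?size=50"))
# http://bibsoup.net/collection/pitman.json?size=50
```

Passing extension=".bibjson" produces the .bibjson form of the same URL.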
Requesting JSON as a query parameter
From any search result page on the frontend, or from any page of a specific record, you can request the content as JSON by appending a “format=json” query parameter. Note that query parameters should be separated by “&”, so make sure to add one if necessary. Note also, if you do not already have any query parameters, you will need to add a “?” to the end of the URL first. For example:
- http://bibsoup.net/collection/pitman has no query parameters – so add “?format=json” to get http://bibsoup.net/collection/pitman?format=json
- http://bibsoup.net/collection/pitman?size=50 already has a query parameter, so add “&format=json” to get http://bibsoup.net/collection/pitman?size=50&format=json
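The choice between “?” and “&” can be left to the standard library. The following is a sketch only; add_format_param is a hypothetical helper name, not something BibServer provides.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_format_param(url, fmt="json"):
    """Return the URL with a format=<fmt> query parameter added.

    Hypothetical helper for illustration; not part of BibServer.
    """
    parts = urlsplit(url)
    # Keep existing parameters, dropping any earlier "format" so it is not duplicated.
    params = [(key, value)
              for key, value in parse_qsl(parts.query, keep_blank_values=True)
              if key != "format"]
    params.append(("format", fmt))
    # urlencode joins the pairs with "&"; urlunsplit adds the "?" when needed.
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(params), parts.fragment))


print(add_format_param("http://bibsoup.net/collection/pitman"))
# http://bibsoup.net/collection/pitman?format=json
print(add_format_param("http://bibsoup.net/collection/pitman?size=50"))
# http://bibsoup.net/collection/pitman?size=50&format=json
```

Note that urlencode percent-encodes special characters in existing parameter values, which is equivalent but may look different from the URL you started with.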
By content negotiation
THIS IS NOT ACTIVE YET. When requesting content programmatically, you will be able to avoid appending .json or .bibjson to the URL by performing content negotiation instead: specify JSON as your preferred format in the Accept header of your request.
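Once this is active, a request might be prepared along the following lines. This is a sketch under assumptions: “application/json” is the standard media type for JSON, but whether BibServer will respond to it (or to some other type) is not confirmed here, and build_json_request is a hypothetical helper name. The example only builds the request and does not send it.

```python
from urllib.request import Request


def build_json_request(url):
    """Prepare a request that states JSON as the preferred response format.

    Hypothetical helper; the media type BibServer will accept is an assumption.
    """
    return Request(url, headers={"Accept": "application/json"})


request = build_json_request("http://bibsoup.net/collection/pitman")
print(request.get_header("Accept"))
# application/json
```

The prepared request can then be passed to urllib.request.urlopen in the usual way.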
Directly from the index API
You can also query the elasticsearch index API directly. For example, see http://bibsoup.net/query. The elasticsearch index is well documented at http://www.elasticsearch.org. Here is an example for returning the records of a particular collection:
This will return the first 10 results of the named collection. To return more, you can alter the size and from parameters, for example:
will get you the next 10 records. To get all the records, set the size parameter to a value larger than the number of records in the collection.
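Paging with size and from can be sketched in Python as follows. The endpoint is the one named above. Sending the query string in a q parameter follows elasticsearch's URI search convention, and it is an assumption that the BibServer endpoint passes it through. The query value “pitman” is a placeholder, not a confirmed BibServer query, and the helper names are hypothetical.

```python
from urllib.parse import urlencode

QUERY_ENDPOINT = "http://bibsoup.net/query"  # endpoint named in the text


def build_query_url(q, size=10, start=0, endpoint=QUERY_ENDPOINT):
    """Build a query URL; `start` is sent as the `from` parameter.

    Hypothetical helper; the `q` parameter is an assumption (see above).
    """
    # "from" is a reserved word in Python, so it is supplied as a dict key.
    params = {"q": q, "size": size, "from": start}
    return endpoint + "?" + urlencode(params)


def page_urls(q, total, size=10):
    """Return one URL per page needed to cover `total` records."""
    return [build_query_url(q, size=size, start=offset)
            for offset in range(0, total, size)]


# "pitman" is a placeholder query string used only for illustration.
print(build_query_url("pitman", size=10, start=10))
# http://bibsoup.net/query?q=pitman&size=10&from=10
```

The total needed by page_urls can be read from the first response, so a client can fetch one page, check how many records matched, and then request the rest.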
When you perform these queries, your result object will be JSON, and will contain some elasticsearch metadata and your records. Here are some examples of what you can find:
- resultobject["hits"]["total"] – the total number of records your search parameters found. So this could tell you how many records are in the pitman collection if you performed the above search
- resultobject["hits"]["hits"] – a list of the records found for your current search, limited to the amount you requested or the default of 10.
- resultobject["hits"]["hits"][n]["_source"] – each item in the list carries some more metadata, such as how it scored in the search, and holds the content of the record itself under the _source key.
So, for example in Python, you could get every record found for your search into a list by doing this:
    results = [i["_source"] for i in resultobject["hits"]["hits"]]
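To make the shape of the result object concrete, here is a small self-contained sketch. The sample response is hand-written for illustration and contains only the keys described above plus invented record titles; a real response will carry more metadata. The name extract_records is hypothetical.

```python
import json

# Hand-written sample in the shape described above; not real BibServer output.
sample_response = json.loads("""
{
  "hits": {
    "total": 2,
    "hits": [
      {"_score": 1.0, "_source": {"title": "First record"}},
      {"_score": 0.5, "_source": {"title": "Second record"}}
    ]
  }
}
""")


def extract_records(resultobject):
    """Return (total, records), stripping the surrounding search metadata."""
    hits = resultobject.get("hits", {})
    total = hits.get("total", 0)
    records = [hit["_source"] for hit in hits.get("hits", [])]
    return total, records


total, records = extract_records(sample_response)
print(total, [record["title"] for record in records])
# 2 ['First record', 'Second record']
```

Because total counts every match and the list holds only the current page, comparing total with len(records) shows whether more pages remain.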
We will soon be adding to the functionality so that you can more easily retrieve a record or a collection without the surrounding elasticsearch metadata. But for now, this provides a useful way to get your data out.