New Connector: SurveyCTO connector
Their API documentation is behind a support login, so it is pasted below. There is also a Python library: https://pypi.org/project/
Note that this documentation is comprehensive - all capabilities are listed. Also please bear in mind that this documentation is distributed with the assumption that interested parties have the expertise and/or time to work out what they need to do based on this documentation.
The REST API can be used by SurveyCTO users on trials, paid and community subscriptions to access server data for both encrypted and not-encrypted forms.
Downloading data in JSON format
The server also supports downloading wide-format data in JSON format, using a URL like this:
https://servername.surveycto.
The response is a JSON array. If there is no data, the response is an empty JSON array (i.e., "[]"). Here is a sample JSON response for a form that contains only two submissions:
[
"CompletionDate": "Oct 10, 2015 6:26:24 PM","SubmissionDate": "Oct 10, 2015 6:26:24 PM","starttime": "Oct 10, 2015 6:24:34 PM","endtime": "Oct 10, 2015 6:25:42 PM","deviceid": "99000211023220","subscriberid": "311480169577541","simid": "89148000001679945153","devicephonenum": "5089658843","consent": "1","name": "collector1","age": "100","instanceID": "uuid:4ecc7cc5-5697-4723-a56d-"formdef_version": "1411121011","review_quality" : "","review_status" : "APPROVED","KEY": "uuid:4ecc7cc5-5697-4723-a56d-
"CompletionDate": "Oct 13, 2015 10:53:04 AM","SubmissionDate": "Oct 13, 2015 10:53:04 AM","starttime": "Oct 13, 2015 10:52:31 AM","endtime": "Oct 13, 2015 10:52:50 AM","deviceid": "(web)","subscriberid": "","simid": "","devicephonenum": "","consent": "1","name": "John","age": "42","instanceID": "uuid:df31a3bc-2689-4b78-a9ea-"formdef_version": "1411121011","review_quality" : "","review_status" : "APPROVED","KEY": "uuid:df31a3bc-2689-4b78-a9ea-}]
The JSON URL requires a date (request) parameter, which can be used to fetch data after a certain date (inclusive):
https://servername.surveycto.
Date (request) parameters should be specified in UTC time. To bypass the date filter and request all data, specify date=0.
The value of the "date" parameter can follow one of the following formats:
- a timestamp in seconds,
- a timestamp in milliseconds (the value is treated as milliseconds when 13 or more digits are specified),
- a specific string representation, as it appears in the data for the "CompletionDate" field. This field denotes when the data were finalized on the server; it matches "SubmissionDate" when the submission arrived complete in one HTTP post from Android, but it is always later than "SubmissionDate" when either (a) the submission arrived in multiple posts from Android because of large attachments or (b) the submission was incomplete and was later accepted via the server console.
For example, in order to fetch data after midnight Oct. 12 2015, you can use a URL like one of the following:
https://servername.surveycto.
https://servername.surveycto.
https://servername.surveycto.
Please note that the third option uses the URL-encoded value of the date "Oct 13, 2015 00:00:00 AM". The result of this request will contain only the second submission from the example form above.
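As a rough sketch of the three accepted formats, the Python snippet below derives each "date" value for the same instant. The function name is illustrative and not part of any SurveyCTO library, and the string form simply reuses the example value quoted above; only the parameter value is built here, not the request URL.

```python
from datetime import datetime, timezone
from urllib.parse import quote


def date_param_variants(moment, completion_date_text):
    """Return the three accepted forms of the "date" parameter value.

    `moment` must be a timezone-aware datetime. `completion_date_text` is a
    string copied from a "CompletionDate" field; it is only URL-encoded here.
    """
    seconds = int(moment.astimezone(timezone.utc).timestamp())
    return {
        "seconds": str(seconds),
        # 13 digits for present-day dates, so the server reads it as milliseconds.
        "milliseconds": str(seconds * 1000),
        "completion_date": quote(completion_date_text, safe=""),
    }


variants = date_param_variants(
    datetime(2015, 10, 13, tzinfo=timezone.utc),
    "Oct 13, 2015 00:00:00 AM",
)
print(variants)
```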
In order to download only new data with each request, you can pass a date parameter that contains the value of the latest CompletionDate received so far.
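One possible way to track that value in Python is sketched below. The date format string is inferred from the sample response above rather than taken from the documentation, and the function names are hypothetical.

```python
from datetime import datetime

# Inferred from sample values such as "Oct 10, 2015 6:26:24 PM" (assumption).
# %b and %p are locale-dependent; this assumes an English/C locale.
COMPLETION_DATE_FORMAT = "%b %d, %Y %I:%M:%S %p"


def _parse(value):
    return datetime.strptime(value, COMPLETION_DATE_FORMAT)


def latest_completion_date(submissions, previous=None):
    """Return the newest "CompletionDate" string seen so far.

    `submissions` is the decoded JSON array; `previous` is the value kept
    from the last request (or None on the first run).
    """
    newest = previous
    for row in submissions:
        value = row.get("CompletionDate")
        if value is None:
            continue
        if newest is None or _parse(value) > _parse(newest):
            newest = value
    return newest


batch = [
    {"CompletionDate": "Oct 10, 2015 6:26:24 PM"},
    {"CompletionDate": "Oct 13, 2015 10:53:04 AM"},
]
print(latest_completion_date(batch))
```

Because the date filter is inclusive, the submission carrying that CompletionDate may be returned again by the next request, so de-duplicating on "KEY" is probably wise.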
Notes on the format of the output
- The data in the JSON response for each individual submission do not depend on the latest deployed form definition, as they did in the prior version of this REST API. That means that if some old submissions contain data for a field X but this field is no longer part of the latest form definition, then all old data of the field X will still be served by this version of the API.
- Missing data are not included in the JSON response.
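Since missing values are simply absent, submissions can have different sets of keys. A minimal sketch of normalizing them into uniform rows (for example before loading into a table) might look like this; the helper name and the placeholder value are assumptions.

```python
def fill_missing_fields(submissions, placeholder=None):
    """Give every submission the same keys, in first-seen order.

    Fields absent from a submission are filled with `placeholder`.
    """
    columns = []
    seen = set()
    for row in submissions:
        for name in row:
            if name not in seen:
                seen.add(name)
                columns.append(name)
    return [
        {name: row.get(name, placeholder) for name in columns}
        for row in submissions
    ]


rows = fill_missing_fields([
    {"KEY": "a", "age": "42"},
    {"KEY": "b", "name": "John"},
])
print(rows)
```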
Support for encrypted forms
From v2.60, the JSON end-point can also support pulling data from encrypted forms by specifying a private key file that can be used to decrypt the data before serving them. The private key file will not be stored in any of our servers and the decrypted data will also not be cached anywhere.
In order to specify the private key, you will need to do a form POST that will add enctype="multipart/form- to the request. The private key file should exist in a form field with the name "private_key".
For example, if you are using "curl", then this is how the form POST is accomplished using the -F option:
curl -u "username:password" -F 'private_key=@/Users/john/
Note that if you do not specify a private key file in your request, only fields marked as publishable in the form design will be returned (as in the case of direct downloads from the server console).
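For connector code that cannot shell out to curl, the same kind of form POST can be approximated with the Python standard library. This is only a sketch: the endpoint URL must be supplied by the caller (the one below is a placeholder, not a SurveyCTO URL), and the file name and part content type are arbitrary choices. The request is built but not sent.

```python
import base64
import urllib.request
import uuid


def build_private_key_request(endpoint_url, username, password, key_bytes):
    """Build a multipart/form-data POST with the key in "private_key"."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode("ascii"),
        b'Content-Disposition: form-data; name="private_key"; '
        b'filename="private_key.pem"\r\n',
        b"Content-Type: application/octet-stream\r\n\r\n",
        key_bytes,
        f"\r\n--{boundary}--\r\n".encode("ascii"),
    ])
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Authorization": "Basic " + token.decode("ascii"),
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Placeholder URL and key content, for illustration only.
request = build_private_key_request(
    "https://example.invalid/placeholder", "username", "password", b"KEY DATA"
)
print(request.get_method(), request.get_header("Content-type"))
```

Passing the result to `urllib.request.urlopen` would send it; the key is read into memory by the caller and never written anywhere by this helper.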
Downloading data in CSV format
To pull .csv data from a SurveyCTO server, use URLs like the following:
https://servername.surveycto.
https://servername.surveycto.
https://servername.surveycto.
The "servername", "formid", and "repeatgroup" in the above URLs need to specify the server name, the form ID, and the name of the repeat group, respectively.
The first URL will return a linebreak-delimited list of the URLs that return actual data. For example, for a simple form with no repeat groups, the first URL will basically return nothing but a variation of the second URL; for a form with repeat groups, it will return a variation of the second URL plus one or more variations of the third URL.
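A small sketch of turning that linebreak-delimited response into a list of URLs to fetch next; the function name and the example URLs are placeholders.

```python
def parse_url_list(response_text):
    """Split the first endpoint's response into individual data URLs."""
    return [line.strip() for line in response_text.splitlines() if line.strip()]


# Placeholder URLs standing in for the real response.
urls = parse_url_list(
    "https://example.invalid/main.csv\r\nhttps://example.invalid/repeat.csv\n\n"
)
print(urls)
```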
The second and third URLs return the actual .csv data, in the same format exported by the SurveyCTO Client (configured with default settings). The second returns the primary data of the form, and (variations of) the third return data for repeat groups. Again, the data is formatted just as it is exported from the SurveyCTO Client (based on ODK Briefcase), but the columns may be in a slightly different order than in the client's exports.
Alternatively, you can download a single wide-format .csv file with a URL like this:
https://servername.surveycto.
The wide-format file will not include group names in variable names. In contrast, requests for long format data will return data that includes group names as a prefix to field names in variable names (as in groupname-fieldname).
When downloading .csv data in wide-format, field data will be truncated at 16,384 characters. If you would like to download the complete data, either download the data in long-format, download the data in JSON format, or export the data using a different method, such as SurveyCTO Desktop.
Linebreaks in CSV
There is a hidden server option to suppress linebreaks in these API-delivered .csv files. To replace all linebreaks with a certain replacement character, issue a POST request like the following:
https://servername.surveycto.
The [value] should be the URL-encoded replacement character. So, for example, to replace all linebreaks with spaces:
https://servername.surveycto.
You only have to do this once per server, and then the setting will be remembered. Please just note that this setting applies to all new data exported to the .csv files returned by the API... not old data. So if you set the setting only after discovering linebreaks, the setting will only correct new data. It's probably best to always set it at the very beginning. Please also remember that you must issue this command as a POST request, not a GET request (so you cannot easily test it in a browser).
If you want to restore linebreaks (i.e., to no longer replace them), then issue a DELETE request like:
https://servername.surveycto.
Again, this particular function cannot be easily tested within a browser (without an add-on) because it requires that you issue a DELETE request.
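Since a browser cannot easily issue these requests, a script is one alternative. The sketch below only shows how the replacement character could be URL-encoded and how POST and DELETE requests can be constructed with the Python standard library. The settings URL is left to the caller (the one used here is a placeholder), and authentication headers would still need to be added before sending.

```python
import urllib.request
from urllib.parse import quote


def encoded_replacement(character):
    """URL-encode the replacement character, e.g. a space becomes %20."""
    return quote(character, safe="")


def build_setting_request(setting_url, restore=False):
    """Build a POST (set replacement) or DELETE (restore linebreaks) request."""
    method = "DELETE" if restore else "POST"
    return urllib.request.Request(setting_url, method=method)


print(encoded_replacement(" "))
set_request = build_setting_request("https://example.invalid/placeholder")
restore_request = build_setting_request(
    "https://example.invalid/placeholder", restore=True
)
print(set_request.get_method(), restore_request.get_method())
```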
Segmentation based on review status
Since v2.40, the server API includes data for approved submissions only. Starting from v2.51, you can also request data for rejected and/or pending-review submissions.
To request rejected submissions for the JSON format, use this URL:
https://servername.surveycto.
To request pending-review submissions for the JSON format, use this URL:
https://servername.surveycto.
Omitting the "r" parameter will fetch only the approved data (the default), but the same behavior can also be accomplished using "r=approved".
The "r" parameter can be used in all three endpoints: (a) "wide" JSON, (b) "long" CSV, and (c) "wide" CSV.
The "r" parameter can also accept more than one value, concatenated with | or with commas. For example, in order to fetch both approved and rejected submissions, the following URL can be used:
https://servername.surveycto.
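A minimal sketch of building a multi-value "r" parameter in Python. Only "approved" is confirmed as a value by the text above; "rejected" in the example is an assumed spelling, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def review_status_param(statuses, separator=","):
    """Join review statuses with a comma or a pipe for the "r" parameter."""
    if separator not in (",", "|"):
        raise ValueError("separator must be ',' or '|'")
    return separator.join(statuses)


# "rejected" is an assumed value; check it against the actual URLs.
value = review_status_param(["approved", "rejected"], separator="|")
print(value)
# urlencode percent-encodes the separator so it is safe inside a query string.
print(urlencode({"r": value}))
```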
Downloading server dataset data in CSV format
To pull .csv data for a server dataset from a SurveyCTO server, use the following URL:
https://servername.surveycto.
The "servername" and "datasetid" in the above URL need to specify the server name and the dataset ID, respectively.
Authentication
The API supports basic authentication. So, if you use "curl" you should be able to authenticate just by supplying a username and password. You can also use your browser to test this, but only once you have successfully logged into the server in a different tab.
The user with which you authenticate must be a valid user on the Configure tab of the server console, and must be assigned a user role with permission to download data (i.e., a "data manager" or greater) and with “Allow server API access” enabled.
For example, these are curl commands to pull JSON data for our nested-repeat sample form:
curl -u "username:password" "https://servername.surveycto.
Requests from users without API access will receive a 412 error:
{"error":{"code":412,"message":"API access not allowed. Authenticate as a user in a user role that allows API access, then try again.","responseObject":null}}
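A connector will likely want to surface this error body in a readable way. The following sketch parses the JSON error shape shown above; the function name is illustrative, and it assumes other error responses follow the same structure.

```python
import json


def parse_api_error(body_text):
    """Return (code, message) from an error body, or None if it is not one."""
    try:
        payload = json.loads(body_text)
    except ValueError:
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    return error.get("code"), error.get("message")


sample = (
    '{"error":{"code":412,"message":"API access not allowed. Authenticate as '
    'a user in a user role that allows API access, then try again.",'
    '"responseObject":null}}'
)
print(parse_api_error(sample))
```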
File attachments
Finally, downloaded data will include full URLs for any attached files (for example, for the response to image fields). These URLs will be of the following form:
https://servername.surveycto.
Fetching these attachments requires the same basic authentication as the overall API. For example, with curl:
curl -u "username:password" "https://servername.surveycto.
The "servername", "formid", "uuid", and "filename" in the above URLs need to specify the server name, the form ID, the submission ID, and the file name of the attachment, respectively. The filename should be URL-encoded if it contains spaces or special characters. You can also supply a private key (using POST instead of GET) if the attachment is encrypted. If the attachment is encrypted but you don’t specify a private key then the REST API will serve the encrypted attachment.
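When an attachment URL has to be assembled by hand rather than taken from the downloaded data, the file name can be URL-encoded as sketched below. The file name is made up for illustration.

```python
from urllib.parse import quote


def encode_attachment_filename(filename):
    """Percent-encode spaces and special characters in a file name."""
    return quote(filename, safe="")


print(encode_attachment_filename("photo 1 (front).jpg"))
```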
Date format
SurveyCTO servers run on UTC time. This means that data is stored and returned by the REST API in UTC time. This is unlike the behavior of direct downloads from the server console which localize date values according to the time and date settings of the exporting computer.
Data from select_multiple fields
Unlike data exported from the server console, or requested in CSV format via the API, select_multiple field data is not supplied as a space-separated list of values in JSON format. With JSON, data from the select_multiple field type is provided as binaries, with one variable per choice. Binaries will be provided only for the selections that were made in a given form submission (omitting non-selections).
Parallel requests
The API allows only one request per SurveyCTO server at a time. Continuous endpoint monitoring may offer some idea as to whether a request has been completed. Parallel requests will receive a 409 error:
{"error":{"code":409,"message"
Rate limiting
When requesting all submissions (date=0), API requests will be rate limited to one request per server each 300 seconds. If a second API request is made to the same SurveyCTO server within 300 seconds of another, a 417 error is returned:
{'error': {'message': 'Please wait for X seconds before retrying to pull all submissions for this form.', 'responseObject': None, 'code': 417}}
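One way a client could react to the 409 and 417 responses is to compute a delay before retrying. This is a sketch under assumptions: the function name is hypothetical, the 30-second pause for 409 is an arbitrary choice, and the wait time for 417 is read from the message wording shown above, falling back to the documented 300 seconds.

```python
import re


def retry_delay_seconds(code, message, parallel_delay=30):
    """Suggest how long to wait before retrying, or None if not retryable."""
    if code == 409:
        # Another request is still running; the pause length is arbitrary.
        return parallel_delay
    if code == 417:
        match = re.search(r"wait for (\d+) seconds", message or "")
        # Fall back to the full 300-second window if no number is found.
        return int(match.group(1)) if match else 300
    return None


print(retry_delay_seconds(
    417,
    "Please wait for 120 seconds before retrying to pull all submissions "
    "for this form.",
))
```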
Sample code
While we do not officially supply or maintain example code, we aim to facilitate sample sharing through listing user-created self-hosted examples. If you have implemented API access to a SurveyCTO server in a programming language not listed below and would like to share, please write to support@surveycto.com.
Stata command for the SurveyCTO API
To make things easier for Stata users, we have developed a command for more easily returning form submissions and attachments from the server using the API. Read more here: Stata command to download data using the API.
Python library for the SurveyCTO API
For Python users, make use of the pysurveycto library to download data via the API. This library is the work of IDinsight.
Python 3 sample from IDinsight for encrypted use of the API:
https://github.com/IDinsight/