Millisecond Forums

API for Downloading

https://forums.millisecond.com/Topic32231.aspx

By loganap - 9/20/2021

Hi, my group routinely downloads bulk data files containing our Inquisit results from the millisecond.com online interface. I am responsible for maintaining a browser script that logs into the site and downloads items, but I've found that very minor updates to the site often break my script. It would be very helpful to have an API for more reliable access to our data files. Is there any plan to implement such an API? I found a request from a few years ago requesting this feature and I hoped it may have moved up the queue. https://www.millisecond.com/forums/Topic18845.aspx.
By seandr - 9/20/2021

Request noted.  Can you say a bit more about how you are processing data and what function(s) would be useful to you?

Note that we already have an API for forwarding incoming data files as they arrive to an HTTPS endpoint of your choosing. When you enable this feature, Millisecond's data service forwards a copy of each data file and metadata (via HTTP PUT or POST) to your endpoint. In this case, you only need to program your endpoint to receive and process the data files that we send to it.

As for a downloading API, we can't really do a simple GET that returns files in bulk because many thousands of large files might be involved that could take hours to zip up (or merge into an Excel or CSV) and stream for download. Another approach would be a LIST function that returns a list of files in a specified folder, and a GET function that downloads a single file. You can then iterate over the list and download the data file by file. 
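A client for the LIST-plus-GET approach described above might look like the following sketch. No such API is confirmed anywhere in this thread, so `list_files` and `get_file` are hypothetical callables standing in for whatever functions would eventually be exposed; only the iterate-and-download pattern is the point.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def download_folder(
    folder: str,
    dest: Path,
    list_files: Callable[[str], Iterable[str]],  # hypothetical LIST function
    get_file: Callable[[str], bytes],            # hypothetical single-file GET
) -> List[Path]:
    """Download every file in `folder` one at a time, skipping files already saved."""
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for remote_path in list_files(folder):
        # Keep only the final path component as the local file name.
        local_path = dest / Path(remote_path).name
        if local_path.exists():
            continue  # already downloaded on an earlier run
        local_path.write_bytes(get_file(remote_path))
        saved.append(local_path)
    return saved
```

Skipping files that already exist locally means an interrupted run can simply be restarted.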

If you ask me, the data forwarding approach is more elegant for moving files between servers, if that's your ultimate goal. 

By loganap - 9/20/2021

Thank you, Sean. That might be just what I need! Can you point me to the documentation of this feature? I must have overlooked it in my searches.
By loganap - 9/20/2021

Ah, found it. In case anyone needs it: https://www.millisecond.com/support/docs/v5/html/viewer.htm#articles/dataweb.htm
This can be done via the <data> tag in your experiment configuration.

Thanks again for pointing me in the right direction.
By Dave - 9/20/2021

There's another, somewhat more convenient and more flexible option than using the <data> element. In your web experiment's settings, under "Advanced Settings", you can enable the data forwarding option, select the forwarding protocol (options are: Multipart Form POST, HTTP/REST POST or PUT, and AWS S3 forwarding), enter your endpoint's address, as well as the access credentials required by the endpoint (if any).



Two major advantages of this: you don't need to change anything in your scripts (they can use the existing / default <data> options), and you can optionally keep a copy on the Millisecond servers as a backup.
By loganap - 9/24/2021

Can anyone point to documentation on this feature? I'd like to configure our server to receive these requests but I'd like to see how the POST data will be sent. I didn't see it in the online docs. Thanks!
By Dave - 9/24/2021

I'll outline it here. Five metadata fields are forwarded as query parameters to the endpoint. Those are:

ScriptPath (the script file name)
GroupId (the group number)
SubjectId (the subject ID)
SessionId (the session ID)
FileKey (the data file path & name)

The POST message's body is the actual data file.

The screenshot below should illustrate:

[Screenshot]

In essence, this is also what you'll find outlined here https://www.millisecond.com/products/inquisit6/webhosting.aspx under "Hosting Data Upload Service."
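
For anyone writing the receiving side, here is a minimal sketch of an endpoint that accepts the request described above: the five metadata fields arrive as query parameters and the body is the data file. It uses only Python's standard library. The save directory, the port, the fallback file name, and the assumption that the body arrives with a Content-Length header are all illustrative choices, not something the Millisecond documentation specifies, and the sketch does no authentication.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path
from urllib.parse import parse_qs, urlsplit

METADATA_FIELDS = ("ScriptPath", "GroupId", "SubjectId", "SessionId", "FileKey")
SAVE_DIR = Path("received")  # assumed local destination


def parse_metadata(request_path: str) -> dict:
    """Extract the five forwarded metadata fields from the request's query string."""
    query = parse_qs(urlsplit(request_path).query)
    return {name: query.get(name, [""])[0] for name in METADATA_FIELDS}


def local_name(file_key: str) -> str:
    """Use only the last component of FileKey so nothing is written outside SAVE_DIR."""
    name = Path(file_key).name
    return name if name not in ("", "..") else "unnamed.dat"


class ForwardingHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        meta = parse_metadata(self.path)
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)  # the request body is the data file itself
        SAVE_DIR.mkdir(parents=True, exist_ok=True)
        (SAVE_DIR / local_name(meta["FileKey"])).write_bytes(body)
        self.send_response(200)
        self.end_headers()

    do_PUT = do_POST  # handle either forwarding method the same way


def serve(port: int = 8080) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("", port), ForwardingHandler).serve_forever()
```

Calling `serve()` starts the receiver. A real deployment would sit behind HTTPS and check the credentials configured in the forwarding settings.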
By seandr - 9/24/2021

I've just updated the help topic on this, which was woefully out of date. It expands a bit on Dave's reply.

You can find details of various protocol options here:
https://www.millisecond.com/support/docs/v6/html/articles/dataweb.htm
By loganap - 9/27/2021

Perfect, thanks
By AndrewPapale - 1/3/2022

Has anyone gotten this working with SharePoint or a RedCap database?

A related question, how does Millisecond back up its servers?  (May determine how frequently we pull data, e.g. every month vs every day).

Thanks,
Andrew
By tshanebuckley - 1/4/2022

As it currently stands, it looks like neither of these options is possible. However, the HTTP POST option comes VERY close.

Playing around with curl and Microsoft Graph Explorer, it seems that moving from the basic user:password authentication that Millisecond provides to token-based authentication would allow for forwarding data directly to SharePoint (and likely other cloud providers).

Is there a way to request this update? Here is an example of a curl request for this data forwarding, based on the Inquisit, curl, and SharePoint documentation:

curl -X PUT <forwarding_url> \
-H 'Content-Type: gzipped' \
-H 'ScriptPath: <script_path>' \
-H 'SubjectId: <participant_id>' \
-H 'GroupId: <participant_group_number>' \
-H 'SessionId: <session_number>' \
-H 'FileKey: <name_generated_for_the_data_file>' \
-d '<inquisit_data>' \
-H "Authorization: Bearer <token>"


Example for the specific SharePoint document library use-case:
'https://graph.microsoft.com/v1.0/drives/<drive_id>/root:/<path_to_iqzip_file>:/content'
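
For readers who prefer Python to curl, the same request can be sketched with the standard library. This mirrors the curl example above (metadata as headers, bearer token in the Authorization header) and only builds the request without sending it. The function name and the default content type are assumptions made for illustration.

```python
import urllib.request


def build_forward_request(
    url: str,
    token: str,
    data: bytes,
    metadata: dict,
    content_type: str = "application/octet-stream",  # assumed default
) -> urllib.request.Request:
    """Build (but do not send) a PUT request carrying the data file and a bearer token."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": content_type,
    }
    headers.update(metadata)  # metadata sent as headers, mirroring the curl example
    return urllib.request.Request(url, data=data, headers=headers, method="PUT")
```

Passing the returned object to `urllib.request.urlopen` would send it.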
By seandr - 1/6/2022

Hi Shane,

This does appear feasible. Thanks for laying out the request format, that definitely helps. 

Is the use-case URL at the end an example of a SharePoint document library endpoint? If so, presumably the <drive_id> parameter identifies your share, which you would provide as your endpoint, and the <path_to_iqzip_file> parameter represents the relative path within your drive where the data file would be saved. If I've got that right, we would set this to the FileKey, in which case the file/folder structure in your drive would mirror the structure on our servers.

Also, would you be able to provide a SharePoint drive that we could use to test our implementation? We don't use SharePoint here, and having to set this up might bump the cost from a few hours to a few days. 

Thanks,
Sean


By tshanebuckley - 1/7/2022

Your understanding of <drive_id> and <path_to_iqzip_file> is correct. Thanks for looking into this thus far! I'll go set up a test SharePoint document library and message you the drive_id and an access token. I cannot give you direct access to the SharePoint username and password, but I can confirm the results on my end or work with you to set up an rclone instance of this SharePoint document library.

Going to PM you this now.
By seandr - 1/10/2022

Hi Shane,

I've just now published support for SharePoint. I haven't tested it against an actual share, but the protocol is pretty straightforward, so this might just work out of the box, or at least be very close.

You can test it out yourself at any time. If there's an error, it will show up in the log for the experiment, which you can see here:
https://myaccount.millisecond.com/experimentlogs

If you get an error, it would be great if you sent the message to support@millisecond.com so we can fix it. 

Thanks,
Sean
By tshanebuckley - 1/11/2022

Hi Sean,

I set that up, just waiting for either a result or error. Does the API forwarding occur as data is collected or is there some type of schedule for forwarding data?

Also, would it be possible to add one more parameter for specifying where in a drive the data should go? Let's call it <relative_path> for example.

Would something like this be possible?
'https://graph.microsoft.com/v1.0/drives/<drive_id>/root:/<relative_path>/<path_to_iqzip_file>:/content'

This way, there is some control over where the data is dropped into the SharePoint instead of just being at the root.

Best,
Shane
By seandr - 1/11/2022

Forwarding happens immediately when a new data file is uploaded to our server.

If forwarding for a given file fails (e.g. a token expires), the system will re-attempt to forward that file (and any other unforwarded files) the next time any new data from your experiment arrives at the server. We have plans to add UI so that you can manually re-attempt a forward operation, but that's a ways off. 
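
The retry behavior described above can be pictured with a small sketch: each new arrival triggers an attempt on the new file plus every file that previously failed. This is only an illustration of the described behavior, not Millisecond's actual implementation. The class name and the `forward` callable are hypothetical.

```python
from typing import Callable, List


class ForwardQueue:
    """Forward each new file immediately; retry earlier failures on the next arrival."""

    def __init__(self, forward: Callable[[str], bool]):
        self._forward = forward  # hypothetical: returns True on success, False on failure
        self._unforwarded: List[str] = []

    def on_new_file(self, file_key: str) -> List[str]:
        """Attempt the new file plus any backlog; return the keys forwarded this time."""
        self._unforwarded.append(file_key)
        still_pending, done = [], []
        for key in self._unforwarded:
            (done if self._forward(key) else still_pending).append(key)
        self._unforwarded = still_pending
        return done
```

One consequence of this design is that a failed file stays unforwarded until another data file arrives for the same experiment.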

I've set this up so that it doesn't just dump files into the root folder. The relative path is the same as the path in our data store. For example, the relative path for raw data files is as follows:
<account name>/<experiment name>/<script name>/raw/<datafile name>

We've designed these paths to keep different types of data files separate. If you upload an experiment and run it, you can see the paths yourself by logging into https://myaccount.millisecond.com/datafiles

The SharePoint URL as currently designed looks like the following:
https://graph.microsoft.com/v1.0/drives/<drive_id>/root:/<account name>/<experiment name>/<script name>/raw/<datafile name>:/content

That's the intention, anyway. Bugs are always possible, so if your testing shows something different, let me know.
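
To check what URL a given file should end up at, the path layout and URL pattern above can be composed as in this sketch. The function names are made up for illustration. The path segments are percent-encoded (so spaces in experiment names survive), while the drive ID is inserted verbatim, which is an assumption about how it should be handled.

```python
from urllib.parse import quote

GRAPH_DRIVES = "https://graph.microsoft.com/v1.0/drives"


def raw_data_path(account: str, experiment: str, script: str, datafile: str) -> str:
    """Relative path of a raw data file, following the layout quoted above."""
    return "/".join([account, experiment, script, "raw", datafile])


def sharepoint_upload_url(drive_id: str, relative_path: str) -> str:
    """Graph URL addressing a file by path inside a drive, ending in :/content."""
    # quote() percent-encodes each segment but keeps the "/" separators.
    return f"{GRAPH_DRIVES}/{drive_id}/root:/{quote(relative_path)}:/content"
```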

Thanks,
Sean

By seandr - 1/11/2022

P.S. If the files aren't showing up in SharePoint and you aren't seeing any errors, I can take a look for clues on my end. I'd need the account and experiment names, which you can send to support@millisecond.com or via private message.
By tshanebuckley - 1/14/2022

Looks like this was not working. Attached is a text file with the report. I double-checked the settings with curl and they looked fine. Is it possible that the token was too long to be properly input?

All errors were the same:
returned status code Unauthorized: Unauthorized"    0


Another item I realized while looking into this more was that the SharePoint tokens have very short lifespans when retrieved from the Graph API Explorer. In inspecting my rclone configuration, I realized that rclone is smart enough to also store a refresh token and expiry of the access token. This makes me think that the PUT request option is not all that feasible. Here is an example of what the rclone configuration looks like:
[<rclone_name_for_remote_1>]
type = onedrive
token = {"access_token":"<access_token>","token_type":"Bearer","refresh_token":"<refresh_token>","expiry":"2022-01-14T04:28:30.731074-05:00"}
drive_id = <drive_id>
drive_type = documentLibrary

[<rclone_name_for_remote_2>]
type = onedrive
token = {"access_token":"<access_token>","token_type":"Bearer","refresh_token":"<refresh_token>","expiry":"2022-01-14T04:28:30.731074-05:00"}
drive_id = <drive_id>
drive_type = documentLibrary
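
To see why a pasted access token stops working, the `expiry` field in a token blob like the ones above can be checked directly. This is a small sketch that assumes the timestamp has the same shape as in the example (ISO 8601 with a UTC offset and six fractional digits); the function name is made up for illustration.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def token_is_expired(token_json: str, now: Optional[datetime] = None) -> bool:
    """Return True if the access token in an rclone-style token blob has expired."""
    token = json.loads(token_json)
    expiry = datetime.fromisoformat(token["expiry"])  # offset-aware timestamp
    now = now or datetime.now(timezone.utc)
    return now >= expiry
```

Once the access token has expired, a client needs the refresh token to obtain a new one, which is the part a static token field cannot do.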


What if someone configured an rclone remote on their local machine and then copied and pasted that configuration as part of the API forwarding setup? From there, on the backend it could be as simple as:
rclone --config <project's_rclone_file_on_millisecond_server> copy <data_on_millisecond_server> <rclone_name_for_remote>:<where_data_should_be_saved_on_remote>
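
If the backend were driven from a script, the command above could be assembled and run as in this sketch. It assumes `rclone` is installed and on the PATH; the function names and argument values are placeholders for illustration.

```python
import subprocess
from typing import List


def rclone_copy_command(config_file: str, source: str, remote: str, dest_path: str) -> List[str]:
    """Build the argument list for `rclone --config <file> copy <source> <remote>:<dest>`."""
    return ["rclone", "--config", config_file, "copy", source, f"{remote}:{dest_path}"]


def run_rclone_copy(config_file: str, source: str, remote: str, dest_path: str) -> None:
    """Run the copy; raises CalledProcessError if rclone exits with a non-zero status."""
    subprocess.run(rclone_copy_command(config_file, source, remote, dest_path), check=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with paths that contain spaces.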


That would certainly be more complicated, but rclone could give you access to many cloud resources for data forwarding, listed here: https://rclone.org/#providers
By seandr - 1/14/2022

Is your token over 50 characters? If not, the length shouldn't be an issue. I've increased the max size to 100 just in case. 

I'll take a look at SharePoint APIs and see if I can figure out how this is supposed to be done. Whatever method we use will require stable credentials. 

-Sean
By tshanebuckley - 1/15/2022

Got a good laugh out of throwing this in python and getting the length. Looks like it's 2366 characters long. Thanks for looking into the SharePoint API. Feel free to reach out if I can be of any help.

Best,
Shane