Web UI could display a checksum + upload date for source datasets / created date for materialized views
Itay Plavin
1) A convenient way for a repo creator to verify data integrity without having to re-download the file
2) A public dataset should be easily provable to be identical to the source (especially where the source has a readily available MD5 / SHA256 hash associated w/ the data)
Stretch goal for point #2: during upload, allow the uploader to specify a URL for the page that embeds the MD5 / SHA256 checksum of the source data (presumably on the same domain name as the data), as well as a CSS selector to grab the hash from the page's HTML. Upon upload, bit.io could then issue an API call to https://archive.org to snapshot the page and link the snapshot in the repo along with the extracted checksum. It's an advanced workflow, but it would provide a future-proof way to verify that the original dataset came from the source of the checksum.
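To make the check concrete, here is a rough Python sketch of the comparison step only, not a description of how bit.io works. It hashes the uploaded bytes in chunks and compares the result with a hash pulled out of the source page's HTML. All function names are hypothetical, and to stay within the standard library it matches an element by its `id` attribute as a simplified stand-in for a full CSS selector. The archive.org snapshot step is left out.

```python
import hashlib
import io
from html.parser import HTMLParser


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


class _TextById(HTMLParser):
    """Collects the text inside the first element whose id matches.

    Simplification: assumes the target element contains no unclosed
    void tags (such as a bare <br>), which would confuse the depth count.
    """

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.depth = 0      # > 0 while inside the target element
        self.done = False   # True once the target element has closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.element_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_hash(html, element_id):
    """Return the normalized (trimmed, lowercase) text of the matching element."""
    parser = _TextById(element_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip().lower()


def matches_published(stream, html, element_id):
    """True if the stream's SHA-256 equals the hash published on the page."""
    return sha256_of_stream(stream) == extract_hash(html, element_id)
```

For a file on disk, something like `matches_published(open(path, "rb"), page_html, "sha256")` would do the comparison. An MD5 variant would swap `hashlib.sha256` for `hashlib.md5`.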
Madelaine Boyd
Cool idea, thanks!
Versioning and snapshots are something we'd like to build soon. Showing the MD5/SHA of a dataset seems reasonable.