Data Structure¶
The Data Tracker is based on a few main components:
Order
Dataset
Project
User
Log
DOI
Terminology¶
Fields:
Fields in the documents for the datatype/collection
Computed fields:
Values that are either calculated or retrieved from documents in other collection(s)
Included when the entity is requested via API
Order¶
Added automatically, e.g. when an order in the order portal changes to accepted
Import data from order portal
Can have any number of associated datasets
Fields¶
_id: Uuid for the order
Title: Name
Description: Description in markdown
Creator: Facility name
Creator can be set to e.g. external if a non-facility wants to add a dataset
Receiver: Email or uuid of the user who made the order
Input to add will be an email address, which is mapped to the user collection
Uuid saved: user exists
Email saved: user does not exist
Datasets: All datasets generated for the order
Extra fields: Custom fields in the style
{'key': 'key_name', 'value': 'data value'}
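The receiver mapping described above could be sketched roughly as follows. This is only an illustration: the function name resolve_receiver, the lowercase key email, and the in-memory list standing in for the user collection are assumptions, not part of the Data Tracker itself.

```python
def resolve_receiver(email, users):
    """Return the uuid of the matching user, or the email if no user matches.

    `users` is a hypothetical in-memory stand-in for the user collection:
    a list of dicts with `_id` and `email` keys.
    """
    for user in users:
        if user.get("email") == email:
            return user["_id"]  # user exists: the uuid is saved
    return email  # user does not exist: the email is saved


users = [{"_id": "6c2f0a3e-0000-4000-8000-000000000001",
          "email": "known@example.com"}]
receiver = resolve_receiver("known@example.com", users)
```

Saving the email when no user matches is what later allows the entry to be claimed once that person logs in.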
Dataset¶
Data generated by e.g. facility
One per “data delivery” from facility
Can have identifier(s) (e.g. DOIs)
Data links can be added by receiver
Receiver and creator can edit the entry (inherited from order)
Fields¶
_id: Uuid of the dataset
Title: Name
Description: Description in markdown
Links: List of links to where the dataset can be found.
List entry:
{ 'title': 'name', 'url': 'https://place', 'hashes': { 'type': 'sha256', 'files': [ { 'name': 'filename', 'hash': 'FEDCBA9...' } ] } }
title and url are mandatory for each link; hashes is optional
Extra: Custom fields in the style
{'key': 'key_name', 'value': 'data value'}
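A link entry could be checked against the rule above (title and url mandatory, hashes optional) along these lines. The function name validate_link is hypothetical, and the checks on the inner structure of hashes are assumptions based only on the example entry.

```python
def validate_link(link):
    """Return a list of problems found in a link entry (empty if valid)."""
    problems = []
    for field in ("title", "url"):  # mandatory for each link
        if not isinstance(link.get(field), str) or not link[field]:
            problems.append(f"missing mandatory field: {field}")
    hashes = link.get("hashes")
    if hashes is not None:  # hashes is optional
        if not isinstance(hashes, dict) or "type" not in hashes:
            # Assumption: a hashes entry names its hash type, as in the example
            problems.append("hashes must be a dict with a 'type'")
        else:
            for entry in hashes.get("files", []):
                if not {"name", "hash"} <= set(entry):
                    problems.append("each file needs 'name' and 'hash'")
    return problems


good_link = {"title": "name", "url": "https://place",
             "hashes": {"type": "sha256",
                        "files": [{"name": "filename", "hash": "FEDCBA9"}]}}
bad_link = {"title": "name"}
```

Returning a list of problems rather than raising on the first one lets a caller report everything that is wrong with an entry at once.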
Computed fields¶
Related: All other datasets from the same order
Projects
Identifiers: Local identifier, DOIs
Creator: Inherited from order
Name of e.g. facility that generated the dataset
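As a rough sketch of how two of these computed fields might be filled in when a dataset is requested, assuming that an order's datasets field holds dataset uuids and that plain dicts stand in for documents (the function name and lowercase keys are hypothetical):

```python
def add_computed_fields(dataset, orders):
    """Return a copy of `dataset` with `related` and `creator` added.

    Assumption: each order lists the uuids of its datasets in `datasets`.
    """
    result = dict(dataset)
    for order in orders:
        if dataset["_id"] in order["datasets"]:
            # Related: all other datasets from the same order
            result["related"] = [ds for ds in order["datasets"]
                                 if ds != dataset["_id"]]
            # Creator: inherited from the order
            result["creator"] = order["creator"]
            break
    return result


orders = [{"_id": "o1", "creator": "Facility A",
           "datasets": ["d1", "d2", "d3"]}]
computed = add_computed_fields({"_id": "d2", "title": "Name"}, orders)
```

The stored dataset is left untouched; the computed values exist only in the returned copy.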
Project¶
Created by users
Can have multiple owners
Can have identifiers
Intended as a way for a user to have a page to show off their data and be able to get an identifier (DOI)
Fields¶
_id: Uuid for the project
Title: Name
Description: Description in markdown
Contact: Contact information (email) for the project
Datasets: Datasets connected to the project
Can be added by receiver/creator of dataset
Can be removed by any user listed in owners
Publications: List of publications related to the project
Entry:
{ 'title': 'name', 'doi': 'doi-id' }
title and doi are mandatory, but maybe include the ability to add e.g. journal, year etc
DMP: Data management plan (URL)
Owners: List of uuids/emails
Just like with order; email can be used if the user is not in the db yet
Allows facilities to prepare project pages
Extra fields: Custom fields in the style
{'key': 'key_name', 'value': 'data value'}
Computed fields¶
Identifiers: Local identifier, DOIs
User¶
Everyone using the system is a user
Login via Elixir AAI
On first login, the user will be added to db
Use auth_id to recognize the user
Read e.g. email from the login info
API can also be accessed using an API key
may be created by any user
“admin” can create user for facility
A user can “claim entries”
Will check all order receivers and project owners for the user's email
Email will be replaced with the user's uuid
Facilities cannot log in via Elixir, but must do so via api_key
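The "claim entries" step could look roughly like the sketch below. The function name claim_entries and the lowercase keys (receiver, owners, email) are assumptions, and lists of dicts stand in for the order and project collections.

```python
def claim_entries(user, orders, projects):
    """Replace the user's email with the user's uuid wherever it is listed.

    Returns the number of entries that were changed.
    """
    claimed = 0
    for order in orders:
        if order.get("receiver") == user["email"]:
            order["receiver"] = user["_id"]
            claimed += 1
    for project in projects:
        owners = project.get("owners", [])
        if user["email"] in owners:
            project["owners"] = [user["_id"] if owner == user["email"] else owner
                                 for owner in owners]
            claimed += 1
    return claimed


user = {"_id": "u1", "email": "a@example.com"}
orders = [{"_id": "o1", "receiver": "a@example.com"},
          {"_id": "o2", "receiver": "b@example.com"}]
projects = [{"_id": "p1", "owners": ["a@example.com", "u9"]}]
n_claimed = claim_entries(user, orders, projects)
```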
Fields¶
_id: Uuid for the user
Email: Email address of the user
Auth_id: Identifier received from Elixir
Is set to --facility-- for facilities to avoid Elixir login
Api_key: Key that can be used as an alternative to login for authentication
Name: Name of the user (can be e.g. name of facility for facility accounts)
Affiliation: University/company etc
Country: The country of the user
Permissions: A list of the extra permissions the user has (see Permissions)
Log¶
Whenever an entry (order, dataset, project, or user) is changed, a log should be written
All logs are in the same collection
A function is required to show changes between different versions of an entry
Fields¶
_id: Uuid for the log
Action: Type of action (add, edit, or delete)
Data_type: The collection that was modified (order, dataset, project, or user)
Data: Complete copy of the document for add/edit; empty for delete
Timestamp: The time the action was performed
User: Uuid of the user performing the action
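A minimal sketch of writing a log entry and of the function that shows changes between two versions of an entry. The names make_log and diff_versions are hypothetical, the lowercase keys are assumed, and "empty" for deletions is represented here as an empty dict.

```python
import copy
import datetime
import uuid


def make_log(action, data_type, data, user_id):
    """Build a log document for an add, edit, or delete action."""
    return {
        "_id": str(uuid.uuid4()),
        "action": action,
        "data_type": data_type,
        # Add/edit: complete copy of the document; delete: empty (assumed {})
        "data": {} if action == "delete" else copy.deepcopy(data),
        "timestamp": datetime.datetime.now(datetime.timezone.utc),
        "user": user_id,
    }


def diff_versions(old, new):
    """Map each changed top-level key to an (old value, new value) pair."""
    changes = {}
    for key in sorted(set(old) | set(new)):
        if old.get(key) != new.get(key):
            changes[key] = (old.get(key), new.get(key))
    return changes


v1 = {"_id": "d1", "title": "a", "description": "x"}
v2 = {"_id": "d1", "title": "b", "description": "x"}
changes = diff_versions(v1, v2)
```

Because every add/edit log holds a complete copy of the document, consecutive logs for the same entry can be passed straight to the diff function.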
DOI¶
Two collections
doi_req - Requests for a DOI
doi - Accepted DOIs
Users can request a DOI for datasets and projects
Upon request, data is copied to doi_req
A reviewer will need to check the data for the request
Required fields
File hashes
If accepted, the data will be copied to doi
Each DOI document is a complete copy of the entire data structure that was accepted for the DOI
Fields (request)¶
_id: Uuid for the request
Data: A complete copy of all relevant data
A project with associated datasets will include copies of the datasets in datasets instead of only uuids
Status: Requested, Accepted, Rejected
User: User that made the request
Updates: Mini log system
{ 'timestamp': <current time>, 'new_status': 'new_status' }
Type: dataset or project
Comments: Comments from the reviewer
Computed Fields (request)¶
Other_requests: Other requests that have been made for the same entry
To allow the reviewer to see e.g. earlier comments
Fields (doi entry)¶
_id: The DOI identifier
timestamp: When the entry was created
Data: The complete entry that has been accepted
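The request lifecycle could be sketched as below, under the assumption that plain dicts stand in for documents in doi_req and doi. The function names set_status and accept_request and the lowercase keys are hypothetical; how a DOI identifier is actually obtained is outside the scope of this sketch.

```python
import copy
import datetime

VALID_STATUSES = ("Requested", "Accepted", "Rejected")


def set_status(request, new_status):
    """Change the status of a request and record it in the mini log."""
    if new_status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {new_status}")
    request["status"] = new_status
    request.setdefault("updates", []).append({
        "timestamp": datetime.datetime.now(datetime.timezone.utc),
        "new_status": new_status,
    })


def accept_request(request, doi_id):
    """Mark a request as accepted and build the document for `doi`."""
    set_status(request, "Accepted")
    return {
        "_id": doi_id,  # the DOI identifier, supplied by the caller
        "timestamp": datetime.datetime.now(datetime.timezone.utc),
        "data": copy.deepcopy(request["data"]),  # complete, independent copy
    }


request = {"_id": "r1", "type": "dataset", "status": "Requested",
           "data": {"_id": "d1", "title": "Name"}, "updates": []}
doi_entry = accept_request(request, "doi-id")
```

The deep copy means later edits to the original dataset or project do not alter what was accepted for the DOI.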
Other topics¶
Identifiers¶
Only uuid initially
Can request a “fancier” local identifier for dataset/project
Style similar to:
scilifelab.facility.orderxyz.dataset1
scilifelab.projects.title1
All datasets and projects can request a DOI
The required fields are checked for empty values. If none are empty, the request is sent for evaluation by e.g. an admin
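The required-field check before a DOI request is forwarded might look like this sketch. The document does not list which fields are required, so REQUIRED_FIELDS is a placeholder assumption, as are the function names.

```python
# Hypothetical list; the actual required fields are not specified here.
REQUIRED_FIELDS = ("title", "description")


def missing_required(entry, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in `entry`."""
    return [field for field in required if not entry.get(field)]


def request_doi(entry):
    """Send the request for review only if no required field is empty."""
    missing = missing_required(entry)
    return {"sent_for_review": not missing, "missing": missing}


complete = request_doi({"title": "Name", "description": "Text"})
incomplete = request_doi({"title": "Name", "description": ""})
```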
Permissions¶
“Permission classes” used to evaluate what a user may do
CREATE_ORDERS
MANAGE_USERS
EDIT_ANY_DATA
READ_OWNERS
DOI_REVIEWER
“Default groups”
Template for user, giving a specific set of permissions
Admin - “all”
Facility - “create orders” + “read ownerships”
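The permission classes and default groups could be represented as in this sketch. It assumes that “read ownerships” corresponds to READ_OWNERS and that a user's permissions are stored as a list under a lowercase permissions key; the helper names are hypothetical.

```python
PERMISSIONS = ("CREATE_ORDERS", "MANAGE_USERS", "EDIT_ANY_DATA",
               "READ_OWNERS", "DOI_REVIEWER")

# Templates giving a specific set of permissions
DEFAULT_GROUPS = {
    "Admin": set(PERMISSIONS),                    # "all"
    "Facility": {"CREATE_ORDERS", "READ_OWNERS"},
}


def apply_group(user, group):
    """Give `user` the permissions of a default group (stored sorted)."""
    user["permissions"] = sorted(DEFAULT_GROUPS[group])
    return user


def has_permission(user, needed):
    """Check whether the user holds the permission class `needed`."""
    return needed in user.get("permissions", [])


facility_user = apply_group({"_id": "u1", "auth_id": "--facility--"},
                            "Facility")
```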