Data Tracker¶
API¶
Version 1¶
The base URL for the API is <url>/api/v1/. All API descriptions below have this base implied before the first /.
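As a minimal sketch of how the implied base combines with the paths listed below, the helper here joins a site URL, the version prefix, and an endpoint path. The function name `endpoint` and the host `tracker.example.org` are placeholders for illustration only and are not part of the Data Tracker itself.

```python
API_PREFIX = "/api/v1"  # version prefix taken from the base URL above


def endpoint(site_url: str, path: str) -> str:
    """Return the full URL for an API path such as "/order/".

    `site_url` stands for the <url> part of the base URL.
    """
    if not path.startswith("/"):
        raise ValueError("API paths are written with a leading '/'")
    # Avoid a double slash if the site URL was given with a trailing one.
    return site_url.rstrip("/") + API_PREFIX + path


print(endpoint("https://tracker.example.org", "/order/"))
# prints https://tracker.example.org/api/v1/order/
```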
Order¶
Note
Only for users with ORDERS or DATA_MANAGEMENT.
- /order/
  - GET: Get a list of all orders where the user is listed as editor. All orders will be listed for a user with DATA_MANAGEMENT.
  - POST: Add a new order. Returns the uuid of the added order.
- /order/<uuid>/
  - GET: Get information about the order uuid.
  - DELETE: Delete the order uuid.
  - PATCH: Update the order uuid.
- /order/<uuid>/dataset/
  - POST: Add a new dataset for the order uuid. Returns the uuid of the added dataset.
- /order/<uuid>/log/
  - GET: Get a list of changes for the order uuid.
Dataset¶
- /dataset/
  - GET: Get a list of all datasets.
- /dataset/<uuid>/
  - GET: Get information about the dataset uuid.
  - DELETE: Delete the dataset uuid.
  - PATCH: Update the dataset uuid.
- /dataset/<uuid>/log/
  - GET: Get a list of changes done to the dataset uuid.
Collection¶
- /collection/
  - GET: Get a list of all collections.
  - POST: Add a new collection.
- /collection/<uuid>/
  - GET: Get information about the collection uuid.
  - DELETE: Delete the collection uuid.
  - PATCH: Update the collection uuid.
- /collection/<uuid>/log/
  - GET: Get a list of changes done to the collection uuid.
User¶
Current User¶
- /user/me/
  - GET: Get information about the current user.
  - PATCH: Update information for the current user.
- /user/me/apikey/
  - POST: Generate a new API key for the current user. The new API key is returned.
- /user/me/log/
  - GET: Get a list of changes done to the current user.
- /user/me/actions/
  - GET: Get a list of changes done by the current user.
- /user/me/orders/
  - GET: Get a list of orders where the current user is listed as editor.
- /user/me/datasets/
  - GET: Get a list of datasets where the current user is listed as editor.
- /user/me/collections/
  - GET: Get a list of collections where the current user is listed as editor.
Look Up Users¶
Note
Only for users with USER_MANAGEMENT, or in some cases USER_SEARCH.
- /user/
  Note: Only for users with USER_SEARCH or USER_MANAGEMENT.
  - GET: Get a list of all users. Users with USER_SEARCH will get a limited set of fields.
  - POST: Add a new user.
- /user/<uuid>/
  - GET: Get information about the user uuid.
  - PATCH: Update information about the user uuid.
  - DELETE: Delete the user uuid.
- /user/<uuid>/apikey/
  - POST: Generate a new API key for the user uuid. The new API key is returned.
- /user/<uuid>/log/
  - GET: Get a list of changes done to the user uuid.
- /user/<uuid>/actions/
  - GET: Get a list of changes done by the user uuid.
- /user/<uuid>/orders/
  - GET: Get a list of orders where the user uuid is listed as editor.
- /user/<uuid>/datasets/
  - GET: Get a list of datasets where the user uuid is listed as editor.
- /user/<uuid>/collections/
  - GET: Get a list of collections where the user uuid is listed as editor.
Log In/Log Out¶
- /logout/
  - GET: Log out the current user.
- /login/oidc/<auth_name>/login/
  - GET: Log in using OpenID Connect (e.g. Elixir AAI) for service auth_name.
- /login/oidc/<auth_name>/authorize/
  - GET: Authorize using OpenID Connect (e.g. Elixir AAI) for service auth_name (via login).
- /login/apikey/
  - GET: Log in using auth_id + api_key.
Data Structure¶
The Data Tracker is based on a few main components:
Order
Dataset
Collection
User
Log
General¶
Title may never be empty.
Terminology¶
Fields:
Fields in the documents for the datatype/collection.
Computed fields:
Values that are either calculated or retrieved from documents in other collection(s).
Included when the entity is requested via API.
Order¶
Requires special permission to add (ORDERS_SPECIAL).
May only be accessed and modified by users listed in editors or users with DATA_MANAGEMENT.
Can have any number of associated datasets.
Deleting an order will delete all owned datasets.
Summary¶
Field | Description | Default | Public
---|---|---|---
_id | UUID of the Entry | Set by system | Hidden
title | Title of the Entry | Must be non-empty | Hidden
description | Description in markdown | Empty | Hidden
generators | List of users who generated data | Entry creator | Visible (via dataset)
authors | List of users responsible for e.g. samples (e.g. PI) | Entry creator | Visible (via dataset)
organisation | User who is data controller | Entry creator | Visible (via dataset)
editors | List of users who can edit the order and datasets | Entry creator | Hidden
datasets | List of associated datasets | Empty | Visible (via dataset)
tags_standard | Tags defined in the system | Empty | Hidden
tags_user | Tags defined by the users | Empty | Hidden
Fields¶
- _id
UUID of the entry.
Set by the system upon entry creation, never modified.
- title
Entry title.
Must be non-empty.
- description
Entry description.
May use markdown for formatting.
Default: Empty
- generators
List of users.
Corresponds to e.g. the facility or people generating the data (from samples).
May be shown openly on all associated datasets.
Access may be limited by other settings.
Default: The user that created the entry.
- authors
List of users.
Corresponds to e.g. the researcher who leads the project the samples came from.
May be shown openly on all associated datasets.
Access may be limited by other settings.
Default: The user that created the entry.
- organisation
A single user who is the data controller for the datasets generated from the order (e.g. a University).
Default: The user that created the entry.
- editors
List of users.
Users that may edit the order and dataset entries. May add datasets to an order.
Default: The user that created the entry.
- datasets
List of datasets associated to the order.
Cannot be modified directly but must be modified through specialised means.
Default: Empty
- tags_standard
A standard set of tags that are defined by the system.
Default: Empty
- tags_user
User-defined tags for the system.
Default: Empty
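The defaults above can be summarised as a small constructor. This is only a sketch under stated assumptions: the function name `new_order` is hypothetical, users are referenced by an identifier string, `_id` is generated as a UUID4 string, and empty list-valued fields are represented as empty lists. The actual backend may store these differently.

```python
import uuid


def new_order(title: str, creator_id: str) -> dict:
    """Build an order document with the documented defaults (sketch only)."""
    if not title:
        # "title" must be non-empty according to the field description.
        raise ValueError("title must be non-empty")
    return {
        "_id": str(uuid.uuid4()),      # assumption: UUID stored as a string
        "title": title,
        "description": "",             # default: empty
        "generators": [creator_id],    # default: the user that created the entry
        "authors": [creator_id],       # default: the user that created the entry
        "organisation": creator_id,    # a single user, not a list
        "editors": [creator_id],       # default: the user that created the entry
        "datasets": [],                # default: empty, not modified directly
        "tags_standard": [],           # default: empty
        "tags_user": [],               # default: empty
    }


order = new_order("Sequencing run 12", "user-1")
print(order["editors"])  # prints ['user-1']
```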
Dataset¶
Dataset generated by e.g. facility.
A dataset must be associated with one order.
Multiple datasets may be associated with the same order.
The association to a specific order cannot be changed.
Once associated with an order, it will stay so.
Can have identifier(s) (e.g. DOIs).
Will use some fields from its order:
generators
authors
organisation
editors
Summary¶
Field | Description | Default | Public
---|---|---|---
_id | UUID of the Entry | Set by system | Visible
title | Title of the Entry | Must be non-empty | Visible
description | Description in markdown | Empty | Visible
tags_standard | Tags defined in the system | Empty | Visible
tags_user | Tags defined by the users | Empty | Visible
cross_references | External identifiers, links etc. | Empty | Visible
Fields¶
- _id
UUID of the entry.
Set by the system upon entry creation, never modified.
- title
Entry title.
Must be non-empty.
- description
Entry description.
May use markdown for formatting.
Default: Empty
- tags_standard
A standard set of tags that are defined by the system.
Default: Empty
- tags_user
User-defined tags for the system.
Default: Empty
- cross_references
External references to the data.
E.g. DOIs or database IDs.
Default: Empty
Computed fields¶
- related
datasets from order, except the current dataset.
- collections
List of collections containing the current dataset in datasets.
- generators
generators from order.
- authors
authors from order.
- organisation
organisation from order.
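The computed fields can be illustrated with plain dictionaries. The sketch below assumes that the order and collection documents are already loaded, that `datasets` holds dataset identifiers, and that `collections` is reported as a list of collection identifiers. The function name `computed_dataset_fields` is hypothetical.

```python
def computed_dataset_fields(dataset_id: str, order: dict, collections: list) -> dict:
    """Derive the computed fields for one dataset (illustrative sketch)."""
    return {
        # datasets from the order, except the current dataset
        "related": [d for d in order["datasets"] if d != dataset_id],
        # collections that list the current dataset in their "datasets" field
        "collections": [c["_id"] for c in collections if dataset_id in c["datasets"]],
        # copied straight from the owning order
        "generators": order["generators"],
        "authors": order["authors"],
        "organisation": order["organisation"],
    }


order = {
    "datasets": ["ds-1", "ds-2", "ds-3"],
    "generators": ["facility-1"],
    "authors": ["pi-1"],
    "organisation": "university-1",
}
collections = [
    {"_id": "col-1", "datasets": ["ds-2"]},
    {"_id": "col-2", "datasets": ["ds-1", "ds-3"]},
]
print(computed_dataset_fields("ds-2", order, collections)["related"])
# prints ['ds-1', 'ds-3']
```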
Collection¶
May be created by any user.
Can have multiple editors.
Can have identifiers.
Provides a way of grouping datasets before publication.
Should aid requesting a DOI from Figshare for the collection.
Summary¶
Field | Description | Default | Public
---|---|---|---
_id | UUID of the Entry | Set by system | Visible
title | Title of the Entry | Must be non-empty | Visible
datasets | The associated datasets | Empty | Visible
description | Description in markdown | Empty | Visible
tags_standard | Tags defined in the system | Empty | Visible
tags_user | Tags defined by the users | Empty | Visible
cross_references | External identifiers, links etc. | Empty | Visible
editors | List of users who can edit the collection | Entry creator | Hidden
Fields¶
- _id
UUID of the collection.
Set by the system upon entry creation, never modified.
- title
Entry title.
Must be non-empty.
- description
Entry description.
May use markdown for formatting.
Default: Empty
- tags_standard
A standard set of tags that are defined by the system.
Default: Empty
- tags_user
User-defined tags for the system.
Default: Empty
- cross_references
External references to the data.
E.g. DOIs or database IDs.
Default: Empty
- editors
List of users.
Users that may edit the collection.
May add datasets to an order.
Default: The user that created the entry.
User¶
Everyone using the system is a user.
Including facilities, organisations …
Login via e.g. Elixir AAI or API key.
On first login, the user will be added to db.
API can also be accessed using an API key.
API key may be generated by any user.
A user with the permission USER_MANAGEMENT can create and modify users.
A user with the permission ORDER_USERS can create and modify “partial” users.
Summary¶
Field | Description | Default | Public
---|---|---|---
_id | UUID of the Entry | Set by system | Hidden
affiliation | User affiliation (e.g. university) | Empty | Visible
api_key | Hash for the API key | Empty | Hidden
api_salt | Salt for the API key | Empty | Hidden
auth_ids | List of identifiers from e.g. Elixir | Empty | Hidden
email | Email address for the user | Must be non-empty | Hidden
email_public | Email address to show publicly | Empty | Visible
name | Name of the user | Must be non-empty | Visible
orcid | ORCID of the user | Empty | Visible
permissions | List of permissions for the user | Empty | Hidden
url | URL to e.g. homepage | Empty | Visible
Fields¶
- _id
UUID of the entry.
Set by the system upon entry creation, never modified.
- affiliation
Affiliation of the user.
- api_key
Hash for the API key for authorization to API or login.
- api_salt
Salt for the API key.
- auth_ids
List of identifiers used by e.g. Elixir AAI.
Saved as strings.
The general form is email@location.suffix::source, but the style may vary between sources.
Any of the auth_ids can be used with the API key.
- email
Email address for the user.
Default: Must be set
- email_public
Email to show to public on e.g. generated datasets.
Default: Empty.
- name
Name of the user.
Could also be name of e.g. facility or university.
- orcid
ORCID of the user.
- permissions
A list of the extra permissions the user has (see Permissions).
- url
URL to e.g. a homepage.
If set, it must start with http:// or https://.
Default: Empty
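The rule for url is simple enough to express as a one-line check. The sketch below treats an empty string as "not set", which is an assumption, and the function name `is_valid_user_url` is hypothetical. It checks only the prefix, as described above, and does not validate the rest of the URL.

```python
def is_valid_user_url(url: str) -> bool:
    """Return True if the url field is unset (empty) or has an accepted prefix."""
    # str.startswith accepts a tuple of alternatives.
    return url == "" or url.startswith(("http://", "https://"))


print(is_valid_user_url("https://www.example.org"))  # prints True
print(is_valid_user_url("ftp://example.org"))        # prints False
```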
Log¶
Whenever an entry (order, dataset, collection, or user) is changed, a log should be written.
Only visible to entry owners and admins.
All logs are in the same collection.
The log needs parsing to show changes between different versions of an entry.
A full copy of the new entry is saved.
In case of deletion, _id is saved as data.
Summary¶
Field | Description | Default
---|---|---
_id | UUID of the Entry | Set by system
action | Type of action | Must be non-empty
comment | Short description of the action | Empty
data_type | The modified collection (e.g. order) | Must be non-empty
data | Complete copy of the new entry | Must be non-empty
timestamp | Timestamp for the change | Must be non-empty
user | UUID for the user who performed the action | Must be non-empty
Fields¶
- _id
UUID of the entry.
Set by the system upon entry creation, never modified.
- action
Type of action:
Add
Edit
Delete
- comment
Short description of why the change was made, e.g. “Add Dataset from order X”.
- data_type
The collection that was modified, e.g. order.
- data
Add/edit: full copy of the new/updated document.
Delete: the _id of the document.
- timestamp
The time the action was performed.
- user
_id of the user that performed the action.
Can be set to system for automated actions (e.g. creating a user after OIDC login).
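Because each log stores a full copy of the entry, the changes between two versions have to be derived by comparing consecutive copies. The sketch below shows one way this could look. The helper names `make_log` and `changed_fields`, the exact spelling of the stored action values, and the use of a timezone-aware UTC datetime for timestamp are all assumptions for illustration.

```python
import datetime
import uuid


def make_log(action: str, data_type: str, data, user: str, comment: str = "") -> dict:
    """Build a log document with the fields listed above (sketch only)."""
    return {
        "_id": str(uuid.uuid4()),
        "action": action,          # assumed values: "add", "edit", "delete"
        "comment": comment,
        "data_type": data_type,    # e.g. "order"
        "data": data,              # full copy of the entry, or its _id on delete
        "timestamp": datetime.datetime.now(datetime.timezone.utc),
        "user": user,              # user _id, or "system" for automated actions
    }


def changed_fields(older: dict, newer: dict) -> dict:
    """Map each field that differs between two entry copies to (old, new)."""
    keys = set(older) | set(newer)
    return {
        key: (older.get(key), newer.get(key))
        for key in sorted(keys)
        if older.get(key) != newer.get(key)
    }


first = make_log("add", "order", {"_id": "o-1", "title": "Draft"}, "user-1")
second = make_log("edit", "order", {"_id": "o-1", "title": "Final"}, "user-1")
print(changed_fields(first["data"], second["data"]))
# prints {'title': ('Draft', 'Final')}
```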
Implementation¶
Permissions¶
Permissions are managed by topics.
A user may have multiple topics.
The topics are defined in user.py.
The topics are defined as a dict:
{ 'ENTRY': ('ENTRY', 'ENTRY2'), ... }
Each topic is defined as a key, and any other topics that are considered to cover the same task are included as the value.
- Allows the use of a single topic to require permission for an API endpoint.
permission_required is used to check whether a user has the required permission.
- It is not defined as a decorator, as it may sometimes need to coexist with an ownership check.
- At the beginning of a request, run e.g. user.permission_required('OWNERS_READ').
Current units¶
- LOGGED_IN
Task requires a logged-in user (e.g. show user info). Use the decorator user.login_required.
- DATA_MANAGEMENT
May modify any order, dataset, or project. Includes ORDERS and OWNERS_READ.
- ORDERS
May create, edit, and delete orders if listed as an editor for the order. Includes USER_ADD and USER_SEARCH.
- OWNERS_READ
May access all entity owner information.
- USER_ADD
May add users.
- USER_SEARCH
May list and search for users.
- USER_MANAGEMENT
May modify any user. Includes USER_ADD and USER_SEARCH.
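The topic list above can be written out as a dict in the documented shape. The mapping below is one possible reading, not the contents of user.py: each key is a topic a user may hold, and the value lists the topics it satisfies, copied from the "Includes" statements. The helper `has_permission` is hypothetical and is not the project's permission_required. Inclusion is not followed transitively in this sketch, so DATA_MANAGEMENT does not imply USER_SEARCH here. Whether the real system does so is not stated.

```python
# Assumed direction: held topic -> topics it satisfies (itself included).
PERMISSIONS = {
    "DATA_MANAGEMENT": ("DATA_MANAGEMENT", "ORDERS", "OWNERS_READ"),
    "ORDERS": ("ORDERS", "USER_ADD", "USER_SEARCH"),
    "OWNERS_READ": ("OWNERS_READ",),
    "USER_ADD": ("USER_ADD",),
    "USER_SEARCH": ("USER_SEARCH",),
    "USER_MANAGEMENT": ("USER_MANAGEMENT", "USER_ADD", "USER_SEARCH"),
}


def has_permission(required: str, user_permissions: list) -> bool:
    """Check whether any topic held by the user satisfies `required`."""
    granted = set()
    for topic in user_permissions:
        # Unknown topics grant nothing rather than raising an error.
        granted.update(PERMISSIONS.get(topic, ()))
    return required in granted


print(has_permission("USER_SEARCH", ["ORDERS"]))   # prints True
print(has_permission("OWNERS_READ", ["ORDERS"]))   # prints False
```

Keeping the check as a plain function rather than a decorator matches the note that it may need to be combined with an ownership check inside the request handler.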
CSRF¶
A CSRF cookie with the name _csrf_token is set the first time a request is made to the system. Its value must be included in the header X-CSRFToken for any non-GET request.
All cookies are deleted upon logout.
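On the client side, the CSRF rule amounts to copying the cookie value into a header for every non-GET request. The following sketch assumes the cookies are available as a plain dict. The function name `csrf_headers` is hypothetical.

```python
def csrf_headers(method: str, cookies: dict) -> dict:
    """Return the extra headers needed for a request with the given method."""
    if method.upper() == "GET":
        # GET requests do not need the token.
        return {}
    token = cookies.get("_csrf_token")
    if token is None:
        # The cookie is set on the first request, so a prior request is needed.
        raise ValueError("no _csrf_token cookie; make a GET request first")
    return {"X-CSRFToken": token}


print(csrf_headers("PATCH", {"_csrf_token": "abc123"}))
# prints {'X-CSRFToken': 'abc123'}
```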
Testing¶
All tests are available at backend/tests.
API Keys¶
The keys are generated using secrets.token_hex(48).
Include an 8-byte randomized salt when calculating the hash.
Store the token using hashlib.sha512(token).hexdigest().
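The three steps above can be sketched as follows. How the salt is combined with the key is not specified here, so this example assumes the hex-encoded key and salt are concatenated and decoded to bytes before hashing. The function names are hypothetical, and the constant-time comparison with hmac.compare_digest is an addition of this sketch.

```python
import hashlib
import hmac
import secrets


def generate_api_key() -> tuple:
    """Return (key, salt) as hex strings: a 48-byte key and an 8-byte salt."""
    return secrets.token_hex(48), secrets.token_hex(8)


def hash_api_key(key: str, salt: str) -> str:
    """Hash key and salt with SHA-512 (the combination order is an assumption)."""
    token = bytes.fromhex(key + salt)
    return hashlib.sha512(token).hexdigest()


def verify_api_key(key: str, salt: str, stored_hash: str) -> bool:
    """Compare a presented key against the stored hash in constant time."""
    return hmac.compare_digest(hash_api_key(key, salt), stored_hash)


key, salt = generate_api_key()
stored = hash_api_key(key, salt)   # only the hash and the salt are kept
print(verify_api_key(key, salt, stored))  # prints True
```

Only the hash (api_key) and the salt (api_salt) would be stored on the user document, and the plain key is shown to the user once when it is generated.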
Development¶
System for development¶
Prepare config file¶
Prepare a config.yaml file. Renaming config.yaml.sample to config.yaml and setting the two dev variables to true should be enough.
Build and activate the containers¶
docker-compose up
The system can be accessed in a web browser at localhost:5000.
Add test data¶
Set up a virtual Python environment and install the modules.
python -m venv venv
. venv/bin/activate
pip install -r test/requirements.txt
pip install -r backend/requirements.txt
Randomized test data can be generated by test/gen_test_db.py. Run it using e.g.:
PYTHONPATH=backend python3 test/gen_test_db.py