Data Set sharing and access-control
Share
The basic principle of the Project - Data Set abstraction is that no data can be copied/moved outside of a
project’s scope, making a project effectively a sandbox for data and programs. However, Data Sets can be shared with one
or more other projects and only data owners have privileges to share Data Sets. To share a Data Set, navigate to
the Data Sets
browser from the services menu on the left, right-click on the Data Set to be shared and from the
Share with
option select either project
or cluster
.
In the case of project
, a popup dialog will then prompt users to select a target project with which the Data
Set is to be shared with. It is also feasible to filter projects by name as shown in the image below.
When sharing a Data Set you can also choose share permissions. Available permissions are shown in the image below.
- Everyone can read: gives read only access to all members of the project the Data Set is shared with.
- Everyone can edit: gives read/write access to all members of the project the Data Set is shared with.
- All Data owners can edit: gives data owners write access and read only access to all other members of the project the Data Set is shared with.
To complete the sharing process, a Data Owner in the target project has to click on the shared Data Set,
and then click on Accept
to complete the process, Reject
to reject the shared Data Set or cancel
for no
action.
Unshare
Data Sets can be unshared
by data owners, either from the parent project or from the project they have been
shared with. To do that, users of the parent project can right-click on the Data Set, click UnShare from
and then
select the project to unshare from. Users of the “shared-with” project, need to right-click on the shared Data Set
and then click Remove DataSet
.
Changing share permission
To change permission of a shared Data Set data owners can right-click on the Data Set and click Share permission
. This will open a popup
dialog with a table listing all projects the Data Set is shared with.
From the shared with dialog you can change share permission or unshare a Data Set from a project.
Data Set access-control
Inviting other users to a project and sharing a Data Set are ways to allow members of an organization to provide
fine-grained access control to original data to other members/departments without having to relinquish full control
over the data. Essentially that means that data organized in Data Sets can be shared with other projects so that
other users of Hopsworks can access and make use of this data in their AI pipelines and programs.
By default, all members of the project the Data Set was created in, are allowed only read-only
access to all files and directories of the Data Set. There are two more access
levels that give higher privileges to members that can access this Data Set. All access-levels are:
- Everyone can read: gives read only access to all members of the project the Data Set is shared with.
- Everyone can edit: gives read/write access to all members of the project the Data Set is shared with.
- All Data owners can edit: gives data owners write access and read only access to all other members of the project the Data Set is shared with.
Changing the access level can be done with a right-click on the Data Set, permissions and click on one of the
available access levels as shown in the figure below.
To check the current access levels select a Data Set and the permission will be shown on the sidebar as shown in the figure below.