This lecture illustrates what iRODS is and how you can manage data with iRODS as a user. To this end we wil make use of the icammands.
- A user account on an iRODS 4.1.X system
- icommands client, Installation
The whole tutorial will guide you through the workflow indicated in the figure below. This part is about Step 1 ingesting data and administering data in iRODS via the icommands. You will have the role as an iRODS user. All commands shown in this part are either icommands or shell commands.
First we connect to an iRODS server and authenticate as iRODS user. The user account has to be created by the iRODS admin beforehand.
iinit
When you connect for the first time, you will receive this answer:
One or more fields in your iRODS environment file (irods_environment.json) are
missing; please enter them.
Usually iinit uses the irods_environment.json to retrieve information to which iRODS instance to connect to and which user to use. If the file is incomplete or has not yet been generated you will have to provide this information:
Enter the host name (DNS) of the server to connect to: <ip adrdress or fully qualified hostname>
Enter the port number: 1247
Enter your irods user name: <irodsuser>
Enter your irods zone: <zonename>
The prot numer is standard 1247. The zone name you have been provided when receiving your username and password.
iRODS zone: always contains exsctly one so-called iCAT catalogue, which is a database containing user information, the mapping from physical storage to iRODS logical path for data and hosts metadata attached to data. Resources: Software or Hardware system that stores data. The iRODS system abstracts from the hardware and software so that you, as a user, can put data into certain resources without specific knowledge on the protocols to use. iRODS collections: As a user you have access to a collection, just as a home dorectory in a linux system. In this collection you can create subcollections and store data. You can retrieve and store data and collections by using the iRODS (virtual) path. The iCAT catalogue will take care of the mapping to the actual physical path.
With the following command you can retrieve some information on the iRODS system you are working on:
env
You will see an answer of the system similar to the one below:
NOTICE: Release Version = rods4.1.6, API Version = d
NOTICE: irods_session_environment_file - /home/irodsadmin/.irods/irods_environment.json.2440
NOTICE: irods_user_name - irods
…
NOTICE: irods_zone_name – aliceZone
NOTICE: created irodsHome=/aliceZone/home/irods
NOTICE: created irodsCwd=/aliceZone/home/irods
To see which physical resources are attached to the iRODS instance and what their logical names are, you can use:
ilsresc –l
which will yield:
resource name: demoResc
id: 9101
zone: aliceZone
type: unixfilesystem
class: cache
location: iRODS4.alice
vault: /irodsVault
This tells us that there is only one resource defined which is of type unix file system. The value after vault tells us where our data will be stored physically on the iRODS server. location tells us the hostaname of the iRODS server. Check with:
hostname
on your shell.
The most important icommand will be:
ihelp
This will print out all commands the client knows.
Let's have a look at our current iRODS working directory and list the content.
ils
Since we have not put any data yet, you will receive this as an answer from the system:
/alicetestZone/home/alice:
Let's create a subcollection. You may already know the UNIX command mkdir. iRODS uses a similar command and syntax for creating collections:
imkdir testData
We can remove a file and collection by
irm -r testData
Let's create a file and put it into iRODS
echo "test content" > put1.txt
This creates a test file put1.txt with test content. You can store into iRODS and by this transferring it to the iRODS server (if you are working on a remote computer).
iput put1.txt
We can list the content again, when using the option -L we can also see where iRODS stored the file physically on the server.
ils -L
iRODS will give us the user, which resource it is stored on and as last information the physical path on the iRODS server.
/aliceZone/home/alice:
alice 0 demoResc 13 2016-02-19.13:35 & put1.txt generic /irodsVault/home/alice/put1.txt
iput comes with some useful options.
iput -K put1.txt
Will store a checksum with your file. If you now execute this command iRODS will complain that the file already exists. With the -f option you can force iRODS to overwrite existing data.
iput -K -f put1.txt
If you now list the collection content again with the -L option you caninspect the md5 checksum. You can also specify which subcollection and which resource iRODS should use to store the data.
iput -K -f put1.txt -R demoResc testData
will store the data physically on demoRes and use /alicetestZone/home/alice/testData as logical path.
To retrieve data from iRODS you can use
iget -K -P -f put1.txt
The option -K tells iRODS to verify the chacksum on the fly. To list all options for the command use
iput -h
To remove our put1.txt from iRODS use
irm put1.txt
Let's inspect what happens. If we list the content of our current working collection, we will not find the file, so it seems to be deleted. However, inspecting the trash folder, shows that only the file's physical and logical path was changed. This is what we call a soft delete.
ils -L /aliceZone/trash/home/alice
/aliceZone/trash/home/alice:
alice 0 demoResc 13 2016-02-19.13:52 & put1.txt
d6eb32081c822ed572b70567826d9d9d generic /irodsVault/trash/home/alice/put1.txt
C- /aliceZone/trash/home/alice/Data
That means you can restore the file with the follwing commands.
imv /eveZone/trash/home/eve/testData/put1.txt /eveZone/home/eve/testData
imv can be used to move data and subcollections and to rename them.
imv /eveZone/home/eve/testData/put1.txt /eveZone/home/eve/testData/put2.txt
To remove the file completly from the system, you need to execute
irmtrash
This is called a hard delete. Now the file is removed from the system and from the iCAT catalogue.
With the option -A we can list the accession control list of files and collections.
ils -A
/aliceZone/home/alice:
ACL - alice#aliceZone:own
Inheritance – Disabled
This tells us that /home/alice is only visible by the user alice and the irodsadmin, who has access to all data by default.
Let's create a subcollection, out some data into it and open the collection
imkdir DataCollection
ichmod inherit DataCollection
ichmod ichmod read bobe DataCollection
With ichmod inherit we assure that all data and subcollections in DataCollection will inherit their ACL from the parent collection. After that we grant read-access to another user in the iRODS system. Check the ACL settings of the collection.
ils -A DataCollection
Now we put some data into the collection.
iput -K put1.txt DataCollection
put1.txt ingertited the ACLs from its parent collection. Note that when you change the ACLs of the parent collection, the ACLs of all files and subcollections are not automatically updated!
Our user bob can now list the collection and read put1.txt.
bob@irods4:~$ ils /aliceZone/home/alice/DataCollection
/aliceZone/home/alice/DataCollection:
put1.txt
Important: When giving access to files and subcollections, the parent collection needs also to be read or writable. Copy put1.txt to your home collection and give access to another user.
icp -f DataCollection/put1.txt put1.txt
ichmod read bob put1.txt
If user bob now tries to list or retrieve put1.txt in our home collection he will receive the follwing error, although the ACLs on the file itself have been set correctly.
ERROR: lsUtil: srcPath /aliceZone/home/alice/put1.txt does not exist or user lacks access permission
This is due to the fact, that bob has no read rights on the parent collection /aliceZone/home/
iRODS provides the user with the possibility to create Attribute Value Unit triplets and store them with the data. The triplets are stored in the iCAT catalogue. By that you can add extra information on the data and collections.
We can store information e.g. on the creation data of a file
imeta add -d put1.txt "Date" "Nov 2015"
Here we created an attribute with the name Date and gave it the value Nov 2015. The unit section is left empty.
We can also annotate a collection
imeta add -C DataCollection "Type" "Collection"
To list AVUs stored for a file use:
imeta ls -d put1.txt
The imeta command also allows us to define simple queries.
imeta qu -d Date = "Nov 2015"
For more sophisticated sql-like queries we can use
iquest "select sum(DATA_SIZE), COLL_NAME where COLL_NAME like '/alicetestZone/home/alice/Data%'"
This command sums the sizes of all data in each collection starting with Data. The command iquest already knows some keywords specific to the iRODS environment. You can list them with
iquest attrs