
Distributed Data Management at CCIN2P3

The Distributed Data Management (DDM) group is in charge of the system that manages access to ATLAS data distributed at sites all around the world. The system consists of a dataset-based bookkeeping system and a set of local site services to handle data transfers, building upon Grid technologies. The software stack, historically called DQ2, is now Rucio. These pages are dedicated to the use of DDM at CCIN2P3; most general information can be found in the central ATLAS DDM documentation.

Files in the Grid can be referred to by different names: Grid Unique IDentifier (GUID), Logical File Name (LFN), Storage URL (SURL) and Transport URL (TURL). While the GUIDs and LFNs identify a file irrespective of its location, the SURLs and TURLs contain information about where a physical replica is located, and how it can be accessed.
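
As a rough illustration (a minimal sketch only: the GUID below is invented, the SRM host name is a plausible placeholder, and the paths are truncated), the four kinds of names look something like this for a single file:

# illustrative values only -- these do not refer to a real file
GUID="1A2B3C4D-5E6F-7A8B-9C0D-1E2F3A4B5C6D"                              # identifies the file wherever it is
LFN="NTUP_SMWZ.653896._000001.root.1"                                    # logical name, also location independent
SURL="srm://ccsrm.in2p3.fr/pnfs/in2p3.fr/data/atlas/..."                 # where a given physical replica sits
TURL="root://ccxrootdatlas.in2p3.fr:1094/pnfs/in2p3.fr/data/atlas/..."   # how that replica can actually be read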

1. Set-up of DDM

With CVMFS, the same set of commands can be used at various sites, including CCIN2P3 and CERN:

export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source $ATLAS_LOCAL_ROOT_BASE/user/atlasLocalSetup.sh
lsetup rucio
export DQ2_LOCAL_SITE_ID=IN2P3-CC_SCRATCHDISK

The last command tells rucio where you run from, so that data transfers can be optimised. You need a grid proxy, created with voms-proxy-init, to use the DDM commands.
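
For example, assuming your grid certificate is already installed in ~/.globus, a proxy carrying the ATLAS VO extension can be created and checked like this (the 96-hour lifetime is just an illustration):

voms-proxy-init --voms atlas --valid 96:00   # create a proxy with the ATLAS VO extension
voms-proxy-info --all                        # check the remaining lifetime and the VO attributes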

2. dCache at CCIN2P3

dCache is a sophisticated system capable of managing the storage and exchange of hundreds of terabytes of data, transparently distributed among dozens of disk storage nodes or magnetic tape drives in hierarchical storage managers (HSMs). It is jointly developed by DESY and Fermilab. A large dCache system has been deployed in the CCIN2P3 production system for the benefit of ATLAS users at CCIN2P3. The CCIN2P3 dCache system works as a disk cache in front of the HPSS mass storage system.

3. Which data are located at CCIN2P3

The grid space is divided into space tokens with specific access rights; e.g. DATADISK holds official ATLAS-wide datasets from production. Users can write to SCRATCHDISK (where data have a limited lifetime), to the LOCALGROUPDISK of their cloud, or to specific performance/physics tokens with special privileges (e.g. PHYSHIGGS). At CCIN2P3 these space tokens appear in rucio as RSEs named IN2P3-CC_<TOKEN>, as in the sketch below.
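
To see which RSEs (rucio storage elements) exist for the site, one possibility is to filter the full RSE list (a minimal sketch; the grep pattern simply matches the site prefix used elsewhere on this page):

# list all known RSEs and keep only the CCIN2P3 ones
rucio list-rses | grep IN2P3-CC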

The list of datasets available on a given disk space can be obtained with rucio:

rucio list-datasets-rse  IN2P3-CC_LOCALGROUPDISK
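
The full listing can be long; a minimal sketch to narrow it down to your own datasets (user.jdoe is a placeholder scope, replace it with your own grid nickname):

rucio list-datasets-rse IN2P3-CC_LOCALGROUPDISK | grep user.jdoe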

4. How to replicate data

The most common use case is that you have sent your analysis to the grid: jobs have been sent to a certain site and their output saved there on a SCRATCHDISK, where the data lifetime is limited. It is thus necessary to replicate the output to the Tier 3 site. The replication can be requested at submission time with pathena/prun, or requested later (see also the sketch after this list):
  • freeze your dataset
  • go to the Data Transfer Request Interface, R2D2 (formerly DATRI), and follow the instructions for automatic transfer
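
As a sketch of the two command-line alternatives (the dataset name and destination RSE are placeholders, the --destSE option is assumed to be available in your PanDA client version, and site policy on who may write to which space token still applies):

# at submission time: ask pathena/prun to send the job output straight to the chosen RSE
prun --destSE=IN2P3-CC_LOCALGROUPDISK [other usual prun options]
# afterwards: a rucio rule asking for one replica at the destination, equivalent to an R2D2 request
rucio add-rule user.jdoe:mydataset 1 IN2P3-CC_LOCALGROUPDISK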

For archiving purposes, users can copy datasets to LOCALGROUPTAPE, but please make sure there are no small files in the dataset (e.g. log files), as small files do not work well with tapes. Transfers to LOCALGROUPTAPE are subject to approval by the cloud support.
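
Before requesting such a transfer, you may want to inspect the file sizes in the dataset; a minimal sketch (the dataset name is a placeholder):

# prints one line per file with its size, convenient for spotting small log files
rucio list-files user.jdoe:mydataset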

5. How to access data at CCIN2P3

5.1 List of "files"/URLs from a dataset (container)

  • rucio list-file-replicas --rse IN2P3-CC_LOCALGROUPDISK --protocol root mydataset

Please change the server name from ccxrdatlas to ccxrootdatlas in the returned URLs, in order to use the internal xrootd servers at CC instead of the remote access door, which has limited capacity.
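
A minimal sketch that rewrites the server name on the fly (the dataset name is a placeholder; the sed substitution simply applies the renaming described above):

rucio list-file-replicas --rse IN2P3-CC_LOCALGROUPDISK --protocol root mydataset | sed 's/ccxrdatlas/ccxrootdatlas/g'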

5.2 Direct access in Athena

You can access files directly in your job options (JO) with xrootd:

EventSelector.InputCollections = ["root://ccxrootdatlas.in2p3.fr:1094/pnfs/in2p3.fr/data/atlas/datafiles/csc11/recon/csc11.005200.T1_McAtNlo_Jimmy.recon.AOD.v11004103/csc11.005200.T1_McAtNlo_Jimmy.recon.AOD.v11004103._00111.pool.root.1"]

5.3 Direct access in ROOT

Set up a recent version of ROOT, e.g. with
lsetup root

One can then access the file via xrootd:

TFile* xrfile = TFile::Open("root://ccxrootdatlas.in2p3.fr:1094/pnfs/in2p3.fr/data/atlas/atlaslocalgroupdisk/mc11_7TeV/NTUP_SMWZ/e861_s1310_s1300_r3043_r2993_p833/mc11_7TeV.145002.Pythia6_DYmumu_120M250.merge.NTUP_SMWZ.e861_s1310_s1300_r3043_r2993_p833_tid653896_00/NTUP_SMWZ.653896._000001.root.1")
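
To quickly inspect the content of the remote file from the shell, one possible shortcut (assuming your ROOT installation ships the rootls helper and has xrootd support, which recent releases set up by lsetup root normally do) is:

# list the objects (trees, histograms) stored in the remote file
rootls -l root://ccxrootdatlas.in2p3.fr:1094/pnfs/in2p3.fr/data/atlas/atlaslocalgroupdisk/mc11_7TeV/NTUP_SMWZ/e861_s1310_s1300_r3043_r2993_p833/mc11_7TeV.145002.Pythia6_DYmumu_120M250.merge.NTUP_SMWZ.e861_s1310_s1300_r3043_r2993_p833_tid653896_00/NTUP_SMWZ.653896._000001.root.1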

5.4 Local copy

You can copy a file locally, either on your interactive machine or inside your batch job:

xrdcp root://ccxrootdatlas.in2p3.fr:1094/pnfs/in2p3.fr/data/atlas/atlaslocalgroupdisk/mc11_7TeV/NTUP_SMWZ/e861_s1310_s1300_r3043_r2993_p833/mc11_7TeV.145002.Pythia6_DYmumu_120M250.merge.NTUP_SMWZ.e861_s1310_s1300_r3043_r2993_p833_tid653896_00/NTUP_SMWZ.653896._000001.root.1 mylocal_file

5.5 Check file accessibility in CC

If you think that a file is not accessible (whether it is lost or not), here is a list of points to check before contacting the site administrators. The commands refer to the previous sections; see also the sketch after this list.
  • Check the existence of the file: use xrdcp to copy the file locally.
  • Check the accessibility of the file with ROOT: open the file with ROOT commands, but make sure you set up the same ROOT version as in your Athena program.
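
For the first check, the xrdfs client (part of the standard xrootd tools) can also query the server directly without copying anything; a minimal sketch using the example file from above:

# ask the xrootd server for the file status (size, flags); an error here usually means the replica is not reachable
xrdfs ccxrootdatlas.in2p3.fr:1094 stat /pnfs/in2p3.fr/data/atlas/atlaslocalgroupdisk/mc11_7TeV/NTUP_SMWZ/e861_s1310_s1300_r3043_r2993_p833/mc11_7TeV.145002.Pythia6_DYmumu_120M250.merge.NTUP_SMWZ.e861_s1310_s1300_r3043_r2993_p833_tid653896_00/NTUP_SMWZ.653896._000001.root.1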


-- LaurentDuflot - 15 Nov 2013: cleaned up old info and links, and made sure the examples work (Athena access not tested)
