11. Data processing

Version 25.1 by tabermah on 2022-09-08 13:22


11.1. Automatic data processing

Unless you un-check the option for a specific acquisition run, autoprocessing will be executed. For standard rotational data, autoprocessing is done via XDSAPP (Sparta et al. (2016) J. Appl. Cryst. 49, 1085-1092); for strategy calculation we use Mosflm (Leslie (2006) Acta Cryst. D62, 48-57); and Spotfinder (Zhang et al. (2006) J. Appl. Cryst. 39, 112-119) produces the heatmaps for the grid scans. The results are compiled into HTML files to give you a convenient way of checking them. The output files are in the processed folders of the corresponding datasets. A self-refreshing index of all results is created in the beamtime's processed folder. During the beamtime you can open it from file:///gpfs/current/processed/index.html in a browser. The HTML file is also contained in your data backup.


11.2. Manual data processing onsite

Please do not use the control computers of the beamline (haspp11user01-04) or haspp11eval01 for data processing; the latter is now fully dedicated to autoprocessing.

Use the Maxwell cluster for data processing as described in the next chapter; on-site, you can ssh directly to username@desy-ps-cpu. Access to Maxwell requires a DESY Scientific Account.

Manual data processing after the beamtime or during a remote session is only available with a registered DESY Scientific Account.


Accessing data from outside the DESY network:

Accessing the Maxwell cluster is easiest with Max-display: connect to https://max-display.desy.de:3443 in a web browser with the credentials of your Scientific Account.

Access via a terminal window:
ssh username@desy-ps-ext.desy.de
ssh username@desy-ps-cpu

Enter your password when prompted.
Navigate to the data with:
cd /asap3/petra3/gpfs/p11/2020/data/beamtimeID

Data is accessible to users who were registered as participants in the experiment in DOOR before the beamtime ID was opened at the beamline. If you cannot access the data, please contact your local contact (LC) or the beamline manager.

Our processing folders now contain two subfolders, 'full' and 'manual'. 'manual' contains a template with all the correct parameters, and processing will run simply by typing xds_par there. However, to shorten the processing time and make use of the computing infrastructure, you should use a script to queue the job in SLURM. A template script (xds.sh) can be found at /asap3/petra3/gpfs/common/p11/xds.sh
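For orientation, a SLURM job script of this kind typically looks like the sketch below. This is illustrative only, based on standard SLURM directives; the actual template at /asap3/petra3/gpfs/common/p11/xds.sh is authoritative, and the job name, node count, and time limit here are assumptions.

```shell
#!/bin/bash
# Illustrative sketch of an xds.sh-style SLURM job script (assumptions:
# job name, node count, and time limit are placeholders; use the real
# template from /asap3/petra3/gpfs/common/p11/xds.sh for beamtime work).
#SBATCH --partition=psx          # ps for Photon Science internal users
#SBATCH --job-name=xds
#SBATCH --output=xds-job-%j.out  # %j is replaced by the SLURM job ID
#SBATCH --nodes=1
#SBATCH --time=01:00:00

module load xray                 # provides xds_par
xds_par                          # run XDS in the submission folder
```

Submitted with sbatch, the script runs in the folder you submit it from, so launch it from the 'manual' template folder.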

The partition in the script is set to psx for external users (ps should be used by Photon Science internal users).
Copy the script to the folder where you run the processing and launch it with
sbatch xds.sh
SLURM will find a free node for your job, which runs faster than just typing xds_par.
The XDS log is written to the file xds-job-###.out.

nxds can be used for processing serial crystallography data in a similar way to xds. As the computing time is long, nxds should be used via a script that queues the job in SLURM. The template script (nxds.sh) as well as an example input file (nXDS.INP) can be found in /asap3/petra3/gpfs/common/p11/.

Copy the script and nXDS.INP to the folder where you run the processing. The nXDS.INP file needs some editing to match your data path and data collection parameters (wavelength, detector distance, oscillation...). The processing is launched with
sbatch nxds.sh
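The parameter edits can also be scripted. The sketch below creates a stand-in nXDS.INP and fills in example values with sed; the keyword names shown are the familiar XDS-style ones and the values are placeholders (assumptions on both counts; check the template nXDS.INP from the common folder for the exact keywords your version expects).

```shell
# Illustrative only: a stand-in nXDS.INP with XDS-style keywords
# (assumption; the real template from /asap3/petra3/gpfs/common/p11/
# is authoritative). Values below are example placeholders.
cat > nXDS.INP <<'EOF'
X-RAY_WAVELENGTH= XXX
DETECTOR_DISTANCE= XXX
OSCILLATION_RANGE= XXX
EOF

# Fill in the values for your experiment:
sed -i -e 's/^X-RAY_WAVELENGTH=.*/X-RAY_WAVELENGTH= 1.0332/' \
       -e 's/^DETECTOR_DISTANCE=.*/DETECTOR_DISTANCE= 200.0/' \
       -e 's/^OSCILLATION_RANGE=.*/OSCILLATION_RANGE= 0.1/' nXDS.INP
cat nXDS.INP
```

Editing the copy in your processing folder this way keeps the common template untouched.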

A list of the software you can use is at:
https://confluence.desy.de/display/MXW/Photon+Science

You can store analysis results in scratch_cc (temporary/testing results) or processed (final results).
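To illustrate that convention, the minimal sketch below uses a temporary directory as a stand-in for your beamtime folder (an assumption; on Maxwell the root would be your /asap3/petra3/gpfs/p11/... beamtime directory): trial runs live under scratch_cc, and only the final result is copied into processed.

```shell
# Sketch of the scratch_cc vs. processed convention. $ROOT is a local
# stand-in for the beamtime directory (assumption for illustration).
ROOT=$(mktemp -d)
mkdir -p "$ROOT/scratch_cc/trial_1" "$ROOT/processed/best"

# A throwaway processing run goes to scratch_cc ...
echo "trial integration output" > "$ROOT/scratch_cc/trial_1/CORRECT.LP"

# ... and only the final result is promoted to processed.
cp "$ROOT/scratch_cc/trial_1/CORRECT.LP" "$ROOT/processed/best/"
ls "$ROOT/processed/best"
```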

To use crystallographic programs:
module avail
module load xray       # sets the environment for xds and helper GUIs such as XDSAPP
module load ccp4/6.4
module load phenix