DEMO - Working with FLASH data

Last modified by sndueste on 2025/02/05 14:41

Experimental data is recorded as HDF files[link] on the GPFS file system[link]. Access rights are linked to the user's DESY account and can be managed by the PI via the GAMMA portal[link]. The experimental data can be downloaded via the GAMMA portal, but we advise using the DESY computing infrastructure instead. Access points are ssh, the Maxwell-Display Server[link], and JupyterHub[link]. We recommend JupyterHub for data exploration and the SLURM resources for high-performance computing; see FAB for easy usage of the infrastructure.

How to log in to JupyterHub

There are different options that help you work with the FLASH HDF5 data in Python.

See also the collection of demo data and sample scripts: Collection of HDF5 sample data from different beamlines

image2023-9-29_11-1-37.png

older ideas ...

(object oriented) https://gitlab.desy.de/christopher.passow/fdh-builder




under review

conda create -n flashh5 python=3.10  # 3.10 is not required, but 3.8 or higher is preferred
conda activate flashh5
conda install ipython numpy pandas  # TODO: fix dependencies
conda install -c https://www.desy.de/~cpassow/condarepo/ flashh5

## on jhub
conda install ipykernel
python -m ipykernel install --user --name flashh5 --display-name "flashh5"


## to remove on jhub
## delete from: /home/$USER/.local/share/jupyter/kernels/
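
The manual removal step above could also be scripted. The following is a minimal sketch, assuming the kernel was registered with `--user --name flashh5` as shown, so that its spec lives in a directory of the same name under the per-user kernels folder. The function name `remove_user_kernel` and its `kernels_root` parameter are hypothetical and exist only for illustration.

```python
import shutil
from pathlib import Path


def remove_user_kernel(name, kernels_root=None):
    """Delete the kernel spec directory `name`; return True if it was removed."""
    if kernels_root is None:
        # Per-user kernel location from the notes above
        # (assumes the home directory is /home/$USER).
        kernels_root = Path.home() / ".local" / "share" / "jupyter" / "kernels"
    kernel_dir = Path(kernels_root) / name
    if not kernel_dir.is_dir():
        # Nothing registered under that name, so there is nothing to delete.
        return False
    shutil.rmtree(kernel_dir)
    return True
```

Calling `remove_user_kernel("flashh5")` would then delete the kernel registered above. The command `jupyter kernelspec uninstall flashh5` should achieve the same result from the shell.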
moved to repository?
class RunDirectory:

    def get_run_table(self):  # more or less information? RunComment | Number of Files | start & stop time?
        ...

    def get_run(self, daq, run_number):  # daq is not needed!
        ...


class Run:  # constructor: optionally without RunDirectory, or use self.path from there

    def get_files(self):
        ...

    def get_channels(self):  # of file #1
        ...

    def get_start_time(self):  # better as attribute?
        ...

    def get_stop_time(self):  # which? | better as attribute?
        ...

    def to_df(self, daq_map):  # to_df(daq_map, slice) with slice=[0:4] -> raise exception
        ...

    def to_series(self, channel):
        ...

    def to_array(self, channel):
        ...
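
To make the interface above more concrete, here is a minimal sketch of how `get_run_table`, `get_run` and `get_files` might be filled in using only the standard library. It is not the implementation from the repository. It assumes a hypothetical file naming convention in which the run number follows the token `run` in the file name; the pattern `RUN_PATTERN`, the helper `_files_by_run` and the constructor arguments are illustrative assumptions. Following the comment in the sketch, `get_run` takes no `daq` argument here.

```python
import re
from collections import defaultdict
from pathlib import Path

# Hypothetical naming convention: the run number follows the token "run"
# in the file name, e.g. "..._run12_file1.h5". Adjust to the real layout.
RUN_PATTERN = re.compile(r"run(\d+)")


class Run:
    def __init__(self, number, files):
        self.number = number
        self.files = sorted(files)

    def get_files(self):
        """Return the files of this run, sorted by path."""
        return list(self.files)


class RunDirectory:
    def __init__(self, path):
        self.path = Path(path)

    def _files_by_run(self):
        # Group all *.h5 files in the directory by the run number in their name.
        runs = defaultdict(list)
        for file in self.path.glob("*.h5"):
            match = RUN_PATTERN.search(file.name)
            if match:
                runs[int(match.group(1))].append(file)
        return runs

    def get_run_table(self):
        """Return {run_number: number_of_files}, ordered by run number."""
        return {n: len(f) for n, f in sorted(self._files_by_run().items())}

    def get_run(self, run_number):
        files = self._files_by_run().get(run_number)
        if not files:
            raise KeyError(f"no files found for run {run_number}")
        return Run(run_number, files)
```

With this sketch, `RunDirectory(path).get_run_table()` gives a quick overview of which runs are present, and `get_run(n).get_files()` lists the files that a later `to_df` or `to_array` would read.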
ideas
run.to_df(daq_map)
run.to_series(daq_adr or daq_map) # on channel only?
run.to_array(daq_adr) # on channel only? 

## interesting?
# run.to_dask(daq_map)
# run.to_xarray(daq_map)