Changes for page 00 - How to login to Maxwell
Last modified by flenners on 2025-06-24 16:56
Summary
-
Page properties (2 modified, 0 added, 0 removed)
-
Attachments (0 modified, 0 added, 8 removed)
-
Objects (1 modified, 0 added, 0 removed)
Details
- Page properties
-
- Title
-
... ... @@ -1,1 +1,1 @@
- 00 - How to login to Maxwell
+ 02 - How to login to Maxwell and start RecoGUI
- Content
-
... ... @@ -1,214 +1,159 @@
- DESY has a quite powerful compute cluster, the Maxwell cluster. The documentation can be found here [[https:~~/~~/confluence.desy.de/display/MXW/Maxwell+Cluster>>doc:MXW.MaxwellCluster.WebHome||shape="rect"]]; however, as this can be confusing at times, we will try to condense it into a step-by-step manual.
+ = {{id name="00-HowtologintoMaxwell-ShortVersion:"/}}**Short Version:** =
+ Terminal:
+ (% class="code" %)
+ (((
+ salloc ~-~-partition=psx ~-~-nodes=1 ~-~-time=06:00:00
+ \\ssh max-bla123
+ \\module load anaconda
+ \\source activate ~~/envs/tomopy
+ \\spyder&
+ )))
- {{toc/}}
+ \\
- = {{id name="00-HowtologintoMaxwell-GettingaDESYAccount"/}}Getting a DESY Account =
+ \\
- During your beamtime you will encounter multiple systems, for which you will need two different types of accounts:
+ {{code linenumbers="true" collapse="true"}}
+ salloc --partition=psx --nodes=1 --time=06:00:00
- == {{id name="00-HowtologintoMaxwell-TheDOORAccount"/}}The DOOR Account ==
+ ssh max-bla123
- Before you arrive you have to create a DOOR account and complete all the safety trainings. This account is also used for the gamma-portal, where you can manage your beamtime data, grant access to other users and manage FTP access. However, this account does not work with the other resources; for those you will have to request a second account:
+ module load anaconda
- == {{id name="00-HowtologintoMaxwell-ThePSXAccount"/}}The PSX Account ==
+ source activate ~/envs/tomopy
- If you decide during a beamtime that you want access to the cluster, tell your local contact, and they will request a PSX account for you. With this you will get access to the Kerberos, Windows and AFS resources at DESY, which includes the cluster.
- After you get the account, you have to change the initial password within 6 days. 
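The short-version terminal sequence above can be sketched in plain shell. salloc, ssh and module only exist on Maxwell itself, so this sketch only composes the salloc call; "psx", one node and 6 h are the values from the text, and max-bla123 is a placeholder host name. Note that both dashes of each option must be ASCII hyphens: an en dash pasted from a rich-text page (as in the "–-time" spelling above) makes salloc fail with an unknown-option error.

```shell
# Sketch: compose the salloc invocation from the short version above.
# Assumption: values (partition psx, 1 node, 6 h) are taken from the text.
build_salloc() {
    # $1 = partition, $2 = number of nodes, $3 = wall time
    printf 'salloc --partition=%s --nodes=%s --time=%s\n' "$1" "$2" "$3"
}

build_salloc psx 1 06:00:00
# -> salloc --partition=psx --nodes=1 --time=06:00:00
```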
- For this, go to [[https:~~/~~/passwd.desy.de/>>url:https://passwd.desy.de/||shape="rect"]] and log in with your user name and initial password (you do not need an OTP when you sign in for the first time). Then agree to the terms and change your password.
- = {{id name="00-HowtologintoMaxwell-UsingtheCluster"/}}Using the Cluster =
- == {{id name="00-HowtologintoMaxwell-StructureoftheCluster"/}}Structure of the Cluster ==
- === {{id name="00-HowtologintoMaxwell-Overview"/}}Overview ===
- The Maxwell cluster has (as of 2021) more than 750 nodes. To organize this, you cannot access any node directly; you first have to request compute resources. You can then connect from an entrance node to your compute node.
- === {{id name="00-HowtologintoMaxwell-EntranceNodes"/}}Entrance Nodes ===
- Once you have successfully obtained a PSX account you can get started. The entrance nodes are:
- [[https:~~/~~/max-display.desy.de:3443/auth/ssh>>url:https://max-display.desy.de:3443/auth/ssh||shape="rect"]] (in any case)
- These nodes are **not** for processing, as you share them with many other users. So please do not do anything computationally intensive on them, like reconstruction or visualization. Viewing images is OK.
- === {{id name="00-HowtologintoMaxwell-FastX2"/}}FastX3 ===
- The cluster uses the software FastX3 for connections and virtual desktops. To get the right version, use the web interface, log in, and find the download link for the desktop client in the bottom right corner. The versions have to match exactly to work properly.
- If you want to add a connection in the desktop client, click the plus, select "web", use the address above (including the port) and your user name, and force ssh authentication. Then you can choose whether you want a virtual desktop (XFCE) or a terminal. 
- === {{id name="00-HowtologintoMaxwell-Partitions"/}}Partitions ===
- Starting from an entrance node, you can connect to a compute node. As there are multiple levels of priority etc., the nodes are organized in partitions. You can only access some of these. To see which ones, open a terminal and use the command:
- {{code}}
- my-partitions
+ spyder&
{{/code}}
- Your result will look something like this:
+ \\
- [[image:attach:P5I.User Guide\: NanoCT.4\. Reconstruction Guide.00 - How to login to Maxwell.WebHome@image2021-5-4_10-28-14.png||queryString="version=1&modificationDate=1620116894626&api=v2"]]
+ Spyder:
- == {{id name="00-HowtologintoMaxwell-SLURM"/}}SLURM ==
+ Open RecoGUI,
- Access to the resources of the cluster is managed via a scheduler, SLURM.
+ (Right click on tab: "Set console working directory") (to be removed)
- SLURM schedules the access to nodes and can revoke access if higher-priority jobs arrive.
+ Green Arrow to start program
- === {{id name="00-HowtologintoMaxwell-PSXPartition"/}}PSX Partition ===
+ \\
- Here you cannot be kicked out of your allocation. However, only a few nodes are in this partition and you can only allocate a few in parallel (2021: 5). Some of them have GPUs available.
+ = {{id name="00-HowtologintoMaxwell-LongVersion:"/}}**Long Version:** =
- === {{id name="00-HowtologintoMaxwell-AllPartition"/}}All Partition ===
+ \\
- A very large number of nodes is available and you can allocate many in parallel (2021: 100). However, each allocation can be revoked without warning if someone with higher priority comes along. This happens very often. If you want to use this partition, be sure to design your job accordingly. Only CPU nodes.
+ **Login to max-nova**: E.g. 
+ from browser [[https:~~/~~/max-nova.desy.de:3443/>>url:https://max-nova.desy.de:3443/auth/ssh||shape="rect"]]
- === {{id name="00-HowtologintoMaxwell-AllgpuPartition"/}}Allgpu Partition ===
+ \\
- Like all, but with GPUs
+ Click on "**Launch Session**" and the "**XFCE**" icon
- === {{id name="00-HowtologintoMaxwell-JhubPartition"/}}Jhub Partition ===
+ [[image:attach:image2021-4-27_13-55-52.png||height="250"]]
- For Jupyter Hub
+ \\
+ **Open a Terminal**, e.g. from the icon at the bottom of your desktop. You can also open it via right click → "Open Terminal here" directly on your desktop or from any folder.
- == {{id name="00-HowtologintoMaxwell-ConnectingtotheCluster"/}}Connecting to the Cluster ==
+ [[image:attach:image2021-4-27_13-58-35.png||height="250"]]
- Connect to an entrance node via FastX. You will automatically be assigned to a node by a load balancer when you start a session (max-display001-003, max-nova001-002).
+ \\
- [[image:attach:P5I.User Guide\: NanoCT.4\. Reconstruction Guide.00 - How to login to Maxwell.WebHome@image2021-4-27_13-55-52.png||queryString="version=1&modificationDate=1619524552546&api=v2"]]
+ Now you can **allocate a node** for yourself, so you will have enough memory and power for your reconstruction.
- Choose a graphic interface and look around.
+ (% class="code" %)
+ (((
+ salloc ~-~-partition=psx ~-~-nodes=1 ~-~-time=06:00:00
+ )))
+ \\
- == {{id name="00-HowtologintoMaxwell-DataStorage"/}}Data Storage ==
+ You will get a node for 6 hours; you can also choose longer or shorter times.
- The Maxwell cluster has many storage systems. The most important are:
+ It can take some time before you get a node; then it will tell you which node is reserved for you. (Example: max-exfl069)
- Your user folder: this has a hard limit of 30 GB. 
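The allocation step above can be scripted: once salloc reports the granted node, its name can be extracted and reused for the ssh step. The message format in this sketch is an assumption modelled on the example above (max-exfl069); check what your salloc actually prints before relying on it.

```shell
# Hedged sketch: pull the granted host name out of a salloc message so the
# following "ssh <node>" step can be automated. The message text is assumed.
msg='salloc: Nodes max-exfl069 are ready for job'
node=$(printf '%s\n' "$msg" | sed -n 's/^salloc: Nodes \([^ ]*\) are ready for job$/\1/p')
echo "$node"   # the node to ssh into, e.g. max-exfl069
```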
- Be sure not to exceed this.
+ \\
- The GPFS: here all the beamtime data are stored.
+ Now you can **log in via ssh** on this node:
- === {{id name="00-HowtologintoMaxwell-GPFS"/}}GPFS ===
+ (% class="code" %)
+ (((
+ ssh max-exfl069
+ )))
- Usually you can find your data at: /asap3/petra3/gpfs/<beamline>/<year>/data/<beamtime_id>
+ Enter your password.
- In there you will find a substructure:
+ \\
- * raw: raw measurement data. Only the applicant and the beamtime leader can write/delete there
- * processed: for all processed data
- * scratch_cc: scratch folder without backup
- * shared: for everything else
+ EXAMPLE:
- The GPFS has regular snapshots. The whole capacity is huge (several PB).
+ [[image:attach:image2021-4-27_13-52-11.png||height="125"]]
- == {{id name="00-HowtologintoMaxwell-HowtoGetaComputeNode"/}}How to Get a Compute Node ==
+ Hint: Please use partition=psx; if you use =all, the connection might close while you are working if someone with higher priority needs the node you are working on.
- If you want to do some processing, there are two ways to start a job in SLURM:
+ \\
- 1. Interactive
- 1. Batch
+ Now you are on a different node [[image:http://confluence.desy.de/s/de_DE/7901/4635873c8e185dc5df37b4e2487dfbef570b5e2c/_/images/icons/emoticons/smile.svg||title="(Smile)" border="0" class="emoticon emoticon-smile"]].
- In both cases you are the only person working on the node, so use it as much as you like.
+ \\
- === {{id name="00-HowtologintoMaxwell-StartinganInteractiveJob"/}}Starting an Interactive Job ===
+ You first have to **load the anaconda module:**
- To get a node you have to allocate one via SLURM, e.g. 
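The GPFS layout described above can be assembled mechanically from the documented path pattern. In this sketch, "p05", "2021" and "11012345" are made-up example values; substitute your own beamline, year and beamtime id.

```shell
# Sketch of the GPFS path pattern from the text:
# /asap3/petra3/gpfs/<beamline>/<year>/data/<beamtime_id>
beamtime_path() {
    # $1 = beamline, $2 = year, $3 = beamtime id (all placeholders)
    printf '/asap3/petra3/gpfs/%s/%s/data/%s\n' "$1" "$2" "$3"
}

base=$(beamtime_path p05 2021 11012345)
echo "$base"          # -> /asap3/petra3/gpfs/p05/2021/data/11012345
for sub in raw processed scratch_cc shared; do
    echo "$base/$sub" # the four standard subfolders
done
```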
- use:
+ (% class="code" %)
+ (((
+ module load anaconda/3
+ \\
+ )))
- {{code}}
- salloc -N 1 -p psx -t 1-05:00:00
- {{/code}}
+ and **activate your virtual environment**, depending on where you installed it:
- Looking at the individual options:
- * salloc: specifies that you want a live allocation
- * -N 1: for one node
- * -p psx: on the psx partition. You can also add multiple partitions separated by commas: -p psx,all
- * -t 1-05:00:00: for a duration of 1 day and 5 h
- * (((
- Other options could be: ~-~-mem=500GB for at least 500 GB of memory,
(% class="code" %)
(((
- if you need a GPU: {{code language="none"}}--constraint=P100{{/code}}
+ source activate ~~/envs/tomopy
)))
- )))
- * ... see the SLURM documentation for more options
- If your job is scheduled you see your assigned node and can connect to it via ssh. (In the rare case where you do not see anything, use my-jobs to find out the host name.)
+ \\
- === {{id name="00-HowtologintoMaxwell-Startingabatchjob"/}}Starting a batch job ===
+ ~~/ takes you back to your home directory. In this case, the environment "tomopy" was installed in the home directory in the folder "envs".
- For a batch job you need a small shell script describing what you want to do. 
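The SLURM time spec used above mixes days and clock time, which is easy to misread. A small sketch, converting "D-HH:MM:SS" (or plain "HH:MM:SS") into seconds, makes explicit what "-t 1-05:00:00" requests (1 day and 5 hours); the function name and helper are our own, not part of SLURM.

```shell
# Sketch: convert a SLURM wall-time spec into seconds.
slurm_time_to_seconds() {
    t=$1 d=0
    case "$t" in
        *-*) d=${t%%-*}; t=${t#*-} ;;   # split off the day count, if any
    esac
    oldIFS=$IFS; IFS=:; set -- $t; IFS=$oldIFS
    # 10# forces base 10 so fields like "05" are not parsed as octal
    echo $(( 10#$d * 86400 + 10#$1 * 3600 + 10#$2 * 60 + 10#$3 ))
}

slurm_time_to_seconds 1-05:00:00   # -> 104400 (1 day + 5 h)
```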
- You do not see the job directly, but the output is written to a log file (and results can be stored on disk).
+ \\
- With a batch job you can also start an array job, where the same task is executed on multiple servers in parallel.
+ now you can **start spyder**:
- An example for such a script:
+ (% class="code" %)
+ (((
+ spyder&
+ )))
- {{code}}
- #!/bin/bash
- #SBATCH --time 0-01:00:00
- #SBATCH --nodes 1
- #SBATCH --partition all,psx
- #SBATCH --array 1-80
- #SBATCH --mem 250GB
- #SBATCH --job-name ExampleScript
+ \\
+ EXAMPLE: (virtual environment in "envs/p36")
- source /etc/profile.d/modules.sh
- echo "SLURM_JOB_ID $SLURM_JOB_ID"
- echo "SLURM_ARRAY_JOB_ID $SLURM_ARRAY_JOB_ID"
- echo "SLURM_ARRAY_TASK_ID $SLURM_ARRAY_TASK_ID"
- echo "SLURM_ARRAY_TASK_COUNT $SLURM_ARRAY_TASK_COUNT"
- echo "SLURM_ARRAY_TASK_MAX $SLURM_ARRAY_TASK_MAX"
- echo "SLURM_ARRAY_TASK_MIN $SLURM_ARRAY_TASK_MIN"
+ [[image:attach:image2021-4-27_13-53-35.png||height="71"]]
- module load maxwell gcc/8.2
+ \\
- .local/bin/ipython3 --pylab=qt5 PathToYourScript/Script.py $SLURM_ARRAY_TASK_ID
+ You can also start another terminal, e.g. if you want to look at your data / reconstructions in Fiji.
- exit
+ \\
+ Hint: You can check your partitions via
- {{/code}}
+ (% class="code" %)
+ (((
+ my-partitions
+ )))
+ You should have access to psx. 
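The array-job idea in the batch script can be sketched locally: each of the 80 tasks reads its own $SLURM_ARRAY_TASK_ID (set by SLURM inside the job) and picks its slice of the work. Since sbatch only exists on the cluster, the variable is faked here; the slicing helper is our own illustration, not part of SLURM.

```shell
# Sketch: split N items across the tasks of an "--array 1-80" job.
task_slice() {
    # $1 = task id (1-based), $2 = number of tasks, $3 = number of items
    per=$(( ($3 + $2 - 1) / $2 ))      # items per task, rounded up
    first=$(( ($1 - 1) * per + 1 ))
    last=$(( $1 * per ))
    if [ "$last" -gt "$3" ]; then last=$3; fi
    echo "$first $last"                # index range handled by this task
}

SLURM_ARRAY_TASK_ID=3                  # set by SLURM when the real job runs
task_slice "$SLURM_ARRAY_TASK_ID" 80 800   # -> "21 30"
```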
- To run this use
+ [[image:attach:image2021-5-4_10-28-14.png]]
- {{code}}
- sbatch ./your_script.sh
- {{/code}}
+ \\
+ For further information also check:
- === {{id name="00-HowtologintoMaxwell-Viewingyouallocations"/}}Viewing your allocations ===
+ [[doc:IS.Maxwell.WebHome]]
- To view your pending or running allocations you can use:
+ \\
- {{code}}
- squeue -u <username>
- or
- my-jobs
- {{/code}}
- === {{id name="00-HowtologintoMaxwell-Whatisrealisticintermsofresources"/}}What is realistic in terms of resources ===
- To be fair, you will not get 100 nodes every time you want them. Especially during a user run, the machines are often quite busy. But if you design your scripts to be tolerant to sudden cancellation, it is still worth trying whether you profit from massive parallelization.
- If you want to do some small processing, use one of the psx nodes. This should work most of the time.
- == {{id name="00-HowtologintoMaxwell-GrantingDataAccesstootherBeamtimes"/}}Granting Data Access to other Beamtimes ==
- If you have to add other users to a past beamtime, this can be done via the gamma-portal (by the PI, the beamtime leader or a beamline scientist). After adding the accounts, these people have to make sure to log off from **all** FastX sessions etc. to update the permissions.
+ \\
- image2021-4-27_13-51-12.png
-
- Author
-
... ... @@ -1,1 +1,0 @@ 1 -XWiki.flenners - Size
-
... ... @@ -1,1 +1,0 @@ 1 -349.0 KB - Content
- image2021-4-27_13-51-35.png
-
- Author
-
... ... @@ -1,1 +1,0 @@ 1 -XWiki.flenners - Size
-
... ... @@ -1,1 +1,0 @@ 1 -349.0 KB - Content
- image2021-4-27_13-52-11.png
-
- Author
-
... ... @@ -1,1 +1,0 @@ 1 -XWiki.flenners - Size
-
... ... @@ -1,1 +1,0 @@ 1 -47.1 KB - Content
- image2021-4-27_13-53-35.png
-
- Author
-
... ... @@ -1,1 +1,0 @@ 1 -XWiki.flenners - Size
-
... ... @@ -1,1 +1,0 @@ 1 -24.7 KB - Content
- image2021-4-27_13-55-52.png
-
- Author
-
... ... @@ -1,1 +1,0 @@ 1 -XWiki.flenners - Size
-
... ... @@ -1,1 +1,0 @@ 1 -99.9 KB - Content
- image2021-4-27_13-58-35.png
-
- Author
-
... ... @@ -1,1 +1,0 @@ 1 -XWiki.flenners - Size
-
... ... @@ -1,1 +1,0 @@ 1 -285.1 KB - Content
- image2021-5-4_10-27-13.png
-
- Author
-
... ... @@ -1,1 +1,0 @@ 1 -XWiki.flenners - Size
-
... ... @@ -1,1 +1,0 @@ 1 -278.7 KB - Content
- image2021-5-4_10-28-14.png
-
- Author
-
... ... @@ -1,1 +1,0 @@ 1 -XWiki.flenners - Size
-
... ... @@ -1,1 +1,0 @@ 1 -278.8 KB - Content
- Confluence.Code.ConfluencePageClass[0]
-
- Id
-
... ... @@ -1,1 +1,1 @@
- 204941497
+ 230766014
- Title
-
... ... @@ -1,1 +1,1 @@
- 00 - How to login to Maxwell
+ 02 - How to login to Maxwell and start RecoGUI
- URL
-
... ... @@ -1,1 +1,1 @@
- https://confluence.desy.de/spaces/P5I/pages/204941497/00 - How to login to Maxwell
+ https://confluence.desy.de/spaces/P5I/pages/230766014/02 - How to login to Maxwell and start RecoGUI