<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Flash scratch storage - what happens if job runs out of walltime?</title>
  <link rel="alternate" href="https://conferences.xsede.org/c/message_boards/find_thread?p_l_id=&amp;threadId=2986038" />
  <subtitle>Flash scratch storage - what happens if job runs out of walltime?</subtitle>
  <entry>
    <title>Flash scratch storage - what happens if job runs out of walltime?</title>
    <link rel="alternate" href="https://conferences.xsede.org/c/message_boards/find_message?p_l_id=&amp;messageId=2986037" />
    <author>
      <name>Carl Lemmon</name>
    </author>
    <id>https://conferences.xsede.org/c/message_boards/find_message?p_l_id=&amp;messageId=2986037</id>
    <updated>2022-05-12T21:24:03Z</updated>
    <published>2022-05-12T21:24:03Z</published>
    <summary type="html">Hi all,&lt;br /&gt;&lt;br /&gt;
I currently have a trial allocation on Expanse, and I am trying to get my software (ORCA) configured as well as possible before I run some benchmarks.&lt;br /&gt;&lt;br /&gt;
I often run restartable jobs that exhaust their walltime and have to be resubmitted. I would like to use the SSD flash scratch storage, accessed under &amp;#034;/scratch/$USER/job/$SLURM_JOB_ID&amp;#034;, to hold my running jobs. The user guide specifies that this space is only accessible while the job is running. After ORCA finishes, whether it succeeds or fails, execution proceeds to the next command in my SLURM script, which moves all of the data back to my home directory. But if the job runs out of walltime while ORCA is still running, that copy command never executes. What can I do to ensure I am not wasting the job's SUs by losing all of its data?&lt;br /&gt;&lt;br /&gt;
Other clusters have an &amp;#034;orphan&amp;#034; folder where such lost data is held for a time. Do XSEDE clusters, or Expanse in particular, offer anything like this?&lt;br /&gt;&lt;br /&gt;
Alternatively, I have used epilog scripts before with TORQUE, and I know SLURM also supports them. Would an epilog script have access to that scratch folder, and if so, could I use one to copy the data back so that the copy always runs, whether or not the job hits its walltime limit?&lt;br /&gt;&lt;br /&gt;
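For concreteness, here is a rough sketch of the kind of in-script workaround I have been considering, assuming Expanse's SLURM supports the standard --signal option of sbatch; the paths, the time values, and the save_results helper are placeholders of mine, not anything from the Expanse guide:&lt;br /&gt;&lt;br /&gt;
&lt;pre&gt;
#!/bin/bash
#SBATCH --time=48:00:00
#SBATCH --signal=B:USR1@300   # send SIGUSR1 to the batch shell 300 s before walltime

# Placeholder paths; adjust to the actual allocation layout.
WORKDIR="/scratch/$USER/job/$SLURM_JOB_ID"
SAVEDIR="$HOME/orca_results/$SLURM_JOB_ID"

# Copy the scratch contents back to home.
save_results() {
    mkdir -p "$SAVEDIR"
    cp -r "$WORKDIR"/. "$SAVEDIR"/
}
# On the pre-walltime warning signal, save the data and stop cleanly.
trap 'save_results; exit 0' USR1

cd "$WORKDIR"
# Run ORCA in the background and wait, so the shell can handle the
# signal while ORCA is still running (B: targets only the batch shell).
orca input.inp &gt; output.out &amp;
wait
save_results   # normal-completion path
&lt;/pre&gt;
&lt;br /&gt;But I would rather rely on whatever mechanism Expanse itself provides than on a workaround like this, so any guidance is appreciated.</summary>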
    <dc:creator>Carl Lemmon</dc:creator>
    <dc:date>2022-05-12T21:24:03Z</dc:date>
  </entry>
</feed>

