<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Jobs running abnormally slow on Comet</title>
  <link rel="alternate" href="https://conferences.xsede.org/c/message_boards/find_recent_posts?p_l_id=" />
  <subtitle>Jobs running abnormally slow on Comet</subtitle>
  <entry>
    <title>Jobs running abnormally slow on Comet</title>
    <link rel="alternate" href="https://conferences.xsede.org/c/message_boards/find_message?p_l_id=&amp;messageId=1585876" />
    <author>
      <name>Joseph Andrew Barranco</name>
    </author>
    <id>https://conferences.xsede.org/c/message_boards/find_message?p_l_id=&amp;messageId=1585876</id>
    <updated>2017-06-25T18:40:07Z</updated>
    <published>2017-06-25T18:38:47Z</published>
    <summary type="html">Recently, some jobs on Comet seem to be running a factor of 4 slower than usual (that is, they run slowly from the start and throughout the entire calculation).  I have cancelled such jobs and started them over, and then they run at the expected speed.  This has only occurred since I started running on 32 nodes (768 cores).  What could cause the exact same code to behave like this?&lt;br /&gt;&lt;br /&gt;I am going to start keeping track of which nodes are being used...  Could it be a problem with one faulty, slow node?  My code runs at the pace of the slowest node.&lt;br /&gt;&lt;br /&gt;Any advice on how I might diagnose this?&lt;br /&gt;&lt;br /&gt;For the record, this code has run successfully on Stampede (TACC) and Pleiades (NASA) without this problem.</summary>
    <dc:creator>Joseph Andrew Barranco</dc:creator>
    <dc:date>2017-06-25T18:38:47Z</dc:date>
  </entry>
</feed>

