<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Offload bug in OpenMP reduction in C main code</title>
  <link rel="alternate" href="https://conferences.xsede.org/c/message_boards/find_thread?p_l_id=&amp;threadId=441380" />
  <subtitle>Offload bug in OpenMP reduction in C main code</subtitle>
  <entry>
    <title>Offload bug in OpenMP reduction in C main code</title>
    <link rel="alternate" href="https://conferences.xsede.org/c/message_boards/find_message?p_l_id=&amp;messageId=441379" />
    <author>
      <name>Kent Milfeld</name>
    </author>
    <id>https://conferences.xsede.org/c/message_boards/find_message?p_l_id=&amp;messageId=441379</id>
    <updated>2013-01-18T13:33:39Z</updated>
    <published>2013-01-18T13:29:27Z</published>
    <summary type="html">Compiler Bug:&lt;br /&gt;The following C code, which uses an OpenMP reduction in an offloaded code block of the main program, gives correct results when fewer than 9 threads are used but incorrect results with 9 or more threads. This occurs with the Intel compiler 13.0.1.117, but not with the previous revision (13.0.1.079). It occurs only when the code block is in the main routine (and the reduction is offloaded). The error does not occur when the block is executed natively or in functions called within an offload, and it does not occur in similar Fortran code. The program below is a sanitized version of the code in the &amp;#034;issue&amp;#034; (bug) reported to Intel. The bug is fixed in the next release, due within 3 weeks of this posting. The compiler command, environment and execution command are also given. If you have any concerns about similar code in your program, please submit a ticket.&lt;br /&gt;&lt;br /&gt;Workaround:&lt;br /&gt;Turn off fast reductions with the compiler option: -offload-option,mic,compiler,&amp;#034;-mP2OPT_hpo_fast_reduction=FALSE&amp;#034;.&lt;br /&gt;&lt;br /&gt;icc -offload-option,mic,compiler,&amp;#034;-mP2OPT_hpo_fast_reduction=FALSE&amp;#034; bug2.c&lt;br /&gt;export MIC_OMP_NUM_THREADS=9&lt;br /&gt;export MIC_PREFIX=MIC&lt;br /&gt;./a.out&lt;br /&gt;&lt;br /&gt;bug2.c code:&lt;br /&gt;&lt;br /&gt;&lt;div class="code"&gt;#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;#include &amp;lt;omp.h&amp;gt;&lt;br /&gt;int main()&lt;br /&gt;{&lt;br /&gt;&amp;nbsp;&amp;nbsp;double sum; int i, n, nt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;n=2000;&lt;br /&gt;&amp;nbsp;&amp;nbsp;sum=0.0e0;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;/* Offload the block to coprocessor 0; sum is copied in and back out. */&lt;br /&gt;#pragma offload target(mic:0) inout(sum)&lt;br /&gt;&amp;nbsp;&amp;nbsp;{&lt;br /&gt;#pragma omp parallel for reduction(+:sum)&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;for(i=1;i&amp;lt;=n;i++)&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;{&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;sum += (double)i;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;}&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;nt = omp_get_max_threads();&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;/* Correct result: n*(n+1)/2 = 2001000 */&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;printf(&amp;#034;Hello MIC reduction %f threads: %d\n&amp;#034;,sum,nt);&lt;br /&gt;&amp;nbsp;&amp;nbsp;}&lt;br /&gt;}&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;icc bug1.c&lt;br /&gt;export MIC_OMP_NUM_THREADS=9&lt;br /&gt;export MIC_PREFIX=MIC&lt;br /&gt;./a.out  # gives incorrect results</summary>
    <dc:creator>Kent Milfeld</dc:creator>
    <dc:date>2013-01-18T13:29:27Z</dc:date>
  </entry>
</feed>

