Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reenable high accuracy timers #285

Merged
merged 12 commits into from
Nov 25, 2014
4 changes: 3 additions & 1 deletion ompi/mpi/c/wtick.c
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,9 @@ double MPI_Wtick(void)
{
OPAL_CR_NOOP_PROGRESS();

#if OPAL_TIMER_USEC_NATIVE
#if OPAL_TIMER_CYCLE_NATIVE
return opal_timer_base_get_freq();
#elif OPAL_TIMER_USEC_NATIVE
return 0.000001;
#else
/* Otherwise, we already return usec precision. */
Expand Down
4 changes: 3 additions & 1 deletion ompi/mpi/c/wtime.c
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,9 @@ double MPI_Wtime(void)
{
double wtime;

#if OPAL_TIMER_USEC_NATIVE
#if OPAL_TIMER_CYCLE_NATIVE
wtime = ((double) opal_timer_base_get_cycles()) / opal_timer_base_get_freq();
#elif OPAL_TIMER_USEC_NATIVE
wtime = ((double) opal_timer_base_get_usec()) / 1000000.0;
#else
/* Fall back to gettimeofday() if we have nothing else */
Expand Down
37 changes: 20 additions & 17 deletions opal/include/opal/sys/amd64/timer.h
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
* Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2005 The University of Tennessee and The University
* Copyright (c) 2004-2014 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
Expand All @@ -25,29 +25,32 @@ typedef uint64_t opal_timer_t;

#if OPAL_GCC_INLINE_ASSEMBLY

#if 0
/**
* http://www.intel.com/content/www/us/en/intelligent-systems/embedded-systems-training/ia-32-ia-64-benchmark-code-execution-paper.html
*/
static inline opal_timer_t
opal_sys_timer_get_cycles(void)
{
opal_timer_t ret;

__asm__ __volatile__("rdtsc" : "=A"(ret));

return ret;
}

unsigned a, d;
#if 0
__asm__ __volatile__ ("cpuid\n\t"
"rdtsc\n\t"
: "=a" (a), "=d" (d)
:: "%rax", "%rbx", "%rcx", "%rdx");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want to drop the % characters in the clobber list (two places in this file, once in ia32/timer.h). Untested, but that's my recollection of the syntax.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The syntax seems to be correct. The problem is that I over-specified the
clobber list (rax and rdx) were already implied by the "=a" and "=d" one
line above.

George.

On Mon, Nov 24, 2014 at 3:29 PM, Dave Goodell notifications@github.com
wrote:

In opal/include/opal/sys/amd64/timer.h:

static inline opal_timer_t
opal_sys_timer_get_cycles(void)
{

- opal_timer_t ret;

- asm volatile("rdtsc" : "=A"(ret));

  • return ret;
    -}
  • unsigned a, d;
    
    +#if 0
  • **asm** **volatile** ("cpuid\n\t"
    
  •                       "rdtsc\n\t"
    
  •                       : "=a" (a), "=d" (d)
    
  •                       :: "%rax", "%rbx", "%rcx", "%rdx");
    

I think you want to drop the % characters in the clobber list (two places
in this file, once in ia32/timer.h). Untested, but that's my recollection
of the syntax.


Reply to this email directly or view it on GitHub
https://github.com/open-mpi/ompi/pull/285/files#r20823713.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I addressed all your concerns/suggestions. Thanks, they really make the
code cleaner and hopefully more robust. This is now ready to be merged.

On Mon, Nov 24, 2014 at 3:35 PM, George Bosilca bosilca@icl.utk.edu wrote:

The syntax seems to be correct. The problem is that I over-specified the
clobber list (rax and rdx) were already implied by the "=a" and "=d" one
line above.

George.

On Mon, Nov 24, 2014 at 3:29 PM, Dave Goodell notifications@github.com
wrote:

In opal/include/opal/sys/amd64/timer.h:

static inline opal_timer_t
opal_sys_timer_get_cycles(void)
{

- opal_timer_t ret;

- asm volatile("rdtsc" : "=A"(ret));

  • return ret;
    -}
  • unsigned a, d;
    
    +#if 0
  • **asm** **volatile** ("cpuid\n\t"
    
  •                       "rdtsc\n\t"
    
  •                       : "=a" (a), "=d" (d)
    
  •                       :: "%rax", "%rbx", "%rcx", "%rdx");
    

I think you want to drop the % characters in the clobber list (two
places in this file, once in ia32/timer.h). Untested, but that's my
recollection of the syntax.


Reply to this email directly or view it on GitHub
https://github.com/open-mpi/ompi/pull/285/files#r20823713.

#else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code (in the context) that was in this #else block is buggy. rdtsc must be used with a serializing cpuid instruction to ensure that you are timing what you think you are timing. There's an even better pattern that should be used in general, though I'm not sure we can apply it to MPI_Wtime() because we don't know if we are being called at the beginning of a timing region or at the end. For more information see http://www.intel.com/content/www/us/en/intelligent-systems/embedded-systems-training/ia-32-ia-64-benchmark-code-execution-paper.html


static inline opal_timer_t
opal_sys_timer_get_cycles(void)
{
unsigned a, d;
__asm__ __volatile__ ("rdtsc" : "=a" (a), "=d" (d));
/* If we need higher accuracy we should implement the algorithm proposed
* on the Intel document referenced above. However, in the context of MPI
* this function will be used as the backend for MPI_Wtime and as such
* can afford a small inaccuracy.
*/
__asm__ __volatile__ ("rdtscp\n\t"
"cpuid"
: "=a" (a), "=d" (d)
:: "%rax", "%rbx", "%rcx", "%rdx");
#endif
return ((opal_timer_t)a) | (((opal_timer_t)d) << 32);
}

#endif

#define OPAL_HAVE_SYS_TIMER_GET_CYCLES 1

#else
Expand Down
7 changes: 5 additions & 2 deletions opal/include/opal/sys/ia32/timer.h
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
* Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2005 The University of Tennessee and The University
* Copyright (c) 2004-2014 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
Expand Down Expand Up @@ -30,7 +30,10 @@ opal_sys_timer_get_cycles(void)
{
opal_timer_t ret;

__asm__ __volatile__("rdtsc" : "=A"(ret));
__asm__ __volatile__("cpuid\n"
"rdtsc\n"
: "=A"(ret)
:: "%eax", "%ebx", "%ecx", "%edx");

return ret;
}
Expand Down
4 changes: 2 additions & 2 deletions opal/mca/timer/aix/timer_aix.h
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
* Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2006 The University of Tennessee and The University
* Copyright (c) 2004-2014 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
Expand Down Expand Up @@ -39,7 +39,7 @@ opal_timer_base_get_usec()
retval = (t.tb_high * 1000000) + t.tb_low / 1000;

return retval;
}
}

static inline opal_timer_t
opal_timer_base_get_cycles()
Expand Down
4 changes: 2 additions & 2 deletions opal/mca/timer/altix/timer_altix.h
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
* Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2005 The University of Tennessee and The University
* Copyright (c) 2004-2014 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
Expand Down Expand Up @@ -36,7 +36,7 @@ static inline opal_timer_t
opal_timer_base_get_usec(void)
{
return opal_timer_base_get_cycles() / opal_timer_altix_usec_conv;
}
}


static inline opal_timer_t
Expand Down
15 changes: 9 additions & 6 deletions opal/mca/timer/darwin/timer_darwin.h
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
* Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2005 The University of Tennessee and The University
* Copyright (c) 2004-2014 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
Expand All @@ -26,15 +26,18 @@ typedef uint64_t opal_timer_t;

/* frequency in mhz */
OPAL_DECLSPEC extern opal_timer_t opal_timer_darwin_freq;

OPAL_DECLSPEC extern mach_timebase_info_data_t opal_timer_darwin_info;

static inline opal_timer_t
opal_timer_base_get_cycles(void)
{
if( opal_timer_darwin_info.denom == 0 ) {
(void) mach_timebase_info(&opal_timer_darwin_info);
}
/* this is basically a wrapper around the "right" assembly to get
the tick counter off the PowerPC Time Base. I believe it's
something similar on x86 */
return mach_absolute_time();
return mach_absolute_time() * opal_timer_darwin_info.numer / opal_timer_darwin_info.denom / 1000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Watch out for integer overflow here. I think you want to convert the fraction to a double and then multiply the absolute time by that. See this detailed Stack Overflow answer: http://stackoverflow.com/a/23378064/158513

}


Expand All @@ -43,7 +46,7 @@ opal_timer_base_get_usec(void)
{
/* freq is in Hz, so this gives usec */
return mach_absolute_time() * 1000000 / opal_timer_darwin_freq;
}
}


static inline opal_timer_t
Expand All @@ -53,9 +56,9 @@ opal_timer_base_get_freq(void)
}


#define OPAL_TIMER_CYCLE_NATIVE 1
#define OPAL_TIMER_CYCLE_NATIVE 0
#define OPAL_TIMER_CYCLE_SUPPORTED 1
#define OPAL_TIMER_USEC_NATIVE 0
#define OPAL_TIMER_USEC_NATIVE 1
#define OPAL_TIMER_USEC_SUPPORTED 1

#endif
11 changes: 2 additions & 9 deletions opal/mca/timer/darwin/timer_darwin_component.c
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
* Copyright (c) 2004-2007 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2005 The University of Tennessee and The University
* Copyright (c) 2004-2014 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
Expand All @@ -20,17 +20,15 @@

#include "opal_config.h"

#include <mach/mach_time.h>

#include "opal/mca/timer/timer.h"
#include "opal/mca/timer/darwin/timer_darwin.h"
#include "opal/constants.h"

opal_timer_t opal_timer_darwin_freq;
mach_timebase_info_data_t opal_timer_darwin_info = {.denom = 0};

static int opal_timer_darwin_open(void);


const opal_timer_base_component_2_0_0_t mca_timer_darwin_component = {
/* First, the mca_component_t struct containing meta information
about the component itself */
Expand Down Expand Up @@ -95,11 +93,6 @@ int opal_timer_darwin_open(void)
nanoseconds, taking the reverse of that and multipling by
1000000000 will give you a frequency in cycles / second if you
think of mach_absolute_time() always returning a cycle count.

By the way, it's interesting to note that because these are
library functions and because of how rosetta works, a PPC
binary running under rosetta on an Intel Mac will behave
exactly like an Intel binary running on an Intel Mac.
*/
opal_timer_darwin_freq = sTBI.denom * (1000000000 / sTBI.numer);

Expand Down
6 changes: 3 additions & 3 deletions opal/mca/timer/linux/timer_linux.h
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
* Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2005 The University of Tennessee and The University
* Copyright (c) 2004-2014 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
Expand Down Expand Up @@ -44,7 +44,7 @@ opal_timer_base_get_usec(void)
#else
return 0;
#endif
}
}


static inline opal_timer_t
Expand All @@ -56,7 +56,7 @@ opal_timer_base_get_freq(void)

#define OPAL_TIMER_CYCLE_NATIVE OPAL_HAVE_SYS_TIMER_GET_CYCLES
#define OPAL_TIMER_CYCLE_SUPPORTED OPAL_HAVE_SYS_TIMER_GET_CYCLES
#define OPAL_TIMER_USEC_NATIVE 0
#define OPAL_TIMER_USEC_NATIVE OPAL_HAVE_SYS_TIMER_GET_CYCLES
#define OPAL_TIMER_USEC_SUPPORTED OPAL_HAVE_SYS_TIMER_GET_CYCLES

#endif
4 changes: 2 additions & 2 deletions opal/mca/timer/solaris/timer_solaris.h
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
* Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2005 The University of Tennessee and The University
* Copyright (c) 2004-2014 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
Expand Down Expand Up @@ -35,7 +35,7 @@ opal_timer_base_get_usec(void)
{
/* gethrtime returns nanoseconds */
return gethrtime() / 1000;
}
}

static inline opal_timer_t
opal_timer_base_get_freq(void)
Expand Down