Millisecond timing error means that your experiment is not working as you intended and that your results might be invalid.

  • Are you always carrying out the experiments you assume you are?
  • Are you aware of millisecond timing error in your own experiments?
  • Are you confident you can replicate experiments using different hardware and software in another lab?
The key question you should be asking yourself is, "Am I confident in my findings and would I be happy for a researcher in the same field to independently check my experiments?"

Are you putting your reputation at risk?

We can help

If you wish to discuss how any of our products could help you improve your research, feel free to email us. Alternatively, contact us by mail, phone or fax.

What do I need to know about millisecond timing accuracy?

If you are a psychologist, neuroscientist or vision researcher who uses a computer to run experiments and reports timings in units of a millisecond, then it's likely your timings are wrong! This can lead to replication failure, spurious results and questionable conclusions. Timing error can affect your work even when you use an experiment generator such as E-Prime, SuperLab, Inquisit, Presentation, Paradigm, OpenSesame or PsychoPy.

Our products' sole aim is to help you improve the quality of your research prior to publication. The Black Box ToolKit v2, for example, helps you check your own millisecond timing accuracy in terms of stimulus presentation accuracy, stimulus synchronization accuracy and response time accuracy, and then tune your experiment to deliver better stimulus and response timing. The mBBTK (event marking version), by contrast, helps you independently TTL event mark or produce TTL triggers to send to other equipment. Our range of response pads and other devices helps you ensure that your response timing is millisecond accurate and consistent.

A summary of the types of millisecond timing error likely to affect your computer-based experiment is shown below:

Idealized experiment shown top; what may happen in reality on your own equipment shown bottom.

Put simply, if you are using a computer to run experiments and report timing measures in units of a millisecond, then it's likely that your presentation and response timings are wrong! Modern computers and operating systems, whilst running much faster than their predecessors, are not designed to offer the user millisecond accuracy. As a result you may not have conducted the experiment you thought you had!

Hardware is designed to be as cheap as possible to mass produce and to appeal to the widest market, whilst multitasking operating systems are designed to offer a smooth user experience and look attractive. No doubt you'll have noticed that your new computer and operating system don't seem to run the latest version of your word processor any faster than your old system!

Don't commercial experiment generator packages solve all my problems?

Unfortunately, using a commercial experiment generator such as E-Prime, SuperLab, Inquisit and the like will not guarantee you accurate timing, as they are designed to run on commodity hardware and operating systems. They all quote millisecond precision, but logically "millisecond precision" refers to the timing units the software reports in. It should not be confused with "millisecond accuracy", i.e. whether events occur in the real world with millisecond accuracy.

If you write your own software you will remain just as uncertain as to its timing accuracy. You should also be wary of in-built time audit measures: because they are derived by the software itself, they can lead to a false sense of security. For example, if you swap a monitor it is impossible for the software to know anything about a TFT panel's timing characteristics, or for that matter about a response device, soundcard or other device you are working with.

It is also impossible to find out which experiment generator offers the most accurate presentation and response timing using generic benchmarks. Often such benchmarks have been conducted using devices such as our BBTK v2, or homemade response hardware, with the experiment generator scripts tuned to give consistent results. The fatal flaw in such an approach is that the authors have tuned the experiment generator to give good results on their own hardware within a very simple script. If you think about it for a moment, what this actually shows is that you should be checking and tuning your own experiment on your own hardware with a BBTK v2 to give better results. Results from generic benchmarks cannot possibly apply to your own hardware and experiment, as yours will be markedly different from those that were benchmarked.

What about switching to Mac/PC/Linux?

It doesn't matter which hardware you work with, PC or Mac, or which operating system you use, Microsoft Windows, Apple's OS X or a variety of Linux: you will succumb to timing error. What's more, it's getting harder to source the equipment you might have used previously. For example, CRT monitors are now virtually impossible to source at a reasonable cost. Input lag can have a huge effect on TFT panels, whereas traditional CRTs don't suffer from this effect and can be well over 20x faster when displaying images. In addition, each TFT make and model has different timing characteristics for input lag and panel response time. This means you should check each and every TFT you use. If you can see or hear it, you know you have a problem!

Human variability and adding more trials

There has been a long-standing argument that human responses are far more variable than the hardware and software itself, and that timing error is therefore of little consequence. In most cases this is only true if the error is truly random, falls within certain limits and you are not interacting with other external hardware. Error that is not random can make carrying out replications difficult due to spurious artifacts and conditional biases. In the same way, carrying out an unspecified number of additional trials will not lessen the effect of any systematic presentation, synchronization or measurement error.
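The difference between random and systematic error can be illustrated with a small simulation. This is only a sketch: the function name `simulate_mean_rt` and every figure in it (a 300 ms true response time, 50 ms human spread, 5 ms equipment jitter, a 30 ms constant delay) are made-up assumptions chosen for illustration, not measurements of any real equipment.

```python
import random
import statistics


def simulate_mean_rt(n_trials, true_rt_ms=300.0, human_sd_ms=50.0,
                     jitter_sd_ms=5.0, bias_ms=0.0, seed=0):
    """Return the mean measured response time over n_trials simulated trials.

    All default values are hypothetical and purely illustrative.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    measured = [
        rng.gauss(true_rt_ms, human_sd_ms)  # human variability
        + rng.gauss(0.0, jitter_sd_ms)      # random, zero-mean equipment jitter
        + bias_ms                           # constant (systematic) equipment delay
        for _ in range(n_trials)
    ]
    return statistics.fmean(measured)


for n in (20, 200, 20000):
    random_only = simulate_mean_rt(n) - 300.0
    with_bias = simulate_mean_rt(n, bias_ms=30.0) - 300.0
    print(f"{n:>6} trials: error with random jitter only {random_only:+.2f} ms, "
          f"with a 30 ms systematic delay {with_bias:+.2f} ms")
```

As the number of trials grows, the error from random jitter shrinks towards zero, but the error in the second column settles at roughly the full 30 ms. More trials average out noise; they do not remove a constant delay.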

Aren't humans pretty slow?

The latest research suggests that humans may actually be able to process information much faster than previously thought. For example, Thurgood et al. (2011) propose that humans can identify animals with only 1 millisecond of visual exposure. To her credit, her team had to develop their own light-emitting diode (LED) tachistoscope to be able to test this. Put simply, off-the-shelf equipment was not fast enough. If differences as small as a millisecond can have an experimental effect, this implies that timing errors in a typical study could also have more of an effect than you might think. In the auditory arena, a lag of just 10 milliseconds can be reliably detected.

Human error when designing experiments

Human error when creating the experimental scripts themselves is also an unrecognized problem. For example, software commonly used for experimental work has a variety of settings which can affect the presentation of both audio and visual stimuli. Often researchers are unsure what impact various settings might have. It is also not unknown for researchers to set incorrect values or introduce bugs into their own code that affect timings. Such errors can be clearly identified and corrected if studies are checked at an early stage.

Do computers lie?

Computers don't, and more to the point can't, always do what you tell them, and you shouldn't blindly rely on the results they give you. For example, you can tell a piece of software used to run experiments to present a priming image for 11 milliseconds whilst playing a tone in the left headphone for 100 milliseconds. You've dialled in the numbers and the computer has accepted them, but the hardware can't possibly do what you've asked due to TFT panel input lag and soundcard start-up latency. The question is, does this make your experiment less valid because you are not running the experiment you thought you were? More shockingly, different hardware and software have wildly different timing characteristics. So if you reran your study with identical stimulus materials and settings but on different hardware, would you be running a different study? Would your results be comparable?
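To put rough numbers on the priming example, the arithmetic can be sketched as follows. The helper `actual_onsets_ms` is hypothetical, and the 25 ms input lag and 40 ms soundcard start-up latency are invented figures for illustration only; the real values for your own monitor and soundcard can only be found by measuring them.

```python
def actual_onsets_ms(display_input_lag_ms, soundcard_latency_ms,
                     requested_image_onset_ms=0.0, requested_tone_onset_ms=0.0):
    """Return (image onset, tone onset, tone-minus-image asynchrony) in ms.

    Each stimulus is assumed to appear at its requested onset plus a
    fixed hardware delay; the delays passed in are hypothetical.
    """
    image_onset = requested_image_onset_ms + display_input_lag_ms
    tone_onset = requested_tone_onset_ms + soundcard_latency_ms
    return image_onset, tone_onset, tone_onset - image_onset


# Image and tone are both requested at 0 ms, i.e. simultaneously.
image, tone, asynchrony = actual_onsets_ms(display_input_lag_ms=25.0,
                                           soundcard_latency_ms=40.0)
print(f"image at {image} ms, tone at {tone} ms, asynchrony {asynchrony} ms")
# prints: image at 25.0 ms, tone at 40.0 ms, asynchrony 15.0 ms
```

With these invented figures, an 11 millisecond prime that appears at 25 ms would already have left the screen by 36 ms, before the tone starts at 40 ms. The script asked for simultaneous stimuli; the participant would have experienced them one after the other.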

Face and faith validity

In computer-based studies, researchers are often prepared to blindly believe what the computer tells them. If the computer reports that a reaction time is 300.14159265 milliseconds then, because there are quite a few digits after the decimal place, on the face of it surely this must be an accurate measure? Well, actually no. All it tells us is that the computer is quite precise, not that it has given you an accurate measure. A wall clock can be 10 minutes slow but still be precise to the second. If we knew this, would we still say the time we read from its face is accurate? If we didn't know the clock was 10 minutes slow, then it would also achieve faith validity. In much the same way, we place our faith in computers being accurate when often they are not.
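The slow wall clock can be mimicked in a few lines. In this sketch the "true" response times stand in for what an independent external check would record, and the 42.14159265 ms delay is an arbitrary assumption; the function name `describe_error` is likewise made up for illustration.

```python
import statistics


def describe_error(measured_ms, true_ms):
    """Return (mean error, spread of error) for paired timings in ms."""
    errors = [m - t for m, t in zip(measured_ms, true_ms)]
    mean_error = statistics.fmean(errors)  # accuracy: how far off on average
    spread = statistics.pstdev(errors)     # consistency: how much the error varies
    return mean_error, spread


# Hypothetical "true" response times from an independent external check.
true_ms = [250.0, 300.0, 350.0, 400.0]
# What the computer reports: many decimal places, but always late by the same amount.
measured_ms = [t + 42.14159265 for t in true_ms]

mean_error, spread = describe_error(measured_ms, true_ms)
print(f"mean error {mean_error:.8f} ms, spread of error {spread:.8f} ms")
```

The reported values carry eight decimal places and the spread of the error is effectively zero, so the numbers look beautifully consistent, yet every one of them is about 42 ms wrong. Nothing in the computer's own output would reveal this.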

In a nutshell

In a nutshell, bad timing will negatively affect the reliability and validity of your experimental work and the results you find. You may also be unable to replicate your own findings over the longer term. The cornerstone of good science is experimental control and replication.

I need to talk specifics

If you would like to discuss the functionality of any of our products or would like to inquire about our consultancy services, feel free to contact us. Please note we are unable to give specific advice on timing unless you are a user of one of our products.

About company

The Black Box Toolkit Ltd was founded in 2003 by a team of psychologists, software experts and electronic engineers.

We are dedicated to improving the millisecond timing accuracy and experimental rigor of researchers in the behavioral and brain sciences.

We provide hardware, software and consultancy solutions across a wide range of fields to make this a reality.

Get in Touch

  • Phone:
    +44 (0)114 3030056
  • Email:
  • Address:
    The Black Box ToolKit Ltd,
    PO Box 3802, Sheffield,
    S25 9AG, UK