Why does the execution time of independent code blocks depend on execution order in Scala?

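See also: How do I write a correct micro-benchmark in Java? (11 answers)
I have a program written in Scala, and I want to measure the execution time of several independent code blocks. When I do it the obvious way (inserting System.nanoTime() before and after each block), I observe that the execution time depends on the order of the blocks. The first few blocks always take more time than the others.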

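I created a minimal example that reproduces this behaviour. For simplicity, all code blocks are identical and call hashCode() on each element of an array of integers.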

package experiments

import scala.util.Random

/**
  * Measuring execution time of a code block
  *
  * Minimalistic example
  */
object CodeBlockMeasurement {

  def main(args: Array[String]): Unit = {
    val numRecords = args(0).toInt
    // number of independent measurements
    val iterations = args(1).toInt

    // Changes results a little bit, but not too much
    // val records2 = Array.fill[Int](1)(0)
    // records2.foreach(x => {})

    for (_ <- 1 to iterations) {
      measure(numRecords)
    }
  }

  def measure(numRecords: Int): Unit = {
    // using a new array every time
    val records = Array.fill[Int](numRecords)(new Random().nextInt())
    // block of code to be measured
    def doSomething(): Unit = {
      records.foreach(k => k.hashCode())
    }
    // measure execution time of the code-block
    elapsedtime(doSomething(), "HashCodeExperiment")
  }

  def elapsedtime(block: => Unit, name: String): Unit = {
    val t0 = System.nanoTime()
    // the by-name parameter is evaluated here, so this runs the block
    block
    val t1 = System.nanoTime()
    // print out elapsed time in milliseconds
    println(s"$name took ${(t1 - t0).toDouble / 1000000} ms")
  }
}

After running the program with numRecords = 100000 and iterations = 10, my console looks like this:

HashCodeExperiment took 14.630283 ms
HashCodeExperiment took 7.125693 ms
HashCodeExperiment took 0.368151 ms
HashCodeExperiment took 0.431628 ms
HashCodeExperiment took 0.086455 ms
HashCodeExperiment took 0.056458 ms
HashCodeExperiment took 0.055138 ms
HashCodeExperiment took 0.062997 ms
HashCodeExperiment took 0.063736 ms
HashCodeExperiment took 0.056682 ms

Can someone explain why this happens? Shouldn't all the timings be the same? Which one is the real execution time?

Thanks a lot,
Peter

Environment parameters:
OS: ubuntu 14.04 LTS (64 bit)
IDE: IntelliJ IDEA 2016.1.1 (IU-145.597)
Scala: 2.11.7

Answer

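This is Java's JIT compiler kicking in. Initially the plain bytecode is interpreted, but once a method has been invoked often enough (by default 1.5k / 10k invocations on the Oracle JVM, see -XX:CompileThreshold), the optimizer compiles it to native code, and that native code is what actually runs from then on. This usually results in a considerable performance improvement, which is why the first few measurements are so much slower than the later ones.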
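One common way to keep JIT warm-up out of the numbers is to run the block a number of times untimed before measuring. The following is only a minimal sketch of that idea, not the asker's code: the names WarmupSketch, timeMs and measureAfterWarmup are hypothetical, and the run counts of 20 and 10 are arbitrary assumptions rather than recommended values.

```scala
object WarmupSketch {
  // Time a single evaluation of the block and return elapsed milliseconds.
  def timeMs(block: => Unit): Double = {
    val t0 = System.nanoTime()
    block
    (System.nanoTime() - t0).toDouble / 1000000
  }

  // Evaluate the block `warmupRuns` times without timing it (giving the JIT
  // a chance to compile the hot code), then time `measuredRuns` evaluations.
  def measureAfterWarmup(warmupRuns: Int, measuredRuns: Int)(block: => Unit): Seq[Double] = {
    var i = 0
    while (i < warmupRuns) {
      block
      i += 1
    }
    Seq.fill(measuredRuns)(timeMs(block))
  }

  def main(args: Array[String]): Unit = {
    val records = Array.fill[Int](100000)(scala.util.Random.nextInt())
    // 20 warm-up runs and 10 measured runs are assumed values for illustration
    val timings = measureAfterWarmup(warmupRuns = 20, measuredRuns = 10) {
      records.foreach(k => k.hashCode())
    }
    timings.foreach(t => println(s"HashCodeExperiment took $t ms"))
  }
}
```

How many warm-up runs are enough depends on the JVM and on the code being measured, so the count is best chosen by watching when the timings stop falling. A dedicated harness such as JMH handles warm-up more rigorously.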

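As Ivan mentioned, there is also caching of intermediate bytecode and native code, along with various other mechanisms. One of the most significant is the garbage collector itself, which adds even more variance to individual results. Depending on how heavily the code allocates new objects, performance can drop badly whenever a GC cycle runs, but that is a separate issue.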
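To see whether a particular slow measurement coincided with a garbage collection, the collection counters exposed through java.lang.management can be read before and after the block. This is a rough sketch under that assumption; GcCountSketch, totalGcCount and timeWithGc are hypothetical names, and the counters only say that a collection happened during the run, not how much time it cost.

```scala
import java.lang.management.ManagementFactory

object GcCountSketch {
  // Sum the collection counts of all garbage collectors the JVM reports.
  def totalGcCount(): Long = {
    var total = 0L
    val it = ManagementFactory.getGarbageCollectorMXBeans().iterator()
    while (it.hasNext) {
      // getCollectionCount returns -1 when the count is undefined
      val count = it.next().getCollectionCount()
      if (count > 0) total += count
    }
    total
  }

  // Run the block and return (elapsed milliseconds, collections during the run).
  def timeWithGc(block: => Unit): (Double, Long) = {
    val gcBefore = totalGcCount()
    val t0 = System.nanoTime()
    block
    val elapsedMs = (System.nanoTime() - t0).toDouble / 1000000
    (elapsedMs, totalGcCount() - gcBefore)
  }
}
```

A measurement whose second component is non-zero overlapped with at least one collection and is a candidate outlier.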

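To remove such outliers when micro-benchmarking, it is advisable to benchmark many iterations of the operation, discard the bottom and top 5..10% of the results, and base the performance estimate on the remaining samples.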
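The trimming step described above might be sketched as follows. TrimmedMeanSketch and trimmedMean are hypothetical names, and trimFraction is an assumed parameter standing for the 5..10% mentioned in the answer (0.05 to 0.1).

```scala
object TrimmedMeanSketch {
  // Sort the samples, drop the lowest and highest `trimFraction` of them,
  // and return the mean of what remains.
  def trimmedMean(samples: Seq[Double], trimFraction: Double): Double = {
    require(samples.nonEmpty, "need at least one sample")
    require(trimFraction >= 0.0 && trimFraction < 0.5, "trimFraction must be in [0, 0.5)")
    val sorted = samples.sorted
    // number of samples to discard at each end (rounded down)
    val drop = (sorted.size * trimFraction).toInt
    val kept = sorted.slice(drop, sorted.size - drop)
    kept.sum / kept.size
  }
}
```

With only ten samples, as in the question, a trimFraction of 0.1 discards just one value at each end, so the 7 ms second run would still be kept. Trimming works best with many samples and with the warm-up runs already excluded.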
