如何解决关于缓存性能的一些问题计算机架构
有关 X5650 处理器的详细信息,请访问 https://www.cpu-world.com/CPUs/Xeon/Intel-Xeon%20X5650%20-%20AT80614004320AD%20(BX80614X5650).html
重要说明: L3缓存大小:12288KB 缓存线大小:64
考虑以下两个函数,它们都将数组中的值增加 100。
void incrementVector1(INT4* v,int n) {
for (int k = 0; k < 100; ++k) {
for (int i = 0; i < n; ++i) {
v[i] = v[i] + 1;
} }
}
void incrementVector2(INT4* v,int n) {
for (int i = 0; i < n; ++i) {
for (int k = 0; k < 100; ++k) {
v[i] = v[i] + 1;
} }
}
The following data collected using the perf utility captures runtime information for executing
each of these functions on the Intel Xeon X5650 processor for various data sizes. In this data: • the program vector1.bin executes the function incrementVector1;
• the program vector2.bin executes the function incrementVector2;
• the programs take a command line argument which sets the value of n;
• both programs begin by allocating an array of size n and initializing all elements to 0. • LLC-loads means “last level cache loads”,the number of accesses to L3;
• LLC-load-misses means “last level cache misses”,the number of L3 cache misses.
Runtime performance of vector1.bin.
Performance counter stats for ’./vector1.bin 1000000’:
230,070 LLC-loads
3,280 LLC-load-misses # 1.43% of all LL-cache references
0.383542737 seconds time elapsed
Performance counter stats for ’./vector1.bin 3000000’:
669,884 LLC-loads
242,876 LLC-load-misses # 36.26% of all LL-cache references
1.156663301 seconds time elapsed
Performance counter stats for ’./vector1.bin 5000000’:
1,234,031 LLC-loads
898,577 LLC-load-misses # 72.82% of all LL-cache references
1.941832434 seconds time elapsed
Performance counter stats for ’./vector1.bin 7000000’:
1,620,026 LLC-loads
1,142,275 LLC-load-misses # 70.51% of all LL-cache references
2.621428714 seconds time elapsed
Performance counter stats for ’./vector1.bin 9000000’:
2,068,028 LLC-loads
1,422,269 LLC-load-misses # 68.77% of all LL-cache references
3.308037628 seconds time elapsed
8
Runtime performance of vector2.bin.
Performance counter stats for ’./vector2.bin 1000000’:
16,464 LLC-loads
1,168 LLC-load-misses # 7.049% of all LL-cache references
0.319311959 seconds time elapsed
Performance counter stats for ’./vector2.bin 3000000’:
42,052 LLC-loads
17,027 LLC-load-misses # 40.49% of all LL-cache references
0.954854798 seconds time elapsed
Performance counter stats for ’./vector2.bin 5000000’:
63,991 LLC-loads
38,459 LLC-load-misses # 60.10% of all LL-cache references
1.593526338 seconds time elapsed
Performance counter stats for ’./vector2.bin 7000000’:
99,773 LLC-loads
56,481 LLC-load-misses # 56.61% of all LL-cache references
2.198810471 seconds time elapsed
Performance counter stats for ’./vector2.bin 9000000’:
120,456 LLC-loads
76,951 LLC-load-misses # 63.88% of all LL-cache references
2.772653964 seconds time elapsed
我有两个问题:
- 考虑 vector1.bin 的缓存未命中率。在向量大小 1000000 和 5000000 之间,缓存未命中率急剧增加。缓存未命中率增加的原因是什么?
- 考虑两个程序的缓存未命中率。请注意,对于任何特定的数组大小,两个程序之间的未命中率大致相等。这是为什么?
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。