1. Overview
This project uses a series of experiments to confirm the cache structure and performance concepts covered in class. The experiments use Dinero (http://pages.cs.wisc.edu/~markhill/DineroIV/), a cache simulation tool developed at the University of Wisconsin. Dinero provides options for cache size, cache block size, cache associativity, and cache replacement policy, so the effect of each on the miss rate can be observed. Dinero IV also classifies misses as compulsory, capacity, or conflict, which shows which kind of miss each factor affects. Experiments are run both with split instruction and data L1 caches only, and with a unified L2 cache added behind those L1 caches, to examine the effect of a two-level cache.
2. Simulation Setup
The applications under test are gcc and bzip. Both executables come from the SPEC CPU2000 benchmark (http://www.spec.org/cpu2000/), and the input files are the benchmark's expr.i and input.graphic, respectively. That is, cache profiling is performed on the following two runs:
./cc1 expr.i
./bzip2 input.graphic
Dinero needs a pinatrace input to run a cache simulation. This input can be produced with the Pin tool provided by Intel. Used together with the pinatrace.so library, Pin runs a binary such as cc1 and records its memory references in a trace file that Dinero can read, like the following:
2 3912200ab0
2 3912200ab3
1 7FFF8FD702C8
2 39122010c0
1 7FFF8FD702C0
The traces are obtained by running
./pin -t pinatrace.so -- cc1 expr.i
./pin -t pinatrace.so -- bzip2 input.graphic
which produces
cc1.pinatrace.out
bzip2.pinatrace.out
Because the trace files grow very large, the trace has to be stopped manually with ctrl-c once it reaches about 500MB. The resulting trace file is reused for all remaining cache simulations, so there is no need to trace again.
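As a rough illustration of the trace format shown above, the sketch below parses "label address" lines and counts references by type. It assumes the classic din label convention (0 = data read, 1 = data write, 2 = instruction fetch); the function names are hypothetical and are not part of Pin or Dinero.

```python
from collections import Counter

# Access-type labels of the classic "din" trace format (an assumption here):
# 0 = data read, 1 = data write, 2 = instruction fetch.
LABELS = {0: "read", 1: "write", 2: "ifetch"}


def parse_trace_line(line):
    """Split one 'label hex-address' line into (label, address)."""
    label_text, addr_text = line.split()
    return int(label_text), int(addr_text, 16)


def count_access_types(lines):
    """Count how many references of each type a trace contains."""
    counts = Counter()
    for line in lines:
        if line.strip():  # skip blank lines
            label, _ = parse_trace_line(line)
            counts[LABELS.get(label, "other")] += 1
    return counts


# The five sample lines from the trace excerpt above.
sample = """2 3912200ab0
2 3912200ab3
1 7FFF8FD702C8
2 39122010c0
1 7FFF8FD702C0""".splitlines()

print(dict(count_access_types(sample)))
```

Under that label convention, the five sample lines are three instruction fetches and two data writes.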
3. cc1 L1 Cache Simulation
First, the full set of simulation results for the L1 caches alone is summarized below:
| Cache (B) | Block (B) | Assoc | Policy | Instr miss rate | Instr misses | Instr comp | Instr capacity | Instr conflict | Data miss rate | Data misses | Data comp | Data capacity | Data conflict |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 8192 | 16 | 4 | LRU | 0.0212 | 7773 | 4365 | 777 | 2631 | 0.1085 | 14548 | 13024 | 1320 | 204 |
| 16384 | 16 | 4 | LRU | 0.0154 | 5634 | 4365 | 373 | 896 | 0.1038 | 13920 | 13024 | 769 | 127 |
| 32768 | 16 | 4 | LRU | 0.0124 | 4545 | 4365 | 70 | 110 | 0.1014 | 13591 | 13024 | 510 | 57 |
| 16384 | 8 | 4 | LRU | 0.0276 | 10099 | 8155 | 597 | 1347 | 0.1821 | 24410 | 23345 | 933 | 132 |
| 16384 | 32 | 4 | LRU | 0.0088 | 3229 | 2372 | 229 | 628 | 0.0562 | 7537 | 6749 | 632 | 156 |
| 16384 | 16 | 2 | LRU | 0.0169 | 6195 | 4365 | 375 | 1455 | 0.1072 | 14379 | 13024 | 770 | 585 |
| 16384 | 16 | 8 | LRU | 0.0132 | 4819 | 4365 | 373 | 81 | 0.1032 | 13843 | 13024 | 782 | 37 |
| 16384 | 16 | 4 | random | 0.0154 | 5634 | 4365 | 373 | 896 | 0.1087 | 14579 | 13024 | 835 | 720 |
ÇϴûöÀ¸·Î Ä¥ÇÑ ºÎºÐÀÌ instruction°ú data cacheÀÇ miss rateÀε¥, instruction cacheÀÇ miss rate´Â ¾à 2%, data cacheÀÇ miss rate´Â ¾à 10% Á¤µµÀÓÀ» ¾Ë ¼ö ÀÖ´Ù. InstructionÀº Ç×»ó loop locality°¡ ÀÖ¾î¼ ³·Àº miss rateÀ» º¸À̰í, cc1Àº compilerÀÌ¾î¼ instruction locality°¡ ³·Àº ÆíÀÏ °ÍÀÌ´Ù. 2%µµ ³·Àº miss rateÀ¸·Î º¸ÀÌÁö¸¸, ³ªÁß¿¡ data compression programÀÎ bzipÀÇ 0.1%¿Í ºñ±³ÇÏ¸é ¹«·Á 20¹è ³ôÀº miss rateÀÓÀ» È®ÀÎÇÒ ¼ö ÀÖ´Ù. ¹Ý¸é compiler¿¡¼ parsing°ú code allocationÀ» ÇÒ ¶§ °°Àº variableÀ» Àç»ç¿ëÇØ¼ data locality°¡ ¾î´À Á¤µµ ÀÖ¾î¼ miss rateÀÌ 10%´ë·Î ³·Àº ¹Ý¸é, ´ë¿ë·® disk data¸¦ ó¸®ÇÏ´Â bzipÀº data miss rate°¡ ÈξÀ ³ôÀº °ÍÀ» È®ÀÎÇÒ ¼ö ÀÖ´Ù.
ÀÌ¾î¼ cache size, block size, associativity, replacement policy °¢°¢ÀÇ ¿µÇâ¿¡ ´ëÇØ Â÷·Ê´ë·Î »ìÆìº¸°Ú´Ù.
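Each row of the results corresponds to one Dinero run over the same trace. As a rough sketch of how such runs could be scripted, the helper below builds a dineroIV argument list for one split-L1 configuration. The helper name dinero_l1_args is hypothetical, and the flag spellings (-l1-isize, -l1-ibsize, -l1-iassoc, -l1-irepl, -l1-iccc, -informat) are written from memory of Dinero IV's usage text, so they should be checked against the installed tool. The report does not list the exact commands that were used.

```python
def dinero_l1_args(size, block, assoc, policy):
    """Build an argument list for split L1 I/D caches with identical geometry.

    Flag names are assumptions based on Dinero IV's documented option scheme
    (-lN-T<option>); verify them against the tool before relying on this.
    """
    repl = {"LRU": "l", "random": "r"}[policy]  # assumed policy codes
    args = ["dineroIV", "-informat", "d"]       # "d": traditional din input (assumed)
    for kind in ("i", "d"):                     # instruction cache, then data cache
        args += [
            f"-l1-{kind}size", str(size),
            f"-l1-{kind}bsize", str(block),
            f"-l1-{kind}assoc", str(assoc),
            f"-l1-{kind}repl", repl,
            f"-l1-{kind}ccc",                   # request the compulsory/capacity/conflict breakdown
        ]
    return args


# Baseline configuration from the table: 16 KB, 16-byte blocks, 4-way, LRU.
print(" ".join(dinero_l1_args(16384, 16, 4, "LRU")))
```

The trace would then be fed to the resulting command on standard input, for example with subprocess.run(args, stdin=open("cc1.pinatrace.out")).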
1) Cache size
| Cache size (B) | Instr miss rate | Instr misses | Instr comp | Instr capacity | Instr conflict | Data miss rate | Data misses | Data comp | Data capacity | Data conflict |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 8192 | 0.0212 | 7773 | 4365 | 777 | 2631 | 0.1085 | 14548 | 13024 | 1320 | 204 |
| 16384 | 0.0154 | 5634 | 4365 | 373 | 896 | 0.1038 | 13920 | 13024 | 769 | 127 |
| 32768 | 0.0124 | 4545 | 4365 | 70 | 110 | 0.1014 | 13591 | 13024 | 510 | 57 |
As the cache size grows, the miss rate falls for both the instruction and data caches. Of the three miss types, capacity and conflict misses decrease. Compulsory misses are cold misses caused by the first reference to a block, so as expected they are unaffected by the cache size. Capacity misses come from the cache simply running out of space, so they decrease, and a larger cache also makes it less likely that a block that will be reused gets replaced, so conflict misses decrease as well.
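The three miss types discussed above can be made concrete with a small classifier. The sketch below is a simplified model, not Dinero's implementation: it assumes LRU replacement, treats a miss as compulsory on the first touch of a block, as a capacity miss if a fully associative LRU cache of the same total size would also miss, and as a conflict miss otherwise. All names are hypothetical.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set holding at most `ways` block numbers, with LRU replacement."""

    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()  # ordered least- to most-recently used

    def access(self, block):
        """Return True on a hit; on a miss, insert the block and evict the LRU one."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        if len(self.blocks) >= self.ways:
            self.blocks.popitem(last=False)
        self.blocks[block] = None
        return False


def classify_misses(addresses, cache_size, block_size, assoc):
    """Count compulsory, capacity, and conflict misses for an address stream."""
    num_blocks = cache_size // block_size
    num_sets = num_blocks // assoc
    sets = [LRUSet(assoc) for _ in range(num_sets)]
    fully_assoc = LRUSet(num_blocks)  # same capacity, modelled as one big set
    seen = set()                      # blocks referenced at least once
    counts = {"compulsory": 0, "capacity": 0, "conflict": 0}
    for addr in addresses:
        block = addr // block_size
        hit = sets[block % num_sets].access(block)
        fa_hit = fully_assoc.access(block)
        if not hit:
            if block not in seen:
                counts["compulsory"] += 1
            elif not fa_hit:
                counts["capacity"] += 1
            else:
                counts["conflict"] += 1
        seen.add(block)
    return counts


# Two blocks that map to the same set of a tiny direct-mapped cache.
print(classify_misses([0, 32, 0], cache_size=32, block_size=16, assoc=1))
```

In the direct-mapped example, the third access misses only because blocks 0 and 2 share a set, so it is counted as a conflict miss; enlarging the cache or raising associativity would remove it, while the two first-touch misses remain.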
2) Block size
| Block size (B) | Instr miss rate | Instr misses | Instr comp | Instr capacity | Instr conflict | Data miss rate | Data misses | Data comp | Data capacity | Data conflict |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 8 | 0.0276 | 10099 | 8155 | 597 | 1347 | 0.1821 | 24410 | 23345 | 933 | 132 |
| 16 | 0.0154 | 5634 | 4365 | 373 | 896 | 0.1038 | 13920 | 13024 | 769 | 127 |
| 32 | 0.0088 | 3229 | 2372 | 229 | 628 | 0.0562 | 7537 | 6749 | 632 | 156 |
As the block size grows, each miss brings more data into the cache, so total misses fall, driven mainly by compulsory misses (data conflict misses actually rise slightly, from 127 to 156, at 32 bytes). In return, besides the overhead of reading a larger block, there is an extra read overhead on a write miss. If these extra overheads are ignored, a larger block size would always look better.
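One way to see why compulsory misses shrink with larger blocks: the number of compulsory misses equals the number of distinct blocks the program touches. The sketch below counts distinct blocks for a hypothetical sequential scan; the access stream is made up for illustration and is not taken from the cc1 trace.

```python
def compulsory_misses(addresses, block_size):
    """Number of distinct blocks touched, i.e. the number of first-touch misses."""
    return len({addr // block_size for addr in addresses})


# Hypothetical access stream: a sequential scan over 1024 bytes, 4 bytes at a time.
scan = range(0, 1024, 4)
for block_size in (8, 16, 32):
    print(block_size, compulsory_misses(scan, block_size))
```

For a purely sequential scan the count halves with every doubling of the block size. In the cc1 results the instruction compulsory misses go 8155, 4365, 2372 for 8, 16, and 32-byte blocks, a little under a factor of two per doubling, which suggests the references are mostly but not entirely contiguous.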
3) Associativity
| Assoc | Instr miss rate | Instr misses | Instr comp | Instr capacity | Instr conflict | Data miss rate | Data misses | Data comp | Data capacity | Data conflict |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2 | 0.0169 | 6195 | 4365 | 375 | 1455 | 0.1072 | 14379 | 13024 | 770 | 585 |
| 4 | 0.0154 | 5634 | 4365 | 373 | 896 | 0.1038 | 13920 | 13024 | 769 | 127 |
| 8 | 0.0132 | 4819 | 4365 | 373 | 81 | 0.1032 | 13843 | 13024 | 782 | 37 |
As expected, higher associativity markedly reduces conflict misses. With a fully associative cache, conflict misses would be zero. However, higher associativity increases the number of entries that must be compared at once and raises hardware complexity, so the trade-off has to be weighed carefully.
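To make the link between associativity and conflicts concrete, the sketch below splits an address into tag, set index, and block offset for a set-associative cache. It assumes the usual modulo indexing on the block number; the function names are hypothetical.

```python
def set_count(cache_size, block_size, assoc):
    """Number of sets in a cache of the given geometry."""
    return cache_size // (block_size * assoc)


def split_address(addr, cache_size, block_size, assoc):
    """Return (tag, set_index, block_offset), assuming modulo set indexing."""
    sets = set_count(cache_size, block_size, assoc)
    block = addr // block_size
    return block // sets, block % sets, addr % block_size


# Geometry of the 16 KB, 16-byte-block caches in the table above.
for assoc in (2, 4, 8):
    print(assoc, set_count(16384, 16, assoc))

# First address of the sample trace, in the 4-way configuration.
print(split_address(0x3912200AB0, 16384, 16, 4))
```

Doubling the associativity at a fixed cache size halves the number of sets but doubles the number of blocks that can coexist in each set, so blocks that previously evicted each other can stay resident together.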
4) Replacement Policy
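| Policy | Instr miss rate | Instr misses | Instr comp | Instr capacity | Instr conflict | Data miss rate | Data misses | Data comp | Data capacity | Data conflict |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LRU | 0.0154 | 5634 | 4365 | 373 | 896 | 0.1038 | 13920 | 13024 | 769 | 127 |
| random | 0.0154 | 5634 | 4365 | 373 | 896 | 0.1087 | 14579 | 13024 | 835 | 720 |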
The replacement policy decides in which order already-cached data is evicted, so it mainly affects conflict misses. As expected, random replacement produces more conflict misses than LRU (720 versus 127 for the data cache). However, exact LRU is hard to implement in hardware, so random replacement is reportedly widely used in practice. In terms of the overall data miss rate (0.1087 versus 0.1038), using random replacement to reduce hardware complexity does not hurt the miss rate much.
For reference, in page replacement the cost of a page fault is very high (millions of cycles), so implementing a complex LRU algorithm in software is well worth it.
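The difference between the two policies can be sketched on a single cache set. The code below is a toy model with hypothetical names, not Dinero's implementation: it counts misses for a block reference sequence under LRU or random victim selection, with the random case driven by a seeded generator so runs are repeatable.

```python
import random


def count_misses(blocks, ways, policy, rng=None):
    """Count misses in one cache set under 'lru' or 'random' replacement."""
    if rng is None:
        rng = random.Random(0)
    resident = []  # for LRU: ordered least- to most-recently used
    misses = 0
    for block in blocks:
        if block in resident:
            if policy == "lru":
                # Refresh recency: move the block to the most-recent end.
                resident.remove(block)
                resident.append(block)
            continue
        misses += 1
        if len(resident) >= ways:
            # LRU evicts the oldest entry; random evicts any resident block.
            victim = 0 if policy == "lru" else rng.randrange(ways)
            resident.pop(victim)
        resident.append(block)
    return misses


# Block 0 is reused between every other reference, which LRU exploits.
pattern = [0, 1, 0, 2, 0, 1, 0, 2]
print("lru", count_misses(pattern, ways=2, policy="lru"))
print("random", count_misses(pattern, ways=2, policy="random"))
```

With this pattern LRU always keeps the frequently reused block 0 resident, whereas random replacement may evict it, depending on the draws. Random needs no per-access recency bookkeeping, which is the hardware saving mentioned above.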
4. bzip L1 Cache Simulation
The full results of simulating the bzip executable with L1 caches only are as follows:
| Cache (B) | Block (B) | Assoc | Policy | Instr miss rate | Instr misses | Instr comp | Instr capacity | Instr conflict | Data miss rate | Data misses | Data comp | Data capacity | Data conflict |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 8192 | 16 | 4 | LRU | 0.0013 | 388 | 388 | 0 | 0 | 0.4869 | 101173 | 101138 | 35 | 0 |
| 16384 | 16 | 4 | LRU | 0.0013 | 388 | 388 | 0 | 0 | 0.4869 | 101173 | 101138 | 35 | 0 |
| 32768 | 16 | 4 | LRU | 0.0013 | 388 | 388 | 0 | 0 | 0.4869 | 101172 | 101138 | 34 | 0 |
| 16384 | 8 | 4 | LRU | 0.0024 | 702 | 702 | 0 | 0 | 0.4991 | 103706 | 103661 | 45 | 0 |
| 16384 | 32 | 4 | LRU | 0.0007 | 218 | 218 | 0 | 0 | 0.4768 | 99079 | 99049 | 30 | 0 |
| 16384 | 16 | 2 | LRU | 0.0014 | 398 | 388 | 0 | 10 | 0.4869 | 101174 | 101138 | 36 | 0 |
| 16384 | 16 | 8 | LRU | 0.0013 | 388 | 388 | 0 | 0 | 0.4869 | 101173 | 101138 | 35 | 0 |
| 16384 | 16 | 4 | random | 0.0013 | 391 | 388 | 0 | 3 | 0.4869 | 101177 | 101138 | 30 | 9 |
| 131072 | 16 | 4 | LRU | 0.0013 | 388 | 388 | 0 | 0 | 0.4867 | 101138 | 101138 | 0 | 0 |
The picture is very different from the cc1 results. First, the instruction cache miss rate is about 0.1%, roughly a tenth or less of cc1's. bzip is a data compression program, so it spends its time in the same loops and has far higher instruction locality than a compiler. The data cache miss rate, on the other hand, approaches 50%. The basic reason is that data compression spends far more of its memory accesses on the data being compressed than on memory allocated by the program itself. In addition, the data being compressed is far larger than the cache (the input.graphic file used in the simulation is 6.6MB), so caching has little chance to help. This fundamental limit shows again in the fact that the miss rate barely falls when the data cache's size, block size, or associativity is increased.
The detailed effect of each factor is still examined separately below.
1) Cache size
| Cache size (B) | Instr miss rate | Instr misses | Instr comp | Instr capacity | Instr conflict | Data miss rate | Data misses | Data comp | Data capacity | Data conflict |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 8192 | 0.0013 | 388 | 388 | 0 | 0 | 0.4869 | 101173 | 101138 | 35 | 0 |
| 16384 | 0.0013 | 388 | 388 | 0 | 0 | 0.4869 | 101173 | 101138 | 35 | 0 |
| 32768 | 0.0013 | 388 | 388 | 0 | 0 | 0.4869 | 101172 | 101138 | 34 | 0 |
| 131072 | 0.0013 | 388 | 388 | 0 | 0 | 0.4867 | 101138 | 101138 | 0 | 0 |
Increasing the cache size has almost no effect, because there are hardly any capacity or conflict misses for a larger cache to remove in the first place. The data cache's compulsory misses stem from the sheer size of the application's input, and the instruction cache likewise has 388 blocks that each have to be referenced for the first time. Growing the cache to 32KB reduces the data cache's capacity misses only from 35 to 34, a negligible effect.
A 128KB cache was also tried, and in that case the data cache's capacity misses drop to 0. Increasing the cache size beyond this point therefore clearly has no further benefit for this application.
2) Block size
| Block size (B) | Instr miss rate | Instr misses | Instr comp | Instr capacity | Instr conflict | Data miss rate | Data misses | Data comp | Data capacity | Data conflict |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 8 | 0.0024 | 702 | 702 | 0 | 0 | 0.4991 | 103706 | 103661 | 45 | 0 |
| 16 | 0.0013 | 388 | 388 | 0 | 0 | 0.4869 | 101173 | 101138 | 35 | 0 |
| 32 | 0.0007 | 218 | 218 | 0 | 0 | 0.4768 | 99079 | 99049 | 30 | 0 |
Increasing the block size reduces all miss counts, as was also seen with cc1. However, it has little effect on the data cache's compulsory misses (even reading 6.6MB in 32B blocks takes roughly 200,000 block reads). Note again that a larger block size carries a penalty in read time and in the extra read on a write miss.
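The block count in the parenthetical above can be checked with a short calculation. The sketch assumes 1 MB = 2^20 bytes and that the whole 6.6MB input is touched once; the helper name is hypothetical.

```python
def blocks_needed(total_bytes, block_size):
    """Number of block_size-byte blocks needed to cover total_bytes (rounded up)."""
    return -(-total_bytes // block_size)


# 6.6 MB input, taking 1 MB = 2**20 bytes (an assumption).
input_bytes = int(6.6 * 1024 * 1024)
for block_size in (8, 16, 32):
    print(block_size, blocks_needed(input_bytes, block_size))
```

Even with 32-byte blocks the input spans a few hundred thousand blocks, each of which costs at least one compulsory miss when first touched, so a larger block size cannot make these misses go away.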
3) Associativity
| Assoc | Instr miss rate | Instr misses | Instr comp | Instr capacity | Instr conflict | Data miss rate | Data misses | Data comp | Data capacity | Data conflict |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2 | 0.0014 | 398 | 388 | 0 | 10 | 0.4869 | 101174 | 101138 | 36 | 0 |
| 4 | 0.0013 | 388 | 388 | 0 | 0 | 0.4869 | 101173 | 101138 | 35 | 0 |
| 8 | 0.0013 | 388 | 388 | 0 | 0 | 0.4869 | 101173 | 101138 | 35 | 0 |
Since there are essentially no conflict misses to begin with (only 10 instruction conflict misses at 2-way, and none at 4-way), increasing associativity brings no real benefit and only adds hardware complexity.
4) Replacement Policy
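| Policy | Instr miss rate | Instr misses | Instr comp | Instr capacity | Instr conflict | Data miss rate | Data misses | Data comp | Data capacity | Data conflict |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LRU | 0.0013 | 388 | 388 | 0 | 0 | 0.4869 | 101173 | 101138 | 35 | 0 |
| random | 0.0013 | 391 | 388 | 0 | 3 | 0.4869 | 101177 | 101138 | 30 | 9 |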
As with cc1, the random policy performs worse than LRU and slightly increases conflict misses. The increase is very small, though, so it looks like an acceptable price for the simpler hardware of random replacement.
5. cc1 L2 Unified Cache Simulation
The results of adding a unified L2 cache behind the L1 caches of the earlier experiments are as follows:
| L2 cache (B) | L2 block (B) | Unified rate | Unified miss | Unified try | Instr rate | Instr miss | Instr try | Data rate | Data miss | Data try |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1048576 | 16 | 0.6307 | 17389 | 27573 | 0.7748 | 4365 | 5634 | 0.5936 | 13024 | 21939 |
| 2097152 | 16 | 0.6307 | 17389 | 27573 | 0.7748 | 4365 | 5634 | 0.5936 | 13024 | 21939 |
| 4194304 | 16 | 0.6307 | 17389 | 27573 | 0.7748 | 4365 | 5634 | 0.5936 | 13024 | 21939 |
| 2097152 | 8 | 0.6307 | 34778 | 55146 | 0.7748 | 8730 | 11268 | 0.5936 | 26048 | 43878 |
| 2097152 | 32 | 0.3307 | 9119 | 27573 | 0.4207 | 2370 | 5634 | 0.3076 | 6749 | 21939 |
| 2097152 | 64 | 0.1777 | 4899 | 27573 | 0.2396 | 1350 | 5634 | 0.1618 | 3549 | 21939 |
First, increasing the L2 cache size has no effect. This is presumably because most L1 cache misses are compulsory misses, so even with an L2 cache at the next level, the requested content cannot already be in L2. In other words, a 1MB L2 cache is already large enough to absorb the capacity and conflict misses that L1 cannot handle, and only compulsory misses get through. This can be seen more clearly from the breakdown of L1 miss types:
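| Cache size (B) | Instr miss rate | Instr misses | Instr comp | Instr capacity | Instr conflict | Data miss rate | Data misses | Data comp | Data capacity | Data conflict |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 16384 | 0.0154 | 5634 | 4365 | 373 | 896 | 0.1038 | 13920 | 13024 | 769 | 127 |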
The L1 instruction cache's total misses (5634) and compulsory misses (4365) match the L2 cache's instruction try and instruction miss counts exactly. For data as well, the L2 miss count (13024) matches the L1 data cache's compulsory misses exactly. It is not clear why the L2 data try count (21939) is larger than the L1 data cache's total misses (13920).
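The correspondence between the L1 and L2 counts can be checked numerically. The sketch below uses the figures from the tables above (16KB L1, 1MB L2, 16-byte blocks) and computes the L2 local miss rate as misses divided by tries; the dictionary layout and function name are only illustrative.

```python
def local_miss_rate(misses, tries):
    """Miss rate of a cache level relative to the accesses that reach it."""
    return misses / tries


# Figures copied from the cc1 tables (16 KB L1, 1 MB L2, 16-byte blocks).
l1 = {"instr": {"misses": 5634, "compulsory": 4365},
      "data": {"misses": 13920, "compulsory": 13024}}
l2 = {"instr": {"miss": 4365, "try": 5634},
      "data": {"miss": 13024, "try": 21939}}

for kind in ("instr", "data"):
    rate = local_miss_rate(l2[kind]["miss"], l2[kind]["try"])
    print(kind,
          round(rate, 4),
          "L2 miss == L1 compulsory:", l2[kind]["miss"] == l1[kind]["compulsory"],
          "L2 try == L1 misses:", l2[kind]["try"] == l1[kind]["misses"])
```

The check reproduces the tabulated L2 rates and shows that only the data try count departs from the corresponding L1 miss count.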
The reduction in miss rate with growing L2 cache size shows up instead at sizes below 1MB:
| L2 cache (B) | L2 block (B) | Unified rate | Unified miss | Unified try | Instr rate | Instr miss | Instr try | Data rate | Data miss | Data try |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 32768 | 16 | 0.8552 | 23580 | 27573 | 0.8976 | 5057 | 5634 | 0.8443 | 18523 | 21939 |
| 65536 | 16 | 0.6660 | 18364 | 27573 | 0.8431 | 4750 | 5634 | 0.6205 | 13614 | 21939 |
| 131072 | 16 | 0.6398 | 17642 | 27573 | 0.8021 | 4519 | 5634 | 0.5982 | 13123 | 21939 |
| 1048576 | 16 | 0.6307 | 17389 | 27573 | 0.7748 | 4365 | 5634 | 0.5936 | 13024 | 21939 |
If the L2 cache were 16KB, the same size as the L1 cache, it would be much like having no L2 cache at all. At 32KB and 64KB the L2 cache makes up for the L1 cache's capacity and conflict misses, so the miss rate drops. By 128KB, however, the remaining gap to the 1MB result is already very small (unified miss rate 0.6398 versus 0.6307). At 1MB, as seen above, the L2 cache absorbs all capacity and conflict misses and lets only compulsory misses through.
Changing the L2 cache's block size does improve the miss rate. This is natural, since each miss reads in more data. In exchange, read time gets longer and write misses incur an extra penalty, as already noted twice above. One peculiarity of the block size experiments is the row with an L2 block size of 8 in the first table of this section. The L1 block size is 16, so with an L2 block size of 8 each L1 miss requires two reads from the L2 cache. The miss rate is therefore unchanged, but the try and miss counts are both double those for a block size of 16.
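The doubling of tries for an 8-byte L2 block follows from how one L1 refill maps onto L2 blocks. The sketch below lists the L2 block numbers covered by one L1 block; it is a simplified model with a hypothetical function name and assumes L1 blocks are aligned to their own size.

```python
def l2_requests(l1_block_addr, l1_block_size, l2_block_size):
    """L2 block numbers touched when L1 refills one block starting at l1_block_addr."""
    first = l1_block_addr // l2_block_size
    last = (l1_block_addr + l1_block_size - 1) // l2_block_size
    return list(range(first, last + 1))


# One 16-byte L1 block at address 0x40, against three L2 block sizes.
for l2_block in (8, 16, 32):
    print(l2_block, l2_requests(0x40, 16, l2_block))
```

With an 8-byte L2 block, every 16-byte L1 refill turns into two L2 requests, which matches the table: 55146 tries and 34778 misses are exactly twice the 27573 and 17389 seen with a 16-byte L2 block.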
6. bzip L2 Unified Cache Simulation
For bzip, the results of likewise adding a unified L2 cache behind the L1 caches of the earlier experiments are as follows:
| L2 cache (B) | L2 block (B) | Unified rate | Unified miss | Unified try | Instr rate | Instr miss | Instr try | Data rate | Data miss | Data try |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1048576 | 16 | 0.5111 | 101526 | 198647 | 1 | 388 | 388 | 0.5101 | 101138 | 198259 |
| 2097152 | 16 | 0.5111 | 101526 | 198647 | 1 | 388 | 388 | 0.5101 | 101138 | 198259 |
| 4194304 | 16 | 0.5111 | 101526 | 198647 | 1 | 388 | 388 | 0.5101 | 101138 | 198259 |
| 2097152 | 8 | 0.5111 | 203052 | 397294 | 1 | 776 | 776 | 0.5101 | 202276 | 396518 |
| 2097152 | 32 | 0.4997 | 99271 | 198647 | 0.5619 | 218 | 388 | 0.4996 | 99053 | 198259 |
| 2097152 | 64 | 0.4940 | 98136 | 198647 | 0.3557 | 138 | 388 | 0.4943 | 97998 | 198259 |
The L2 cache simulation results for bzip are not very different from those for cc1. The miss rate falls as the block size grows, and when the L2 block size is smaller than the L1 block size, the try and miss counts double accordingly (the row with an L2 block size of 8). Since most L1 cache misses in bzip are compulsory misses, enlarging the L2 cache does not help. For instructions in particular, every L1 instruction cache miss is a compulsory miss, so the L2 cache can only miss 100% of the time. For bzip too, if the L2 cache size were brought close to the L1 cache size, growing the L2 cache would absorb capacity and conflict misses and reduce the miss rate slightly. But capacity and conflict misses make up less than 0.1% of all misses in bzip, so that effect would be negligible.
7. Summary
Through these Dinero cache simulations, it was possible to confirm experimentally how cache size, block size, associativity, and replacement policy each affect compulsory, capacity, and conflict misses. Dinero reports only the cache miss rate, so when choosing the best design one also has to keep in mind that hit time, miss time, and hardware complexity change along with these four parameters.
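One common way to fold hit time and miss time into the comparison is average memory access time (AMAT). The sketch below applies the standard two-level AMAT formula to miss rates taken from the cc1 tables; the latencies (1, 10, and 100 cycles) are hypothetical placeholders, not measurements from this experiment.

```python
def amat_two_level(l1_hit, l1_miss_rate, l2_hit, l2_local_miss_rate, mem_latency):
    """Average memory access time for an L1 backed by an L2 and main memory."""
    return l1_hit + l1_miss_rate * (l2_hit + l2_local_miss_rate * mem_latency)


# Hypothetical latencies in cycles; only the miss rates come from the tables.
L1_HIT, L2_HIT, MEM = 1, 10, 100

# cc1 data accesses: 16 KB L1 (miss rate 0.1038) with a 2 MB L2,
# comparing 16-byte and 64-byte L2 blocks (local miss rates 0.5936 and 0.1618).
for label, l2_rate in (("L2 block 16", 0.5936), ("L2 block 64", 0.1618)):
    print(label, round(amat_two_level(L1_HIT, 0.1038, L2_HIT, l2_rate, MEM), 2))
```

The comparison keeps the memory latency fixed, which flatters the larger block: in practice a 64-byte block takes longer to transfer, so the miss penalty would have to be raised for that case before drawing a conclusion.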
The comparison of cc1 and bzip also confirmed experimentally that cache performance varies greatly from one application to another. Overall, this was a useful experiment for revisiting the cache concepts learned in class and for getting a feel for how they play out differently across applications.
8. References
http://pages.cs.wisc.edu/~markhill/DineroIV/
http://rogue.colorado.edu/pin/