majithia23 Posted May 2, 2011

Question: Intel's Sandy Bridge launch just brought its desktop CPU line up to 3.8GHz, but I remember that the Pentium 4 got up to 3.8GHz before being cancelled. So why is it that Sandy Bridge is just now getting to the clock speed levels that the Pentium 4 was at years ago? And how is it that Sandy Bridge still manages to outperform the older Pentium 4, even though it has a lower clock speed?

The relationship between clockspeed and performance isn't nearly as straightforward as it used to seem (not that it ever was all that simple). To understand why different CPUs at different clockspeeds perform in different ways, we'll first look at how the CPU processes instructions.

A CPU processes instructions in an assembly-line manner, with different instructions existing in different stages of completion as they move down the line. For instance, each instruction on the original Pentium passes through the following five-stage pipeline:

1. Prefetch/Fetch: Instructions are fetched from the instruction cache and aligned for decoding.
2. Decode1: Instructions are decoded into the Pentium's internal instruction format. Branch prediction also takes place at this stage.
3. Decode2: Same as above. Also, address computations take place at this stage.
4. Execute: The integer hardware executes the instruction.
5. Write-back: The results of the computation are written back to the register file.

An instruction enters the pipeline at stage 1 and leaves it at stage 5. Since the instruction stream that flows into the CPU's front-end is an ordered sequence of instructions that are to be executed one after the other, it makes sense to feed them into the pipeline one after the other. When the pipeline is full, there is an instruction at each stage.

Each pipeline stage takes one clock cycle to complete, so the shorter the clock cycle, the more instructions per second the CPU can push through its pipeline.
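To make the assembly-line picture concrete, here is a minimal sketch in Python. The stage names are the ones listed above; the function names, the instruction count, and the assumption that nothing ever stalls are purely illustrative and not a model of any real CPU.

```python
# Stage names come from the article; everything else is illustrative.
STAGES = ["Prefetch/Fetch", "Decode1", "Decode2", "Execute", "Write-back"]


def occupancy(num_instructions, num_stages=len(STAGES)):
    """Return, for each cycle, which instruction sits in each stage (None = empty).

    Assumes an ideal pipeline: one new instruction enters per cycle, no stalls.
    """
    total_cycles = num_instructions + num_stages - 1
    table = []
    for cycle in range(total_cycles):
        row = []
        for stage in range(num_stages):
            instr = cycle - stage  # instruction i enters the first stage on cycle i
            row.append(instr if 0 <= instr < num_instructions else None)
        table.append(row)
    return table


def completions_per_cycle(table):
    """1 for each cycle in which an instruction occupies the last stage, else 0."""
    return [0 if row[-1] is None else 1 for row in table]


if __name__ == "__main__":
    for cycle, row in enumerate(occupancy(8)):
        cells = " ".join("--" if i is None else f"I{i}" for i in row)
        print(f"cycle {cycle:2d}: {cells}")
```

In this sketch the first few cycles finish nothing while the pipeline fills; after that, one instruction reaches Write-back on every cycle.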
This is why, in general, a faster clockspeed means more instructions per second and therefore higher performance.

Most modern processors, however, divide their pipelines up into many more, smaller stages than the Pentium. The later iterations of the Pentium 4 had some 21 stages in their pipelines. This 21-stage pipeline accomplished the same basic steps (with some important additions for instruction reordering) as the Pentium pipeline above, but it sliced each stage into many small stages. Because each pipeline stage was smaller and took less time, the Pentium 4's clock cycles were much shorter and its clockspeed much higher.

In a nutshell, the Pentium 4 took many more clock cycles to do the same amount of work as the original Pentium, so its clockspeed was much higher for the equivalent amount of work. This is one core reason why there's little point in comparing clockspeeds across different processor architectures and families: the amount of work done per clock cycle is different for each architecture, so the relationship between clockspeed and performance (measured in instructions per second) is different.

Within the same family of processors, however, the clockspeed-to-performance ratio is stable, so a 3.4GHz Core i5 CPU will outperform a 3.1GHz Core i5 CPU, all else being equal.

A closer look: stalls, flushes, and pipeline length

Astute readers who have thought through the above might come to the conclusion that a 3.8GHz Pentium 4 should, on average, still perform as well as a 3.8GHz Sandy Bridge, even if the former's pipeline is longer. Why? Because if both pipelines are full, the number of instructions per second that pop out the end of each pipeline is the same. Think about it: if one company is operating a 21-stage assembly line at one stage per second, and another is operating a 12-stage line at one stage per second, then they'll both still produce one finished product every second when both pipelines are full.
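The assembly-line comparison can be checked with a few lines of Python. This is only a sketch under idealized assumptions (the line starts empty, accepts one new item per cycle, and never stalls); the function names and the 1000-cycle window are made up for illustration.

```python
def finished_by(cycle, depth):
    """Items completed by the end of `cycle` (1-based) on an ideal line of
    `depth` stages that starts empty and never stalls."""
    return max(0, cycle - depth + 1)


def steady_state_rate(depth, window=1000):
    """Completions per cycle, measured over `window` cycles after the line is full."""
    start = depth  # by cycle `depth`, every stage holds an item
    done = finished_by(start + window, depth) - finished_by(start, depth)
    return done / window


if __name__ == "__main__":
    for depth in (21, 12):
        print(depth, "stages:", steady_state_rate(depth), "per cycle,",
              finished_by(100, depth), "finished after 100 cycles")
```

Under these assumptions the 21-stage and 12-stage lines differ only in the one-time delay before the first item comes out. After that they finish items at the same rate, but only when both pipelines are full.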
But notice the qualifier in the assembly-line comparison: "when both pipelines are full." Every time the processor switches threads or mispredicts a branch, it has to flush its pipeline and refill it. And sometimes, instructions get stalled in the pipeline for multiple cycles, leaving the downstream stages idle, with nothing to do.

These flushes and stalls are a key reason why, at a given clockspeed, it's better to have a shorter pipeline than a longer one: the shorter pipeline takes less time to refill after a flush, and it's quicker to begin completing instructions again after a stall.

At 3.8GHz, the Core i5's shorter pipeline, which does more work per stage, can beat the pants off the Pentium 4's longer pipeline, because anytime the pipeline goes even partially empty the Core i5 can refill it and recover much faster. The end result is that, on average, the Core i5's pipeline stays full longer than the Pentium 4's, which makes the Core i5 faster.

Of course, a shorter pipeline isn't the only reason the Core i5's pipeline finds itself full more often than the Pentium 4's. Other reasons include a superior branch predictor, which keeps performance-killing mispredicts to a minimum, and larger caches, which keep the Core i5's pipeline fed with readily accessible instructions.

Summing it all up

When comparing CPU clockspeeds within the exact same processor family, clockspeed is a good guide to performance because a higher clockspeed means more instructions are completed per second. But when comparing the CPU clockspeeds of different processor designs, it's generally apples-to-oranges.

For two CPUs with the same clockspeed, the one with the shorter pipeline has an advantage because its pipeline stays full more often. For two CPUs with the same pipeline depth but different clockspeeds, the higher-clocked CPU has the advantage. For two CPUs with different clockspeeds, it depends on the pipeline depth and other factors.
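As a rough illustration of the flush argument in the article above, here is a toy model in Python. It assumes that a full pipeline completes one instruction per cycle and that every flush costs `depth - 1` idle cycles to refill. The 1% flush rate and the 12-stage depth (borrowed from the assembly-line comparison) are hypothetical numbers rather than measurements, and the model ignores branch predictors, caches, and everything else that differs between real CPUs.

```python
def effective_ipc(depth, flush_rate):
    """Average instructions completed per cycle in the toy model.

    depth      -- number of pipeline stages
    flush_rate -- flushes per instruction (hypothetical, e.g. 0.01)
    Assumption: each flush wastes `depth - 1` cycles while the pipeline refills.
    """
    cycles_per_instruction = 1 + flush_rate * (depth - 1)
    return 1 / cycles_per_instruction


def instructions_per_second(clock_hz, depth, flush_rate):
    """Clock rate scaled by how full the pipeline stays on average."""
    return clock_hz * effective_ipc(depth, flush_rate)


if __name__ == "__main__":
    clock = 3.8e9  # same clockspeed for both hypothetical designs
    for depth in (21, 12):
        rate = instructions_per_second(clock, depth, flush_rate=0.01)
        print(f"{depth}-stage pipeline: {rate / 1e9:.2f} billion instructions/s")
```

With no flushes at all, both depths come out identical in this model; the gap opens only as the flush rate rises, which is the point of the "same clockspeed, shorter pipeline" case in the summary.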
shought Posted May 3, 2011

Very interesting topic :) I knew that you couldn't just compare clock speeds, but this really explains it.
majithia23 Posted May 3, 2011

@shought: Interesting indeed. At least you knew the fact; I, like most others (I think), never knew this :P I just knew the cliché, as it goes: faster = better, slower = bad. But not now :rolleyes:
toyo Posted May 4, 2011

tztztz... young users... obviously you didn't go through the whole Pentium IV fiasco. :lol:
Bizarre™ Posted May 4, 2011

CPU speed is not only determined by frequency, but also by CPU architecture and PC peripherals. As for me, I only buy a new CPU when another generation of CPU is released. It's much cheaper this way.
majithia23 Posted May 4, 2011

@toyo and Biz: that's experience and wisdom talking :) (PS: now that Sandy Bridge has arrived, the previous i7s should be quite reasonably priced, and with the launch of AMD's Bulldozer, the Phenom range should be going for peanuts :P)
Bizarre™ Posted May 4, 2011

@majithia23: Indeed. Although in my case, it'll be a while before I make another purchase since my current CPU works just fine.
Archived
This topic is now archived and is closed to further replies.