Saturday, January 15, 2011

Let's bake power states!

So we now know what benefits undervolting might get us and how to do it with our precious Core or Core 2 CPU. We also know that the operating voltage changes all the time, as does the clock rate, and I hope it needs no further explanation why constantly adjusting voltage and clock is a good thing.
So how should this mechanism behave so that it saves power without slowing your system down?

The short answer is that you will likely never know the precise and correct answer, but there are a few things to consider if you want to avoid the pitfalls lots of people fall into. Out of the box, the power states and the rules for their transitions are very well done. Chances are that whatever you define yourself is worse; the question is just by how much.
Basically you want to raise the clock speed whenever the power consumption of the system as a whole, integrated over time, is smaller with a faster CPU that draws more power. This is obviously the case when your task is CPU bound, e.g. video rendering. Besides the CPU you also have the power draw of your spinning hard disk, the chipset, the memory, maybe your screen, the NICs and whatever else may be running. Let's say all that takes 15W and your CPU takes another 10W running at half speed, and the job takes x minutes to finish. Running the CPU at full speed, it uses 15W and the job takes roughly x/2 minutes. So in one case we have 25W for x minutes and in the other case 30W for x/2 minutes. That gives us 25*x watt-minutes versus 30*x/2 = 15*x watt-minutes. Yes, this is very trivial, but a lot of people do not understand it without a hammer.
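Spelled out as a minimal Python sketch, using the example numbers above; the 60-minute job length is just an assumed value to make the figures concrete:

```python
# A minimal sketch of the energy comparison above. The 15 W baseline and
# the 10 W / 15 W CPU figures are the example numbers from the text;
# the 60-minute job length is an assumption for illustration.

BASELINE_W = 15.0          # disk, chipset, RAM, screen, NICs, ...
CPU_HALF_SPEED_W = 10.0    # CPU power at half clock
CPU_FULL_SPEED_W = 15.0    # CPU power at full clock
JOB_MINUTES_HALF = 60.0    # assumed job length at half speed (x)

def energy_wmin(system_watts, minutes):
    """Energy in watt-minutes: power integrated over time."""
    return system_watts * minutes

half = energy_wmin(BASELINE_W + CPU_HALF_SPEED_W, JOB_MINUTES_HALF)
full = energy_wmin(BASELINE_W + CPU_FULL_SPEED_W, JOB_MINUTES_HALF / 2)

print(f"half speed: {half:.0f} Wmin")   # 25 W * 60 min = 1500 Wmin
print(f"full speed: {full:.0f} Wmin")   # 30 W * 30 min =  900 Wmin
```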
If you think about the numbers used for the CPU's power draw in this example, you have to conclude that the CPU runs far less efficiently at the lower clock rate. And this is actually true for most undervolted systems that are not overclocked. If you plot the minimal power consumption for stable operation against processing power, you will see that what you get is not a straight line, but something curvy.
It will roughly look like this, with power consumption on the y-axis and processing power on the x-axis, both on linear scales. The left end represents a clock rate of about 0Hz, the right end something severely overclocked. The y-axis does not start at 0, because even with basically no clock at all the CPU will draw some power and need a certain minimum voltage to function.
The bottom line is that for every CPU there is a point beyond which you need a lot more voltage, and hence power, to make it work faster. Go look at those overclocking freaks who reach 5GHz with some i7 and see what crazy voltages and cooling setups they use. Some of those systems have the CPU draw three times the stock power or more. In terms of efficiency that is horrible, as in some cases they are not even running twice as fast.
Apart from the point where the CPU rather suddenly needs a lot more voltage to operate faster, there is also a point below which lowering the clock speed will not really reduce the voltage necessary for it to perform stably. Between those points, which are really more intervals than points, lies the point of maximum efficiency, where you get good speed for cheap power.
Although the diagram for each CPU will look very roughly like the one above, where your stock speed sits on it can barely be foreseen. And keep in mind that stock speed does not usually coincide with minimal power consumption, as the stock voltage may be way more than the CPU really needs. So you really have to figure out where on this curve you are with your own CPU. The one I talked about in the last post definitely has its stock speed in the left half of the diagram: between the lowest and the highest clock there is only a tenth of a volt more needed to keep it running stable! Keep in mind, though, that P=U^2/R does not apply as a rule of thumb when you compare different clock speeds. The higher the clock rate, the more the switching capacitance of the CPU comes into play and wastes power; roughly speaking, the dynamic power scales with C*U^2*f rather than with U^2 alone.
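To get a feel for why the curve bends, here is a toy model in Python. The voltage/frequency pairs, the static power and the constant K are pure assumptions picked to make the shape visible, not data from any real CPU; it prints the total power and a crude "speed per watt" figure for each operating point, and with these numbers the efficiency peaks somewhere between the two knees:

```python
# A toy model of the power-vs-performance curve, purely for illustration.
# The (frequency, voltage) pairs below are made up, not measured values.

# Hypothetical stable (frequency in GHz, core voltage in V) operating points:
# the voltage flattens at the low end and rises steeply at the high end.
points = [
    (0.8, 0.90), (1.2, 0.925), (1.6, 0.95), (2.0, 1.00),
    (2.4, 1.10), (2.8, 1.25), (3.2, 1.45),
]

STATIC_W = 5.0   # assumed static/leakage power in watts
K = 4.0          # assumed constant lumping switching capacitance and activity

for freq_ghz, volts in points:
    dynamic_w = K * volts**2 * freq_ghz      # P_dyn ~ C * U^2 * f
    total_w = STATIC_W + dynamic_w
    efficiency = freq_ghz / total_w          # crude "speed per watt"
    print(f"{freq_ghz:.1f} GHz @ {volts:.2f} V: "
          f"{total_w:5.1f} W, {efficiency:.3f} GHz/W")
```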

Depending on how steep the required voltage increase is for your CPU, you should make it raise the clock speed either more or less aggressively. But in general you do want it to clock up quite soon. For one thing, this makes the system more responsive and avoids possible performance impacts on other hardware that might have to wait for the CPU to finish its work; for another, the highest efficiency is typically close to stock speed. There is another effect coming into play that you might call "the race to sleep": when the CPU has done its work, it very quickly enters a sleep state where the power consumption drops to something damn low. And it is more efficient for it to work fast and sleep more than to work slower and sleep less.
If you have full load at half speed, you need e.g. 10W. If you go full speed, the load drops to roughly half, so you draw 15W during the working half of the time and let's say 1W during sleep. That gives you 15/2 + 1/2 W, an 8W average. Yes, we leave out a lot of details here, assuming the load drops exactly to 50% and ignoring the time it takes to enter the sleep state, but the direction is what matters.
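The same back-of-the-envelope calculation as a quick Python sketch, again with the example numbers from the text and the simplifying 50% duty cycle assumed there:

```python
# A minimal sketch of the "race to sleep" arithmetic above. The 10 W,
# 15 W and 1 W figures are the example numbers from the text; the 50%
# busy fraction at full speed is the simplifying assumption made there.

CPU_HALF_SPEED_W = 10.0   # full load at half speed
CPU_FULL_SPEED_W = 15.0   # full speed while actually working
CPU_SLEEP_W = 1.0         # deep sleep state
BUSY_FRACTION = 0.5       # at full speed the work fits into half the time

always_busy_avg = CPU_HALF_SPEED_W
race_to_sleep_avg = (BUSY_FRACTION * CPU_FULL_SPEED_W
                     + (1 - BUSY_FRACTION) * CPU_SLEEP_W)

print(f"half speed, always busy: {always_busy_avg:.1f} W average")
print(f"full speed, then sleep:  {race_to_sleep_avg:.1f} W average")  # 8.0 W
```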

So you now have an idea of some of the mechanisms you should consider when deciding how your power state transitions should be defined, and you will likely not make the same mistake as a lot of those kids around the interwebs who force their undervolted CPUs to perform worse and less efficiently than they could. And if you think these directions are very vague, you are absolutely right. There is so much going on in the details that your setup will not be ideal most of the time, no matter what parameters you choose. And again, the most benefit comes from choosing the right general direction of behaviour, not from the last tenth of a volt or the 5% of load you alter. Maybe I will feel like writing something about stability testing later, as that is the key to really using undervolting and not just playing with it.
