Well, it's actually much more complicated than what I typed and difficult to understand without knowing about Fourier Series and why it's used.
When a signal gets clipped, higher-order harmonics begin to show up alongside the main signal. These harmonics sit at integer multiples of the original signal's frequency. That means if your original signal was a 50 Hz sine wave, the harmonics would fall at 100 Hz, 150 Hz, 200 Hz, 250 Hz, etc. (if the clipping is perfectly symmetric, it's mainly the odd ones, 150 Hz, 250 Hz and so on, that show up). The harder you clip the signal, the stronger these harmonics become. Their amplitude (the y-axis on the graph, measured in dB) corresponds to the power that each harmonic adds to the original signal. So once you clip the 50 Hz sine wave, you're actually adding more power to the signal due to the harmonics at the other frequencies. You can hear these harmonics when you listen to the clipped signal. They give the original signal a "crunchy" sound that we call audible distortion. This is very similar to the distortion knob on a guitar and how it adds crunch to the sound.
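If you want to see those harmonics appear for yourself, here is a rough Python sketch. It assumes an idealized hard clip of a unit-amplitude sine, and the helper names (`clipped_sine`, `harmonic_amplitude`), the 0.5 clip level, and the 1000 samples per period are all just choices made for this illustration. It measures each harmonic by correlating one period of the wave against a sine and cosine at that multiple of the fundamental.

```python
import math

def clipped_sine(n, clip_level):
    """One period of a unit-amplitude sine, hard-clipped at +/- clip_level."""
    return [max(-clip_level, min(clip_level, math.sin(2 * math.pi * i / n)))
            for i in range(n)]

def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic, given exactly one period of samples."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

FUNDAMENTAL_HZ = 50   # the 50 Hz example from the post
N = 1000              # samples per period (arbitrary choice for this sketch)

# Clip level 1.0 leaves the sine untouched; 0.5 chops off the top half.
for label, level in (("clean", 1.0), ("clipped", 0.5)):
    wave = clipped_sine(N, level)
    print(label)
    for k in range(1, 8):
        print(f"  {k * FUNDAMENTAL_HZ:4d} Hz: {harmonic_amplitude(wave, k):.4f}")
```

The clean wave should show essentially nothing except at 50 Hz, while the clipped one picks up content at the odd multiples (150 Hz, 250 Hz, 350 Hz). The even multiples stay near zero here only because this clip is perfectly symmetric; a real amp that clips unevenly would show them too.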
Now to understand THD more, think about these harmonics each adding a small amount of power on top of the original signal (the fundamental). If you add up all the power in the harmonics (excluding the fundamental) and divide that by the power of the fundamental, you get a ratio that is usually quoted as a percentage. This is your THD. (Spec sheets usually quote the square root of that power ratio, i.e. a ratio of RMS voltages, but the idea is the same.) The lower the amplitude of the harmonics, the lower your THD will be.
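To put numbers on that, here is a small Python sketch of the arithmetic. The harmonic amplitudes in `example` are made up purely for illustration (they are not measurements of any real amp), and the function names are hypothetical. It assumes every harmonic drives the same load, so power is proportional to amplitude squared. It prints both the power ratio described above and its square root, which is the voltage-ratio form you will more often see on spec sheets.

```python
import math

def thd_power_ratio(amplitudes):
    """Sum of harmonic powers divided by the power of the fundamental.

    amplitudes[0] is the fundamental; the rest are the 2nd, 3rd, ... harmonics.
    Power is taken as amplitude squared (same load for every component).
    """
    fundamental, *harmonics = amplitudes
    return sum(a * a for a in harmonics) / (fundamental * fundamental)

def thd_amplitude_ratio(amplitudes):
    """Square root of the power ratio, i.e. a ratio of RMS voltages."""
    return math.sqrt(thd_power_ratio(amplitudes))

# Made-up example: fundamental at 1.0, 3rd harmonic at 0.1, 5th at 0.05.
example = [1.0, 0.0, 0.1, 0.0, 0.05]

print(f"THD (power ratio):     {100 * thd_power_ratio(example):.2f}%")
print(f"THD (amplitude ratio): {100 * thd_amplitude_ratio(example):.2f}%")
```

With these made-up values the power ratio comes out to about 1.25% and the amplitude ratio to about 11.2%, so it is worth checking which convention a given THD figure uses before comparing numbers.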
The harmonic amplitudes can be calculated using a Fourier series, but doing that by hand is pretty complicated and laborious.
Did that explain things any better?