I'm busy working on my blog posts. Watch this space!
Have you got the X factor? (part 1)
August 23, 2018
OK ... nothing to do with the TV show, it's a not quite so subtle pun on a topic all about factors and factorising integers. It's a very basic "factorising 101" post which we will build upon in the days to come until you are a factorising champion.
One of the classic examples of things in mathematics which are relatively much harder to "do" than they are to "check" is that of factorising large integers. Indeed, the topic is of huge interest since a lot of the cryptographic algorithms used to encrypt digital data depend on this very fact. As of the time of writing, there is no lightning-fast method of factorising large integers and anybody who can devise such a method would surely be in for a lot of fame, and possibly fortune as well!
By way of example, suppose I asked you which two numbers multiplied together give you 309709. That's quite hard for you to do with just a pen and paper, but if I asked you check if the answer 317 X 977 was correct (spoiler: it is) then it would not take you very long at all. Now, for a computer, breaking down 309709 into its prime factors can be done in the blink of an eye. But make the starting number huge, for example :
.. and you will see that even for a computer it is suddenly not so easy. The answer, for the curious is (I didn't personally work it out I just copied it and trusted the RSA source) 37975227936943673922808872755445627854565536638199 multiplied by 40094690950920881030683735292761468389214899724061 both of which are prime numbers (I took their word on this one)
In practice, even this number may be considered somewhat on the "easy" side and a much bigger number may be chosen to give the encryption more strength. Wikipedia has an excellent article on the topic - look for the article on "RSA numbers". The large RSA numbers have not been factorised by anyone except their original creator (someone working for RSA Security LLC) even though some of them carry large cash prizes.
Before we go further, it's time to bust a little myth concerning the factorising of integers and their alleged difficulty. In fact, factorising integers, generally speaking, is incredibly easy!
Seriously? If that's the case, then why the big fuss over internet security? Well, think of it like this: half the (positive) integers are even numbers. One in three of them is divisible by 3. One in five of them is divisible by 5. If you take the first thousand integers and apply this thinking, you can calculate how many of them are divisible by 2, 3 or 5. There's a bit of double-counting to take care of (for example, the number 30 is divisible by 2, 3 and 5) but for anyone who is familiar with working with sets, I'll just leave this here as a hint ...
and for those of you who just want to take my word for it, out of the first 1000 integers, 734 of them - ie nearly three quarters! - are divisible straight away by 2, 3 or 5. (And most of the rest of them are primes, and checking if a number is prime is not quite so hard as it sounds, I'll cover that in a later article)
Hence we arrive at a critical distinction : factorising integers generated randomly is generally an easy process. On the other hand, factorising numbers which have been deliberately chosen to be difficult (e.g. by multiplying together two huge primes) is a process which can be very, very tough indeed.
Okay, so if we are faced with a large integer and have the task of factorising it, how should we go about it? One obvious technique which leaps out at you straight away is by trial division. Start by dividing the number by 2 and see if the answer is a whole number without any fractional part. If that fails, try dividing by 3, then by 4, then by 5 and so on. If you get all the way up to the integer itself without hitting lucky, then the original number must be a prime since no smaller number will divide into it.
I'm sure many of you will have spotted a couple of ways we can improve this algorithm immediately. Firstly, we only need go up as far as the square root of the original number (since a number cannot be the product of two numbers both of which are higher than the square root) and secondly, we only need to do trial division by the primes, for example if a number is not divisible by 2 or by 3 then we already know it is pointless trying to see if the number is divisible by 6.
Due to the modern, practical use in digital cryptography, a lot of development in the field of factorising integers has come within most recent decades, however this was an issue which was very much on the minds of very much earlier mathematicians. For example, we know that Fermat was wrestling with the problem in the 1600s and that Euler was wrestling with it in the 1700s. Both of them came up with some incredibly ingenious ways of factorising integers, which, while neither of them necessarily blisteringly fast, can be a massive improvement on the "trial division" method described above. Let's take a look at Fermat's method:
Fermat realised that if a number N is the product of two integers c and d, then he could rearrange the formula N = cd into
In particular, if the number we are wanting to factorise is an odd number (all numbers we are interested in factorising are odd numbers, since finding the first factor of an even number is easy - think about it!) then it means that both factors c and d will also be odd numbers.
This in turn means that (c+d) and (c-d) will both be even numbers which then in turn means that (c+d)/2 and (c-d)/2 will both be integers. In other words, every odd integer can be represented as the difference between two perfect squares.
So, let's say we have some N which we want to factorise. We know it's the difference between two squares (let's call them a and b) so
N = a^2 - b^2 (where ^2 is the symbol for "to the power of 2")
At first glance, this doesn't look like this has got us any further forward at all and Fermat is just wasting both his and our time! But here's how we can proceed:
We will just rearrange the above equation slightly into: a^2 - N = b^2
and then do this...
1. Start with a value of a which is the next whole number up from the square root of N
2. Square it and then subtract N
3. See if the resulting answer is a perfect square.
4. If it is - we've found a and b so the two factors of N are (a+b) and (a-b)
5. If not, increase the value of a by one and go back to stage 2 and repeat as many times as necessary.
Is that going to be any quicker? Let's see how that works by finding the factors of 39913. The square root of this number is 199.78... so let's kick off from the next biggest integer which is 200
200^2 - 39913 = 87 = not a square
201^2 - 39913 = 488 = not a square
202^2 - 39913 = 891 = not a square
203^2 - 39913 = 1296 = 36 squared <-- found it
So 39913 = 2032 - 362 from which we can quickly calculate that 39913 = 167 X 239
The really, really surprising thing about the whole process is that we found the factors of a five digit number in just four steps!
Some algorithms basically go about the same thing but in a slightly different way by rearranging the original equation into N + b^2 = a^2
This slightly modified algorithm is then to start at b = 1, square it and add it to N and see if the result is a square. In this case, it would take us 6 steps to get to
39913 + 6^2 = 203^2 from which we deduce the factors in the same way.
It may seem a relatively difficult process to decide whether or not any given integer is a perfect square or not, but there are a few nifty little tricks to speed up the process. For example, if you look at the last two digits of any integer, there are only 22 combinations of the last 2 digits which could potentially make the number a perfect square.
Just time before we go for now to bust a second myth about factorising integers : that in order to make the process of factorising as long as possible you should pick two numbers close to the square root. That is very certainly the case if the person trying to factor the integer is using the method of trial division starting from 2 and working upwards. However, if the person is using Fermat's factorisation method above, then the reverse is true - Fermat's method actually comes to a solution quicker when the two factors are clustered near to the square root!
That's enough for now. In the next article we are going to move up out of first gear and onto more advanced factorisation methods. So, thanks for reading and watch this space!