Okay, so as you've probably guessed, this is a talk about cryptography. I warn you up front, it's quite theory-heavy, with not a great deal of PHP code, and I am aware it's 4pm on a Friday, so please try and stay in the room.

So, by way of introduction: what is it that we're talking about when we talk about cryptography? Modern cryptography covers three quite important areas. The first, and probably the one everyone thinks of first, is message privacy. That means ensuring that any
communications between two parties can only be read by the sender and the intended recipient. Another facet of modern cryptography is being able to verify a message: ensuring that the message you've received is the message that was actually sent, and that it hasn't been tampered with between you and the sender. The final main area is identity verification: ensuring that the message you've received did actually come from the person who claims to have sent
it, and that it's not a fake message from someone who's trying to play a joke on you. So that's the third area.

Now, if you've been to any other talks on application security, maybe at this conference or another one, you've most likely seen this; I think this one is from last year or two years ago. It looks a bit like this. Cryptography is really quite a hard thing to get right, especially if you start diving into designing your own algorithms and things like that. So
the main purpose of this talk is to take a brief journey through the evolution of cryptography, from the very beginning all the way up to the algorithms we're using today, and to give you a bit of an appreciation of quite why it's so hard and why it's important to get it right. I'm going to start with some historic ciphers, which you can actually do with pen and paper if you want to, and then I'm going to move on to some of the inner workings of
things like the AES and RSA cryptosystems, which are in use today.

So, first, on to the historic ciphers. You've probably all heard of the first one: the Caesar shift. It's probably about the simplest cipher you could think up. The basic idea is that you take a message, and in order to encrypt it you shift the characters in the message up or down the alphabet by a fixed amount. That looks a bit like this. You can see we've got
an alphabet at the top, and in order to apply the Caesar shift to a message we shift each letter, so an A becomes a D, and so on and so forth. It would be a very short talk if this were the state of the art in modern cryptography. So what's actually wrong with this cipher? Why should you not use it? Well, it turns out that for a given alphabet there's a very small number of possible keys. You've only got 26 letters in the
English alphabet, and that gives you 25 possible different shifts that you can use to encode a message with the Caesar shift. If you use a shift of 26, it just encodes the message back to itself, which isn't very useful either. Even if you were to take binary ASCII and apply a Caesar shift, you're still only looking at 255 different possible shifts. That means anyone who wanted to read a message you'd encoded using the Caesar shift could simply write a little
script, try every possible shift, and see which one produced a message that made sense. It would be really easy for them to quickly run through and decode your message. This is quite important to us because it illustrates a key aspect of a strong cipher: it has to have a large number of possible keys, to prevent an attacker from just iterating through them all to decode your message.

So, moving on to another straightforward cipher, something of an evolutionary step up: the
substitution cipher. This works in a similar way, but instead of just moving letters up the alphabet, we randomly shuffle them around, so you swap each letter in your plaintext for a different random letter. It looks a bit like this. Here I've picked an encoding which changes A to Z, B to N, C to Q, et cetera, and you do that for the entire alphabet. How does it hold up? This one is actually
significantly better than the Caesar shift algorithm. Using just English letters for your substitution, there are 403 septillion, 291 sextillion, 461 quintillion, 126 quadrillion, 605 trillion, 635 billion, 584 million possible different keys. It's quite a big number. With such a large number of keys, you'd think this is a really strong cipher, right? Why aren't we using a substitution cipher to store people's bank details? Well, it turns out the weakness of the substitution cipher doesn't come from a low
number of keys. Its security is equivalent to about an 88-bit key. That's obviously not as strong as some of the modern ciphers that use 128 or 256 bits, but it's still pretty strong, and it's only just about within our computational capabilities to brute-force a key of that size. Instead, the substitution cipher falls foul of probably the greatest nemesis of cryptographic algorithms, which is statistics. Due to the simplicity of the cipher, it fails to hide any underlying patterns in the data that you've encrypted with it.
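The key-space figures quoted so far are easy to check. The following is a minimal Python sketch, not code from the talk: the function names `caesar_shift` and `brute_force` and the sample message are made up for illustration. It brute-forces a Caesar shift by trying all 25 keys, then counts the possible keys of a 26-letter substitution cipher.

```python
import math
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text, shift):
    """Shift each letter A-Z by `shift` places, wrapping around.

    Characters outside A-Z (such as spaces) are left unchanged.
    """
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


def brute_force(ciphertext):
    """Return the candidate plaintext for each of the 25 non-trivial shifts."""
    return {shift: caesar_shift(ciphertext, -shift) for shift in range(1, 26)}


# Hypothetical message, encrypted with a shift of 3 (so A becomes D).
ciphertext = caesar_shift("MEET ME AT NOON", 3)
candidates = brute_force(ciphertext)

# A substitution cipher's key is any ordering of the 26 letters: 26! of them.
substitution_keys = math.factorial(26)
key_bits = math.log2(substitution_keys)
```

Printing `candidates` gives 25 lines, only one of which reads as English, which is all an attacker needs. `key_bits` comes out at a little over 88, which is where the 88-bit figure comes from.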
Since those patterns survive, if you've got a message someone's sent and you want to recover the original plaintext, you just need to look for them. If the plaintext is English, this is the graph of the various letter frequencies in the English language. What you can do is count up all the letters in a ciphertext that you've intercepted and plot them on a graph like this, and then you can take a bit of
a guess and say: well, the letter that occurs most commonly, that's probably the letter E, because that's the most common letter in the English language. Once you've done that, you can maybe get a few others: there's a peak around S and T, and some of the other vowels, A and O. You can start to decipher bits, and once you've got little bits you can guess at words and say, well, that looks like the word "the", which is one of the most
common words. That gives you a couple more letters, and bit by bit you can piece together the full text of the message. It takes a little bit of work, but you can automate it with a script quite easily, and you can even decode this on paper, so it's not really very strong.

The substitution cipher is quite an old cipher, and during the 1500s and 1600s people sought to improve upon it, because they realized that this
wasn't hiding the patterns so well, and they came up with a cipher called the Vigenère cipher. This is one of a group of ciphers known as polyalphabetic ciphers. They're so called because, instead of using just one possible encoding for each letter in your plaintext, they use several different encodings, which helps better disguise some of the underlying patterns in the plaintext. How this works is that, in order to encode a message, you need to first of
all pick a key. In this simple example I've picked the key "KEY". It's probably not a good idea to use that as your actual key, but it works quite nicely as an example. What we do is, using the letters here, set A to be equal to the first letter of the key, which gives us a Caesar shift for that letter. We do the same for E and the same for Y, and then we've got several different shifts that we use one after the other. That looks a little bit like
this. We've got a message that we want to hide. It's a pretty secret message; you probably don't want people knowing that you're about to blow something up. You take the key and encode the first letter of your message using the first Caesar shift, the one represented by the letter K, the second with E, and the next with Y. Once you run out of letters in your key, you repeat it. It looks like that: we continue reusing the key, and we get out the ciphertext at the bottom.
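The procedure just described can be sketched in a few lines of Python. This is an illustrative sketch, not code from the talk: the function name `vigenere` is made up, the message is a hypothetical stand-in for the one on the slide, and the input is assumed to contain only the capital letters A to Z. Only the key "KEY" comes from the example above.

```python
import string

ALPHABET = string.ascii_uppercase


def vigenere(text, key, decrypt=False):
    """Apply the Caesar shift named by each key letter in turn.

    The key is repeated once its letters run out. Both `text` and `key`
    are assumed to consist of the letters A-Z only.
    """
    out = []
    for i, ch in enumerate(text):
        # Key letter A means shift 0, B means shift 1, ..., K means shift 10.
        shift = ALPHABET.index(key[i % len(key)])
        if decrypt:
            shift = -shift
        out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
    return "".join(out)


# Hypothetical message, encoded with the key from the example.
ciphertext = vigenere("ATTACKATDAWN", "KEY")
plaintext = vigenere(ciphertext, "KEY", decrypt=True)
```

Here `ciphertext` is `KXRKGIKXBKAL`. Note that the four As in the message all fall under the key letter K and so all encode to K, while the three Ts encode to X, R, and X: the same plaintext letter only encodes differently when it lines up with a different key letter.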
So, does anyone want to guess: is this cipher secure? Nope, absolutely not. It did take a while longer to break, and the break is credited to Charles Babbage, who's quite a well-known historic figure. However, it wasn't until 1985 that this was actually recognized, because the British government kept it a bit secret; they didn't want anyone to know that they'd broken it. Until then someone else, Friedrich Kasiski, had been credited with the discovery. He discovered it a bit later, and
it's him that the technique is actually named after. Unlike with the simple substitution cipher, you can't directly use frequency analysis on a ciphertext encoded with this. If you were to do frequency analysis, you'd probably see a fairly flat graph: provided the key is long enough, you'll find that the letter frequencies are fairly similar to each other, and there's nothing you can pick out easily. However, even though you can't do frequency analysis because there are multiple encodings, what you can do is
start looking for repeated sequences of letters in the ciphertext. What these repeated sequences tell you is that that's possibly where the key has repeated. You can count the distance between the repetitions, and here you'll notice that they correspond to the same word being encoded 12 letters apart. This suggests that 12 is a multiple of our key length, so the keyword that was possibly used to encode this text
could be 2, 3, 4, 6, or even 12 characters long, and that helps us narrow down what the key used to encode it might be. What you can then do (and I'm cheating a bit, because I know how long the key is) is take a guess that maybe the key is three letters long and split the message into three groups, taking the first letter, the fourth letter, et cetera into the first group; the second letter, the fifth letter, et cetera into the second group; and the
third letter and so on into the third group. You can then perform frequency analysis on those subgroups, and each should produce a graph that looks similar to the one a few slides back. Once you've done that, you can take the graph you're looking at and try to line up the most frequent letter with E, and that helps you work out what the key letter is at that position. Once you've done that for each group, you'll be able to recover the key and decode the message.

Now, when I
first gave this talk, I put together this little cipher challenge. It's a little bit more difficult than the Vigenère cipher, but the process is very similar, so if anybody fancies having a go at breaking one of these themselves, there's a link there. I've also got it at the end of the talk, so I'll put it back up if you don't manage to copy it down right now.

Let's move on to a different cipher. This one I would hope everyone in the room has heard of: Enigma. It was about another hundred years after
the Vigenère cipher before the Enigma cipher came around. World War II saw several attempts to mechanize cryptography: to move it from algorithms you could perform with pen and paper to using machines to perform cryptographic operations. The most famous attempt was the Germans' Enigma machine, although during the war all the major powers, including the Americans and the British, were using machines that worked in a fairly similar way to encode their text. The American and British ones were a little bit more secure
than Enigma, but they used the same underlying principles. An Enigma machine looked a bit like this. It was made up of four parts. There was a keyboard, where a typist would type the message in. There was a set of rotors at the back, each of which acted as its own substitution cipher. There were commonly three rotors in a machine, but some machines featured as many as eight rotors that could be interchanged. The military machines that the Germans used had five different
rotors, which they could use in any combination, so they could use rotors one, two, three one day and three, four, five the next. Later in the war they expanded this to a set of eight rotors in total, which provided a little more security. The final component, at the front here, was a plugboard. This switched around pairs of letters with wires. You could use up to 13 different swaps, but usually only about 10 of them were used. This actually was
one of the features that gave the machine quite a lot of its security: the ability to swap these letters.

This is how it looks inside. It's an electrical machine, so when a typist presses a key on the front of an Enigma machine, an electric current completes a circuit which goes through the plugboard and through each one of the rotors, getting mangled up along the way. At the back of the machine there's a reflector, which is
literally just a load of wires that connect the circuit back through each of the three rotors and back through the plugboard, and finally it illuminates a lamp on the top of the machine. The radio operator would send that letter, and then the rotors would move around, so when you pressed the next key the current took a different path through the machine and came out as a different letter. It stepped the first rotor one position at a time, and once that had gone all the way
around, that stepped the next one; once that had gone all the way around, it stepped the next one, and so on and so forth. That meant each letter could be encoded in a vast variety of different ways each time you used it. Changing the mappings this way meant that the mapping between plaintext and ciphertext was constantly changing, which makes it really difficult to do any sort of frequency analysis on the letters, and it had the
cryptanalysts at Bletchley Park stumped for quite a while on how to actually analyze these messages.

Enigma is quite famous, but one of the things it's famous for is having been broken. A team of British cryptographers at Bletchley Park, led by Alan Turing, who's also quite a famous guy, I've heard, designed machines to help them break the mechanical ciphers. The breakthroughs the team made weren't particularly based on weaknesses in the cipher algorithm itself, but mostly on operational errors made by
the Germans. Examples of those errors included choosing bad keys, like AAA as the initial setting for the rotors, which made messages fairly trivial to decode, and having predictable message structures. For example, the first message most German military units would send in the morning was a weather report, which would contain the German word for weather. This was one of the weaknesses that the device designed by Alan Turing seized upon: it looked for the word "Wetter" in the decrypted
text, stepping through all the possible combinations of rotor and plugboard settings to try to find it. They had rooms full of these machines that Alan Turing and his team designed. Whenever they got a message in first thing in the morning, one of the machine operators would rush in and put the message into the machine, it would whirr away for a couple of hours, and it would find the key. Because the Germans only changed the keys once a day, once they'd found the key for
the day they could decrypt all the messages the Germans were sending that day. That gave the British and their allies quite a big advantage in the war, because they knew exactly what the Germans were up to, so this break was quite a big thing.

We're going to leave the historic ciphers there. We've gone all the way up to about halfway through the last century with those machines, and now we're going to take a
look at some of the algorithms that are actually useful to us today for storing data securely. Modern-day cryptography, as I said at the beginning, can be broken down into several different problems that we need to solve to communicate securely. The first of those is confidentiality: we need to ensure that people other than the intended recipient can't read our messages. There's a wide variety of algorithms that people have built for doing this, but the one you probably want to be using at
the moment is AES, the Advanced Encryption Standard. Most of these algorithms are symmetric; that is, they use the same key for both encryption and decryption. There are two main classes within that: stream ciphers, which work on continuous streams of data, and block ciphers, which break the message up into separate blocks and encrypt each one separately. I'm going to show an example of each in a few minutes.

Another thing that we need to solve is key exchange. If I want to communicate with you securely, we both need to have a key that we can
use to encrypt messages. One of the ways we do this is with asymmetric ciphers. These are ciphers that use a different key for encrypting than for decrypting. You can give me the encryption key fairly safely; I can use it to encrypt a message to you, but I can't subsequently decrypt any messages that are encrypted with that key. Only you can, with the other part of the key. So that's quite an important thing that we can
do.

Another thing we can do is verify the identity of a sender. It works in a similar way to key exchange: you sign a message using a private key that you keep secret, and you publish a public key and say, this is my public key. Anyone who receives a message from you can use that public key to verify that it was signed with your private key. It's known as a message signature, and again I'm going to go into
detail on this later in this section.

Another thing we need to do is authenticate a message: make sure that it hasn't been tampered with. For this we generally use cryptographic hash functions, such as SHA-256. When you receive a message, you can compute the message hash and compare it to one that's been sent along with the message, perhaps signed with the private key of the sender. If it doesn't match, you can reject the message and say: someone's tampered with this, send it to me again, or whatever.
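As a rough illustration of that check, here is a minimal Python sketch using the standard library's `hmac` and `hashlib` modules. Note the swap: rather than a hash signed with the sender's private key, it uses HMAC-SHA-256, a hash that mixes in a secret key shared by both parties. The key, the messages, and the function names `make_tag` and `verify` are all hypothetical.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA-256 tag to send along with the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag for the received message and compare the two.

    compare_digest does the comparison in constant time.
    """
    return hmac.compare_digest(make_tag(key, message), tag)


# Hypothetical shared key and message.
key = b"a secret key both parties share"
message = b"pay Alice 10 pounds"
tag = make_tag(key, message)

untouched_ok = verify(key, message, tag)
tampered_ok = verify(key, b"pay Mallory 10 pounds", tag)
```

Here `untouched_ok` is `True` and `tampered_ok` is `False`: without the key, someone who alters the message cannot produce a matching tag.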
You need to combine that hash with a secret key of some form; otherwise someone who tampered with the message could just recompute the hash.

A final thing that quite a lot of people maybe don't realize is part of modern cryptography is the ability to generate random numbers. A large number of secure protocols rely on being able to generate random numbers that are actually random. One example: when using a public-key cryptography system, you might generate a random key to use with a symmetric cipher, encrypt that key with the
public and private keys, and send the whole lot along. If someone can predict the number that came out of your random number generator, they can guess what that key was; they can forget about trying to break the algorithm and just decrypt the message. So being able to generate secure random numbers is really important.

Okay, so, symmetric ciphers. As I mentioned previously, there are two classes of symmetric cipher: block ciphers and stream ciphers. All of these algorithms are really only useful for dealing with message confidentiality. There's no symmetric algorithm
that you can really use for key exchange at the moment, so they don't solve that problem.

I'm going to start by looking at stream ciphers. How a stream cipher works is that it produces a constant stream of pseudo-random output bytes, and you use the secret key you're encrypting the message with as a seed to this generator. The bytes produced by the generator are then XORed with your plaintext to produce the ciphertext. You can then send that along, and the person
on the other end can produce the same pseudo-random stream of bytes and XOR it with the ciphertext to recover the plaintext. It's actually fairly similar to how Enigma works; in a way, stream ciphers evolved from the Enigma machine. There are several stream ciphers in use today. Probably the best known is RC4, which is used in WEP and SSL, but I've chosen a slightly different one, called A5/1. You've probably never heard of this algorithm.
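Before looking at a real design, the general stream cipher idea (a key-seeded generator whose output is XORed with the data) can be sketched in Python. This is a toy, assuming a made-up generator that hashes the key together with a running counter using SHA-256; it is neither RC4 nor A5/1, the names `keystream` and `xor_with_keystream` are illustrative, and it should not be used to protect real data.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy generator: hash the key with a running counter for pseudo-random bytes."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_with_keystream(key: bytes, data: bytes) -> bytes:
    """Encrypt plaintext or decrypt ciphertext: XOR is its own inverse."""
    stream = keystream(key, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


# Hypothetical key and message.
key = b"hypothetical secret key"
plaintext = b"stream ciphers XOR a keystream with the message"
ciphertext = xor_with_keystream(key, plaintext)
recovered = xor_with_keystream(key, ciphertext)
```

The same function serves both directions: the receiver, holding the same key, regenerates the same stream, and XORing it with the ciphertext yields the plaintext again.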
I can almost guarantee that everyone in this room is using A5/1, because it's used to protect voice and SMS data in mobile phones. This algorithm has actually been broken, and it's no longer considered secure. It was state of the art a few decades ago, and now it's not, but it's an interesting one to look at as a stream cipher example. How it works is a bit like this diagram here. It's got a big state machine in the middle, and
it consists of three registers, each of a different size. Each time you want a new bit out of your random number generator, it takes the top bit out of each of these registers and XORs them together, and that's the output bit. Once it's done that, each register may be shifted to the left, depending on this bit here. These are what are called clocking bits. It compares each one of the clocking bits in
the registers and says: okay, we'll take the majority. If they're all zeros, the majority is zero, and any registers that match the majority bit are clocked; any that don't match are left alone. So if you've got, say, one, zero, one, the majority bit is one, so the registers that match, the first and the third, get clocked and move one place to the left. To generate a new bit at the back here, the register takes the bits that are coloured in blue, XORs them together, and produces a new
bit on the end.

Let's take a look in more detail at how a register actually works when it clocks. This is the uppermost register from the previous diagram, and you can see it's got some values in it. We clock it once and everything shifts left; we XOR these bits and put the result back on the end. Next cycle, we do the same thing: XOR the bits and put the result back on the end. This generates a really long sequence of different bits, and you've got three of these, so you get quite a large
sequence of random data coming out of it.

Stream ciphers are quite useful, but there are a few things you need to keep in mind when using them. The first is that keys must not be reused. Because the cipher's output is combined with the plaintext using XOR, and the same key will always produce the same output bytes, if you encrypt two different messages with the same key, somebody can use those two messages to start recovering parts of your output stream, and therefore they can actually
decrypt your messages. To guard against this, a lot of stream ciphers include what's known as an initialization vector, or IV, which is combined in some way with the secret key. You send the IV along with your message, and the recipient uses the same algorithm to combine the key with your IV, which makes sure you're effectively using a different key for each message. WEP is actually vulnerable because of this: although it uses an IV in the initialization of the RC4 algorithm, the IV
that they picked was too small, which means that over time, if you're sending lots of Wi-Fi packets back and forth, not just the secret key but the IVs will eventually repeat. Once someone detects two messages using the same IV, they can use those messages to decode the output stream, and at that point they can recover the key for the network, connect to your Wi-Fi network, sniff your traffic, and so on, which is why we've phased out WEP in
Wi-Fi these days.

It's also fairly easy for an attacker to modify a message. Let's say you're downloading an HTML page from a website and it's encrypted with a stream cipher. Anyone who can guess that there's, say, a JavaScript file referenced in the header can compute the XOR of what they think is in the message and what they really want in the message, and XOR that into the ciphertext. That replaces the content in the ciphertext, and it'll decode to what they want it
to, rather than what was sent. This means that when you're using a stream cipher you need some sort of message authentication, such as a message hash, to prevent this tampering.

The next one isn't so much a security concern as a practical one. Most stream ciphers require that you run them all the way through to decrypt a message; you can't arbitrarily seek into a stream cipher. Some have been designed to allow this, but most have not. This means that if, say, you've encrypted a
huge 50-gig database backup using a stream cipher, and you need to recover one table's worth of data from that backup, and you know it's 20 gig of the way in, you're still going to have to decrypt the first 20 gig to get to that table. You can't just seek into it and decrypt the bit you need. So that's something to be aware of.

The next thing we'll look at is the block cipher. The key difference between a block cipher and a stream cipher is that,
whereas a stream cipher produces a pseudo-random stream which you use to encrypt, a block cipher works directly on a block of your plaintext and applies various mathematical transformations to it. The size of the block differs depending on the algorithm, but it's usually much shorter than any message you might want to send. AES, for example, uses 128-bit blocks, and older ciphers tend to use 64 bits. That's obviously a lot shorter than any message you're going to want to send, so you need to break up
your message into blocks and encrypt each one separately, and that's where the name "block cipher" comes from. Probably the most famous one you've heard of is AES, and that's probably the one you'll be using for a lot of your day-to-day encryption needs. AES was the result of a cryptography competition to find a replacement for an older encryption standard, the Data Encryption Standard, and it was eventually won by a slight variant of the Rijndael algorithm. You might know, if you've been using the mcrypt extension,
that there is a different flavour of AES that's just called Rijndael, and if you use the wrong constants with mcrypt you get that one instead, which is not quite what you want. That's a bit of a gotcha. Fortunately it's been deprecated, but if anyone's still using it, that's something to look out for.

So, how AES works: you start off with the message and you apply the AES algorithm to it repeatedly for a few cycles. The key length determines how
many cycles of AES you actually run through. Each round of AES consists of four distinct phases: substitute bytes, shift rows, mix columns, and then it adds a portion of the secret key to the data, and then it repeats the loop. Each of those steps is applied every time you go round the loop. Looking in a bit more detail at substitute bytes: AES works on a block size of 128 bits, and this step is effectively a substitution cipher with a fixed key.
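The "fixed key" of that substitution is a 256-entry lookup table, usually called the S-box. As a hedged sketch, the table can be derived rather than typed out: in the published AES standard (FIPS 197), each entry is the byte's multiplicative inverse in the finite field GF(2^8), followed by a fixed affine transformation. The Python below follows that construction; the function names are illustrative, and it favours readability over speed.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse in GF(2^8), computed as a^254; 0 maps to 0."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rotl8(x: int, n: int) -> int:
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sbox_entry(a: int) -> int:
    """Field inverse followed by the fixed affine transformation."""
    inv = gf_inv(a)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


SBOX = [sbox_entry(a) for a in range(256)]


def sub_bytes(block: bytes) -> bytes:
    """The substitute-bytes step: replace each byte with its table entry."""
    return bytes(SBOX[b] for b in block)
```

For example, `SBOX[0x00]` is `0x63` and `SBOX[0x53]` is `0xED`, and every byte value appears in the table exactly once, which is what makes the step reversible.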
The key has actually been chosen to defend against a number of cryptanalytic techniques that were used against the predecessor, DES, so the values in the substitution table have been chosen specifically. What happens is basically that it takes the current byte in the state, looks that byte up in the substitution table, and replaces the byte in your current block with the byte from the table. The next thing that happens is that each row in your
block is shifted along, with the byte at the front rotating around to the end. That looks like this diagram. The third step is a somewhat complicated mathematical operation, but it actually does a multiplication over each column, which mixes up the data in a different way. In the final round of the cipher this step is skipped, so it doesn't occur in the
last round, but it's in every round other than that. Finally, it takes a byte from your key and XORs it with a byte in the block, for each byte, to produce the output of the round. Once you've gone through all those steps a number of times (10 rounds for a 128-bit key, rising to 14 for a 256-bit key), you output a block of ciphertext. It's quite an efficient algorithm, and it's implemented in hardware in quite a lot of processors these days, so it's very fast to compute. That's how it works.

Now, with a block cipher you're
going to want to encrypt more than 128 bits at a time, so when we use a block cipher on a message that's longer than that, we've got to split it up into blocks. However, there are quite a lot of different ways in which you can use a block cipher, and these are known as modes of operation. One mode is known as electronic codebook, or ECB. It's probably the one you'd think up first: you literally encrypt a block of text, then the next one, and the next one, and just append
them like that. You've got a plaintext block and a key, you do your block cipher encryption, and you get an output block; you do the same for the next block and the next, and you just append them. Believe it or not, this is a really bad mode which you shouldn't use. Although a block cipher produces really random-looking output that's very difficult to reverse, any piece of plaintext with the same 128 bits will come out as exactly the same ciphertext. This actually gives you the same problems as with a
substitution cipher, where someone can look at statistical patterns in your data. Obviously it's a bit more difficult, but take a look at this penguin: Tux, the Linux mascot. And that's Tux encrypted using AES in ECB mode. You can quite clearly see that Tux is still there. If this were an image that you didn't want people to be able to see, having it come out like that probably wouldn't be your desired result. So you should never really use ECB mode, unless you just want to make cool
pop art images like this that's that's legitimate use i guess so a bit of an improvement um the first sort of attempt to fix that sort of problem was called cipher block chaining how that sort of works is you take an initialization vector similar to um to a stream cipher and you feed that in for your first block and you xor it with your plain text pass it through a block cipher encryption and then you get your ciphertext for your next block you pass that ciphertext back in xor it with the next block of plain text and and pass it
through and you do that all the way along this adds a sort of randomizer to your your plain text so even if you've got two blocks the same because this and this aren't going to be the same they're going to come out differently it's obviously a bit similar to a stream cipher in the fact that you need an initialization vector and you should not reuse those but that sort of half solves the um the problems with ecb um another another mode that's being used is called ctr mode or counter mode this is
quite a cool mode in the fact that it turns a block cipher into a stream cipher with a few advantages so it's completely different from the previous two modes that we've looked at as we no longer directly encrypt our plain text what we do is we start off with a a nonce uh or an initialization vector again and we have a counter which we start at zero and we encrypt that instead and xor it with our plain text which produces cipher text then for the next block you increment the counter encrypt that and
xor it you keep on going until you've got enough bytes to encrypt your whole message now due to the random nature of the uh the output of a block cipher um it effectively turns this into a stream cipher now obviously with ctr mode you need to take the same precautions as with a stream cipher such as not reusing the um iv and things and ensuring that you've got a um sort of a message authentication to to make sure it hasn't been tampered with um something else that this gives you that a stream cipher doesn't is you can
actually seek into your encrypted text because if you want to get block 557 all you do is take your nonce increment the counter to 557 pass it to the block cipher and you can decrypt that part of your encrypted data without having to decrypt everything up to that point um another mode which which sort of helps alleviate the issue that we had with um with ctr mode in the fact that it's a stream cipher without authentication is um galois counter mode what this does is it combines the counter mode with
um an authentication tag which helps verify that the message hasn't been tampered with um how that works is first of all you've got here you've basically got the counter mode going on up here as well but you also have some additional data which is the auth data here which goes to a multiplication function and then you start xoring it with each of your ciphertext blocks and you sort of chain on lots of different multiplications of your ciphertext and then finally add on the length of your your um your ciphertext and
data and that produces you an authentication tag which when the person who you're sending the message to receives it they can verify that authentication tag before decrypting the message and make sure it hasn't been tampered with um that's sort of how that works so that's that's sort of covered off a lot of the symmetric ciphers um all those ciphers that we've looked at so far including even the historic ones use a single key for both encrypting and decrypting that leads to a really hard to solve problem which is key distribution imagine for a moment that
you're an undercover agent alice is undercover um deep in enemy operations and needs to get a secret message back to bob at hq to to send mission reports and and tell bob what the the evil overlord is up to um and they obviously need to do so without the the uh the agents of the enemy mallory and eve eavesdropping and modifying the messages um this is where the asymmetric ciphers actually come in because they use different keys for encryption and decryption um we don't need to sort of pre-share keys so i can just keep
on encrypting messages to you without having to worry about keeping my key secure and things like that and even if i send a message out and it's intercepted and the enemy capture me they won't be able to search my room and find a key that decrypts it because the key i used to encrypt it is a public key that can't decrypt again um so looking at public key cryptography um a quite a good good way of thinking about this is if you think about a padlock so here's a padlock okay now imagine that i i want to send a message
to someone in the back of the room or they want to send one to me what i could do is i could give them a metal box a nice nice sturdy metal box and they could write a bit of a message on a piece of paper and i could give them my padlock okay and they could they could attach this padlock to the box locking it and they could pass the box back from the back of the room past all you people who we don't really trust back to me and none of you
unless you've got some bolt cutters and brute force are gonna be able to undo that padlock but as soon as it gets back to me i've got the i've got the private key i'm keeping over here i can unlock it and i can read the message that's kind of the same same idea that that public key cryptography uses well the first and oldest public key system is is rsa despite the huge amount of computing power improvements and things like that we've made since its invention um it's it's still quite a secure algorithm um it relies
on a sort of mathematical problem which is quite easy to compute in one direction but it's really difficult to sort of reverse so if you've just been set this mathematical problem it's really difficult to reverse but if you happen to know a secret that was used when constructing the problem it's really easy to reverse again we have actually come up with better public key algorithms since but only by the virtue of the fact they're actually more efficient to implement in code and they use less cpu cycles to run um they're not
actually much greater in the level of security they provide it's just that you can use shorter keys with them so the mathematical problem that rsa is based on is exponentiation in modular arithmetic so the idea is that you can find three rather large numbers um e here d and a modulus n which is the product of two large primes and such that when you take any number and raise it to the power of e and then raise it to the power of d it equals itself modulo n okay so it's sort of it's cyclic um
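
The cyclic property just described can be tried out with toy numbers. This is only a sketch: the values of `p`, `q`, `e` and `d` below are illustrative assumptions, far too small to be secure, and the helper name `roundtrip` is made up for this example. The modulus `n` is built as the product of two small primes, and `d` is chosen so that `e * d` leaves a remainder of 1 when divided by `(p - 1) * (q - 1)`.

```python
# Toy RSA parameters (illustrative assumptions, NOT secure sizes).
p, q = 61, 53              # two small primes, kept secret in real RSA
n = p * q                  # modulus, 3233
phi = (p - 1) * (q - 1)    # 3120, only computable if you know p and q
e = 17                     # first exponent
d = 2753                   # second exponent, chosen so that e * d % phi == 1


def roundtrip(m: int) -> int:
    """Raise m to the power e, then to the power d, all modulo n."""
    c = pow(m, e, n)       # first exponentiation
    return pow(c, d, n)    # second exponentiation brings us back to m


if __name__ == "__main__":
    print((e * d) % phi)   # the relationship that makes the cycle work
    print(roundtrip(65))   # same number we started with
```

Python's built-in three-argument `pow` does the modular exponentiation efficiently, so the intermediate numbers never grow beyond `n`.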
now if you've got a message that's been encoded like this and you only have e it is really really difficult to figure out what d is mathematically however if you already know d you you can actually reverse it so you can make this number n and e publicly available and you can you can pose this mathematical problem to the world safe in the knowledge that it's really really hard for them to solve it um and so in order to actually use this for a cryptosystem what you can do is you've got my my uh public key
which is e and n and you can take the message that you want to send to me you can raise it to the power of e take it modulo my number n and that becomes the cipher text you can send that to me safe in the knowledge that only the person with d can do this modular exponentiation here and reverse the process and turn it back into the plain text m um obviously a slight issue with this scheme is that the message m must be smaller than the modulus n um so usually what
you will do with rsa and similar public key algorithms is that you'll generate a random string that you use as the key for a symmetric cipher like aes and then you'll use rsa or another public key algorithm to encrypt just the key part of the the symmetric cipher and then you'll send the ciphertext from the symmetric cipher and the encrypted key to your intended recipient they can then use their private key to recover the the random key and then they can decrypt the message it's a two-stage process um and and that sort of solves the key distribution problem another
thing we need to solve is um identity verification so although alice and bob can send messages to each other securely without eve being able to eavesdrop and recover those messages um how can they protect from the mischievous mallory who likes to tamper with messages and change them turns out you can actually use a similar thing with rsa um if i take a message say a hash of the ciphertext that i'm sending to you and i raise it to the power of my private key d i i can send that as a signature when
bob back in the office receives that message he can raise it to the power of my public key and if it has been signed using my private key it will return back to the value of the hash he can then hash the message check it matches and then he knows that it was sent by alice obviously if it doesn't match then something's gone wrong and he knows that the message has been tampered with or didn't come from alice in the first place again in practice that that message must be short so what you sign is usually
a cryptographic hash of the message like sha-256 or similar so i've covered a selection of of all the algorithms that are in use today that solve quite a lot of the problems in modern cryptography um so this last five minutes or so um we're going to look at how you'd actually go about implementing cryptography in in applications if that's what you need to do so the first bit of information bit of advice is don't um i hope that you've sort of got an idea from this talk just how how many things you have to sort
of keep an eye on and how difficult it is to actually do these things securely so if you've got a need to encrypt data don't try and do it yourself as much as possible um it's very easy to introduce vulnerabilities into applications through things like side channels um even if you take aes which is a secure algorithm and implement it if you um don't take care of things like how long it takes to encrypt the data someone might be able to retrieve information about your key or your plain text just by measuring
how long it actually takes to encrypt messages and things like that um so the best bit of advice is use an existing implementation okay everything i've gone over today is battle tested and and hardened and it's been pored over by cryptographers for at least a decade and there are well-known well-tested well-used implementations out there use one of those okay um most linux distros for example allow you to encrypt hard drive partitions that's a good first starting point um all the major web servers have support for tls use ssl for your connections between between servers don't
try and manually implement in your application some sort of encrypt send over http decrypt on the other side just use https it's well tested it's it's gonna work okay another option um if you've got like two remote data centers you could use a vpn between the two or maybe an ssh tunnel again all of these protocols and algorithms have been well tested there's lots of people going over the source code all the time patching the security vulnerabilities and things like that you're benefiting from the knowledge of people who who do this all
the time okay um if one of those situations doesn't fit your use case if you do actually need to implement some cryptography bring in an expert okay um bring in someone to audit your code and make sure you've not made any mistakes um it's it's not a cryptography isn't a skill that most developers are like really tip-top on so although you can probably implement it bring someone in an expert an outside consultancy and get them to just make sure you've done it right um it might seem expensive to bring someone in to do that but
it's nothing compared to the costs if you become sort of like if you get hacked and you become the next like ashley madison or sony and sort of get all that data spewed all over the internet um especially with the gdpr coming in now that's that can be quite costly if if you make a mistake with this stuff so bring in an expert it'll save you money and worry in the long run um obviously this is a php conference so what do you need what do you what should you do if you actually need to
encrypt and decrypt stuff in php um there's a lot of libraries out there i've i reviewed a lot of them um there's quite a few that default to using pop art penguin mode and other insecure options and if you would just download it with composer and be like yep that's the library i'll use that you'll end up with a few problems like people being able to view your images or sort of like pop art versions of them so top of my list of recommendations is a library that uh scott arciszewski has written which is halite halite's a
wrapper around libsodium which is a library that's been written by cryptographers to limit the amount of choices that developers are given you've got one implementation of things and that's a secure implementation and that's the sort of design goal of libsodium and and halite just provides a high-level interface for that which is really straightforward it looks a little bit like this that would be how you would encrypt something using halite the wrapper around libsodium it's that easy and that will be secure okay um so if you need to do something that's your
best option um libsodium is in php 7.2 um you can install the extension from pecl um for sort of versions before 7.2 if you're in an environment where you can't install it scott's also written a polyfill in php which you can actually include and it's got all the the same algorithms written in php um so you've got a fairly good range of options there if you if you want to use that if for some reason you you can't use libsodium for whatever reason maybe you need uh compatibility with a legacy system or something like
that um defuse php-encryption is um another good library that's implemented in php um it uses a lot more of the the older style algorithms aes and um and an rsa for its cryptography um and it tries to implement those in a secure way and it'll fall back to to like openssl if you've got that installed and things like that so it'll still be quite performant um so that's that's another option it's got a fairly similar api to halite so it's pretty much defuse encrypt and defuse decrypt so it's easy to get right um
so that's pretty much the end of close to the end of the talk so i've got a few links now for anybody who's interested in finding out more um there's there's a wealth of really interesting stuff on cryptography um one of the best ones if you're interested in the historic side of the talk is simon singh's the code book he's got a few of the ones i went over and a few other ones and there's also some really interesting case studies on what happened when cryptography went wrong um it's a really good resource he's also
got a website where you can actually try out some of these um ciphers and you can encode and decode messages and do things like that um another good resource if you're interested in modern cryptography is bruce schneier's site um bruce schneier is a well-known cryptographer who's worked for a lot of leading companies on security and things like that um he also created blowfish which the bcrypt hash function which you're hopefully using for your passwords is based on um so if you do feel like starting messing around creating your own algorithms he's got a self-study course
on his website which gives you algorithms of increasing difficulty that you can have a go at breaking yourself um one of the things you know in order to become a good cryptographer you need to prove that you can break other algorithms that way any anything that you've done you you know that if you're really good at breaking algorithms and you can't break your algorithm it's pretty good but if you're just like someone who's just turned up and like yeah i've just invented this thing how how can we trust that
that's any good well if you've got a proven track record of being able to break cryptography and you say it's good you can probably trust that so if you're interested in sort of getting into that side his course is pretty good um the final link there is a library called php crypt do not use this in production um absolutely not um but it includes a pure php implementation of quite a lot of the interesting ciphers so if you actually want to have a look at the code implementation of some of these including the historic ones i
think it's got enigma in there and a few others um it's quite a good one just have a look at the source code and see how they they tick underneath um but yeah obviously none of those are suitable for anything other than sort of like your own personal interest in cryptography right um i hope everyone sort of learned something from this talk and found it interesting my my twitter handles give up already i've also got gitlab github with a few few projects and things like that on um interesting post on my blog if anyone's interested
i i broke an algorithm that someone implemented themselves one of my blog posts is actually showing why why you shouldn't actually do these kind of things and finally there is a joind.in link if if you want to rate this talk let me know if it's any good