Okay hi everybody let's get started so um it's been a while since we uh came together in a lecture last week we had the holiday we had the midterm um so uh with that um where what have we been doing um we Finished the first half of the course um you know about two weeks ago where we talked uh we were talking about computability theory we have shifted into the second half talking about complexity theory um so get your mind uh back to that uh we um discussed the various different models um [Music] and and
and ways of measuring uh complexity on different models um At least in terms of the amount of time that's used and in the end we settled on the one tape turing machine which is the same model we had been working with in the first half of the course and argued that um though there the measures of complexity are going to differ somewhat from model to model they're not going to matter different by more than a polynomial amount and so Since the kinds of questions we're going to be asking uh are going to be um basically
whether problems are polynomial or not uh it's not going to really matter uh which model we pick among reasonable deterministic models and so we're gonna the one tape turing machine is a reasonable choice uh given that we defined time complexity classes the time t of n classes we defined the class p Which was invariant among all of those different model deterministic models in the sense that it didn't matter which model we choose we're going to end up with the same class p so that argues for its naturalness and we gave an example of this path
problem being in p and we kind of ended that lecture uh before the midterm with the discussion of this hamiltonian path problem so we're going to come back to that today Uh so today we're going to look at non-deterministic complexity uh define the classes non-deterministic time or end time uh talk about the class np the p versus np problem which one of the you know very famous unsolved problems in our field and uh look at dynamic programming one of the most basic algorithm polynomial time Algorithms and polynomial time reproducibility moving toward our discussion of np
completeness which we will begin um next lecture so with that uh let's move into today's uh content which is well just quick review we as i mentioned we defined the time complexity class this is a the time complexity class is a collection of Languages the languages that you can do um these are a collection of languages all of the languages that you can solve in a certain amount of a certain time bound within a certain amount of time so the time the time n squared for example is all of the the pr all of the
languages or all of the problems that you can solve in n Squared times we're identifying problems with languages here and the class p is the collection of all problems that you can solve or languages that you can solve in polynomial time so it's the union over all time n to the k so n squared and cubed and to the fifth power and so on um union out of all that all of those bounds um the the associated language languages that's the class p And um we gave an example this path problem we gave an algorithm
for path and then we introduced this hamiltonian path problem so if you remember instead of just asking given a graph whether you can get from s to t now we want to know can i get from s of t but visit every other node along the way So find a a path that goes through everything else and gets from s to t um and uh i should say it's also it's a simple path so you're only allowed to go through every node just once um and now um uh um the question is for this problem
is That can we solve that problem in polynomial time the you know can we somehow modify the algorithm for path to give us an algorithm that solves hamiltonian path in polynomial time of course we could solve hamiltonian path with an exponential algorithm by trying all possible paths um and that will give a correct algorithm but there are exponentially many different Paths and trying them all will not give a polynomial time algorithm so the interesting problem is can we solve that without doing that brute force searching through all possible paths um and that's a problem that
no one knows the answer to um despite lots of effort people have not Succeeded in finding an algorithm for that but um on the other hand we don't have any idea how to approve there is no such algorithm i mean it's conceivable that one could prove such a thing but we just don't know how to do it and so um that problem is an unsolved problem and i just want this isn't really a note to myself um what's kind of amazing and this is what We're going to be showing over the next few lectures that
there are there would there would be very surprising consequences if you could find a way to solve hamiltonian path in polynomial time because what that would immediately yield is a polynomial time way of say factoring large numbers or solving a large number of other problems that we don't don't know how to solve in Polynomial time so you know as we mentioned you know factoring um isn't problem that we only know at present time itself with an exponential algorithm um and it doesn't seem to have anything to do with the fact with the hamiltonian path problem
seemed very different but yet if you can solve hamiltonian path and polynomial time then you can factor numbers in polynomial time and so we'll See how to make that connection um that's what we're building toward with the next uh few lectures okay um so happy to take any comments on comments and questions on that or we'll just move on um if you have any questions on our little review well send questions along and we can Stop at the end of uh you know various slides to to to try to answer them or and of course
write to the tas um who can take your questions during during while i'm lecturing okay um so to start this off we're going to have to talk about uh non-deterministic complexity as a variation of deterministic complexity so first of all when we have a all of our all of the machines in this Part of the course and the languages everything is going to be decidable and all the machines are going to be deciders so what do we mean when we have a non-deterministic machine which a decider is a decider and that just simply means that
all of the branches it's not just the machine holds on every input but all of the branches hold on every input you know so the non-deterministic Machine is non-deterministic it has lots of possible branches they all have to halt all of them on every input that's what makes a non-deterministic machine a decider and you can convert a non-deterministic decider into a deterministic decider but the question is how much time would that introduce how much extra time is that going to cost and the only way that people know at the present time For that conversion would
be to do an exponential increase um basically to try all possible branches and that's of course very slow so um let's first let's understand what we mean by the time used by a non-deterministic machine and what we mean by the time used is we're looking at the each individual branch individually Separately so a non-deterministic machine will say runs in a certain amount of time if all of the branches halt within that amount of time so you you know what we do we we do not mean that the total amount of usage the total amount of
effort by adding up all the branches is at most t of n it's just that each branch individually uses at most the event that's just going To be our definition and it's going to turn out to be uh the right way to look at this to get something useful um so um if we have um so now we're going to define the analogous uh complexity class associated to uh non-deterministic computation which we'll call non-deterministic time So non-deterministic time t of n is the set of all languages that you can do with a non-deterministic machine that
runs an order t of n time um just think back to the definition we had for deterministic uh complexity the time class or sometimes people call it d time to emphasize the difference but let's just say we're calling it in this course time versus end time so time t of n is all of the languages that you can do With the one tape turing machine that's deterministic um but this here is a non-deterministic turing machine for non-deterministic time um so the picture that uh uh that is good to have in your head here um would
be uh if you think of non-determinism in terms of a computation tree thinking of all the different branches of the non-determinism All of those branches have to halt and they have to halt within the time bound so imagine here this is t of n time all of the branches have to hold within t of n steps for this a non-deterministic turning machine to be running in t of n time and to be doing a language in the end time t of n class um and by analogy with what we did before The class np is
the um uh collection of all languages that you can do non-deterministically in polynomial time um so it's the union over all the end time classes uh where we the bound is polynomial Um okay so a lot of this should look very familiar but we've just added a bunch of non-deterministics and a bunch of n's in in place but the definitions are very similar um and one of the motivations we had for looking at p the class p was that it did not depend um on the choice of model as long as the Model was deterministic
and reasonable and this is also the class np is also going to not depend on the choice of model as long as it's a reasonable non-deterministic model so it's again a very natural class to look at from a mathematical standpoint and it also captures something uh interesting kind of different from a practical Standpoint which we're going to talk about over the next couple of live slides which is that it captures the problems where you can easily verify when you're a member of the language okay so we'll talk about that but you know if you take
for example the hamiltonian path problem when you find a member of the language so that is a graph that does have a hamiltonian path from s to t You can easily verify that's true by simply exhibiting the path not all problems can be solved can be verified in that way but the problems that are in np have that special feature that you when you have a member of the language there's a way to verify that you're a member so we're going to talk about that um because that's really the key to understanding np This notion
of verification okay so let me go there was a good question here let me just see if i want to answer that um um yeah mean this is a little bit of a long longer question that i i i want to fully respond to but you mean the the Well let's turn to my next slide which maybe sort of bring that out anyway um actually it's a couple of slides from now but i'll get to that point um so let's look at hamiltonian the hand path problem and what i'm what i'm going to show is
the hamiltonian path problem is in np and i'm going to walk you through this one kind of slowly um so the hamiltonian path problem remember we don't know if it's in p But it is in np so it's in one of these you can solve hamiltonian path in polynomial time if you're a non-deterministic machine um why is that well it's because of the non-determinism because of the parallelism of non-determinism um which allows you to kind of check all of the paths on different branches So let me first describe how the algorithm how i would just
how i would write down the algorithm and then we'll kind of try to unpack that and understand how that actually looks in terms of the turing machine or the turing machine's computation so first of all so taking the hamiltonian path problem you know you're given an input now which is a graph And the node's s and t where i'm trying to figure out is the hamiltonian path again which visits all the nodes which takes you from s to t and we're trying to make now a non-deterministic machine which is going to accept all such inputs
which have a path so the way this non-deterministic machine is going to work Is it's basically going to use its non-determinism to try all possible paths on the different branches and the way i'll specify that is to say non-deterministically we're going to write down a candidate path which is just going to be a sequence of nodes sequence of m nodes where we'll say that's the total number of nodes of the graph remember a hamiltonian path Because it visits every node is going to be a path with exactly m nodes in it so i'm going to
write down a sequence of nodes as a candidate path and i'm not deterministically going to choose every possible sequence in this way um if you like to think of the guessing metaphor for non-determine non-determinism you can think of the Non-deterministic machine as guessing the right path which is going to be the hamiltonian path from s to t but i think from for this discussion it might be more helpful to think about all of the different branches of the non-determinism um because that's perhaps more useful when we're thinking about it in terms of the time you
know i think you'll get used to thinking about it you should be used to Thinking about it many in many of the different ways but maybe the computation tree of all branches might be the more helpful one here so now as after we write down a sequence a candidate path sequence of nodes now i have to check that this really is a path and the way i'm going to do that is to say well now if i have just a sequence of nodes written down what does it mean for it to be a hamiltonian path
from s To t well it better start with s and end with t first of all and we have to make sure that every step of the way is actually an edge so each pair v i to v i plus 1 has to be an edge in the graph otherwise that sequence of nodes is not going to be a legitimate hamiltonian Path from s of t and you could it has to be a simple pass you can't be repeating notes these four conditions together will guarantee that we have a hamiltonian path and once we have
written down a candidate sequence we can just check that that sequence actually works um there's no at this second stage of the algorithm non-determinism is necessary this is Going to be a deterministic phase but the stage one of the algorithm is going to be a non-deterministic phase where it's writing down all possible paths now i'm going to try to unpack that for you so you can actually visualize how the machine is doing this um and then of course you know so you're gonna on each branch of the non-determinism you're going to Check to see whether
the conditions have been satisfied and on that branch if the conditions were not satisfied that branch is going to reject of course one of the other branches might yet accept so um that's how non-determinism works okay so i'd like to visualize this um uh um I'd like to visualize this as a uh the tree of uh of the different branches of the computation of m on its input so here is the here is our non-deterministic turing machine um uh which this one um and you provided with the input uh You know g s and t
and how does the machine actually working so when i say non-deterministically write down a sequence of m nodes you know you know this is getting into a little bit more detail detail than i would normally think about it uh because we try to tend to think about things at a higher level but just for just to get us started i Think it's good to think about this with a bit more detail so let's think of the nodes the m nodes as being numbered having labels numbered one through m and i'm going to think about them
being labeled by their binary sequences you know we're going to write down those nodes that's how the you know the machine is going to have to operate on those numbers for the nodes we'll think about them as being written in binary And now as the machine is going to be writing down let's say the node v1 so it's not deterministically picking the first node of the sequence what does that actually mean in terms of the um the step-by-step processing of the turing machine m well it's going to be um guessing via a sequence of non-deterministic
moves The bits that represent the number of the node v for v1 you know for example v1 might be the node number five in the graph um of course you know non-deterministically the machine is on different branches picking all different possible can choices for v1 those are going to be the different Branches here but you know one of the branches might be and what i'm what i'm really representing here these are uh i probably could have written this down on the slide here in in in tiny font but these are like the zero one choices
that's why it's sort of a binary tree here for the writing down the bits Of the uh of a v1 so here maybe this could be one zero one representing uh the number five which might be the very first node that i'm writing down in my sequence some other branch is going to write down node number six some other branch is going to write down node number two because non-deterministically we're making all possible choices for v1 that's what i'm trying to show in this Little part of the computation of m on this input so then
after it's uh finished writing down the description of the node uh for v1 um its choice for v1 it goes down to make choose what v2 is again non-deterministically so there's going to be more branches you know um for each Possible choice of v2 and so on node after node um then it's going to finally get to the last node vm is going to write down lots of choices for vm and at this point here we have completed uh the first stage of the algorithm now there's some huge tree of all of the possible choices
for the v's that have shown up at this point okay And uh now we're going to move into the second phase so following this there's going to be um here uh a bunch of uh deterministic steps of the machine so no more no no more branching no more branching is needed uh because here we've written down at this at this point we've we've reached a point in each of These um at each each of those locations where we've chosen one of the candidates one of the possible branches or one of the possible paths through the
machine um one of the possible paths look through the graph i'm sorry so we um you know here we're guessing uh potential paths in the graph and now we're going to check that we actually have picked a path that's a Hamiltonian path from s to p as to t okay so each each one of these branches is now going to end up accepting or rejecting and the whole overall computation is going to accept if at least one of them ended up accepting which means you actually found a hamiltonian path okay so i don't know if
that's helpful to you or not but um that is uh so we can just you know if There's any questions on this let's see um question on you know is there something um uh uh you know trying to draw a connection here between this and and computation histories um i mean there is a pattern here that does come up often Where you you you want to check some things you know starts right ends right and that all of the intermediates are right so there is i think there is a um there is some deeper connection
here probably too hard to explain but um that does have something to do with this hamiltonian path problem um why are we using binary representation well we're going to talk About um you know we the algorithm would have worked equally well if we used um base three or base five or base you know 20 um as a way of writing down our labels for the nodes um but in a sense it doesn't matter um um the alphabet has to be finite so that's that's true i mean that's why you it's not just in a single
step of The turing machine that you would pick the the node uh the true the choice for v1 you really have to go through a sequence of steps because each of the branches of the machine only has a fixed number of choices so you can't in a single step of the turing machine pick all the different possibilities for v1 it has to go through a sequence Um okay um now let me do a second example the the problem for of composites so the the the language of all composites are all of the non-primes written as
binary numbers again so we'll talk about the base and the representation in a second but just imagine you know he you know these are All of the numbers that are not prime uh and that language is easily seen to be a member of np uh here is again the algorithm for that um given x um we want to accept x if it's not a prime number so it has some non-trivial factor so first the way the non-deterministic machine is going to work is it's going to guess that factor so it's not Deterministically it's going to
try every possible factor what we're wrong y is going to be a number between 1 and x but not including one we you know that we have to be an interesting factor so not including 1 in the number itself so something strictly in between and we're going to then after we've non-deterministically chosen y then we're going to test to see if y is really a factor We'll so we'll see if it y divides evenly into x with a remainder of zero you know if that branch successfully picked the right y it's going to accept and
some other branch might have picked the wrong y will not and if x is really a composite number some branch will find the factor um now the base doesn't matter could have used base 10 Because you can convert from base to another um so this is really in terms of our representation of the of the number but i do want to make one point here that changing we don't want to write the number in unary um you know just as writing the number k as a sequence of k ones um that's not really a base
um that's just an exponential representation for the Number and that changes the game because if you make the input exponentially larger um then uh it's going to change whether the algorithm relative to that exponentially larger input is polynomial or not um so an algorithm that might have been exponential Originally when the numbers written binary might become polynomial if the number's written in unary um and i do want to mention as a side note that the composites language uh or primes for that matter doesn't you know both are in p um but we won't cover that
so Uh whereas the hamiltonian path problem is not known whether it's in p the primes and composites problem are uh np um so that was known there was actually a very big result in the field uh solved um uh by folks in at one of the uh indian institutes of technology uh back about uh you know almost 20 years ago now um Okay um [Music] so let's turn here to trying to get an intuitive feeling for pnnp and we'll talk we'll return now to this notion of np corresponding to easy verifiability so np are the
languages where you can easily verify membership quickly and i'll try to explain what that means in contrast p are the languages where you can test membership quickly uh by Quickly i mean i'm using polynomial time um that's going to be basically you know for us that's what quickly means in this course so in the case of the hamiltonian path problem the way you verify the membership is you give the path in the case of the composites the way You verify the membership is you give the factor um in those two cases and in general um
when we have a problem that's in np we think of this verification as having a giving we give it a special name called a certificate or sometimes a short certificate to emphasize the polynomiality of the certificate It's like a way of proving that you're a member of the language um in the case of composites the proof is the factor in the case of ham path the proof is the is the path the hamiltonian path um contrast that for example if you had a prime number um you know proving a number is composite is easy Because
you just exhibit the factor how would you prove that a number is prime um what's what's the short certificate of um proving that some number is it has no factor that's not so obvious in fact there are ways of doing it which i'm not going to get into in the case of prime testing of numbers prime but and and now It's even so known to be in p so that's even better but there's no obvious way of proving that a number is prime with a short certificate um and so this concept of having being able
to verify when you're a member of the language that's key to understanding np that's the intuition you need to develop And hopefully take away from today's lecture or at least you know by thinking about it and reading the book and so on um so if you compare these two classes p and np uh p first of all is going to be a subset of np both in terms of the way we defined it because deterministic machines are a special case of non-deterministic machines But also if you want to think about testing membership if you can
test membership easily then you can certainly verify it in terms of the certificate you don't even need the certificate is irrelevant at that point because whatever the certificate is you can still test yourself whether the input is in the is in the language or not um the big question as i mentioned is Whether these two classes are the same so does uh being able to verify membership quickly say with one of these certificates allow you to dispense with the certificate not even need a certificate and just test it for yourself whether you're in the language
um and do that you know in polynomial time uh that that's the question For a problem like hamiltonian path uh do you need to search for the answer um if you're doing it deterministically or can you somehow uh not me you know avoid that and just come up with the answer directly uh with it with a with a polynomial time solution nobody knows the answer to that uh and goes back at this point quite a long time it's almost 60 years now um uh that That problem has been around 60 years oh that would be
50 years no 50 years almost 50 years um so um the uh most people believe that p is different from np in other words that there are problems in p that are in np would turn out in p um the candidate would be the hamiltonian path problem uh But it seems to be very hard to prove that um and part of the reason is i mean how do you prove that a problem like hamiltonian path does not have a polynomial time algorithm um it's very tricky to do that because the class of polynomial time algorithms
is a very rich class polynomial time algorithms are very powerful and to try to prove there's no clever Way of solving the hamiltonian path problems just seems to be beyond our present-day mathematics i believe some days somebody's going to solve it but all right so far no one has succeeded um so what i thought we would do is i think i have a check in here uh yes and then we'll stop for a break so let's look at the the complementary problem hand path complement Um so that's you're in the language now if you don't
have a path um so is that complementary problem in np okay so for that to be the case we would need to have short certificates of when a graph does not have um a hamiltonian path so I leave it to you so there are three choices um okay i'm going to stop here so make sure you get your participation credit here um end the polling now interesting still the majority is wrong well i know wrong we don't know uh you know um uh i i think the only fair answer to This question is is c
um because we don't know uh whether or not we can give short certificates for a graph not to have a hamiltonian path if if p equaled np then you can test in polynomial time whether a graph has a hamiltonian path and then the computation itself would be a certificate whether it has a path or whether it doesn't have a path because It would be something that you can check easily um so since we don't know uh for sure that p is different from np um you know p could be equal to np then it's possible
that um we can we could give a short certificate namely the computation of the polynomial algorithm So the only really um reasonable answer to this question is that we don't know um so uh just uh ponder that i think the the the thing that um you those of you who answered yes however um need to go back and and i ain't put this here explicitly because i know this is a th This is a confusion for well i can see for quite a few of you um um when we have non-deterministic computation and you have
a non-deterministic machine you can't sim simply invert the answer and get back a non-deterministic machine non-determinism does not work that way uh if you remember you know the non-deter the complement of A push-down automaton um is not a push-down automaton if you have a non-deterministic machine and you invert all of the responses on each of the branches um it's not going to be recognizing or deciding the complementary language so um because well i i think that this is something you you if you answered yes You need to go back and make sure you understand why
yes um is is not a is not a reasonable answer to this question uh because that's not how non-determinism works so you have a um a not complete understanding of non-determinism and that's going to be really important for us going forward so i i really urge you to figure out and Understand why yes is not a is not a good answer to this uh to this check-in okay so i think we will um we can talk about that more over the over the break um and so uh we'll we'll return here in um in five
minutes somebody's asking about in you know infinite uh equal infinite sequences be generated by the machine um I you know when we're talking about um you know especially in the complexity section of the course all of the computations are going to be bounded in time so we're not going to be thinking about sort of infinite runs of the machine that's not going to be relevant for us so um let's not think about that how does the turing machine perform division um How does a turning machine perform division well uh how do you perform division uh
the long division is a is an operation that can run in you know the the long division procedure that you you learn in a grade school can you can implement that on a turing machine um so yes a turing machine can definitely Perform do long division or division in one integer by another in in polynomial time okay another question can we generally say try dividing y by x or do we have to enumerate a string of length y and cross off every no i think that's the same question you know i mean how would you
know if you have numbers written in binary how would you Do the division you're not going to uh use long division um you know anything such as the the the thing that's proposed here uh by by the the questioner is going to be an exponential algorithm so don't do it that way um uh why does primes in p not um composites in p not implied primes in p it does Imply if if composites are in p then primes is in p as well because you can just you know when you have a deterministic machine you
can flip the answer when you have a non-deterministic machine you may not be able to flip the answer so the deterministic machine just having a single branch um you can make another deterministic machine that runs in the same amount of Time that does the complementary language um because um for deterministic machines um just like for deciders uh in the in the computability section you can just invert the answer uh there is an analogy here between p and decidability and np and recognizability it's not an Airtight analogy here but there is some relationship there um what
are the input like somebody's asking me the implications of p equal to np lots of implications um too long to enumerate now but for example uh you would be able to break basically all cryptosystems um that i'm aware of if p equaled np so we would have a lot of consequences Um so he's asking so composites primality and compositeness testing is solvable in polynomial time but factoring interestingly enough is not known to be solvable in polynomial time uh so that i mean so um we may talk about this a little bit toward the end of
the term if we have time but The algorithms for testing whether a number is prime or composite in polynomial time do not operate by looking for factors they operate in an entirely different way basically by showing that a number is prime or composite by looking at certain properties of that number but without testing whether it has uh testing but without finding a factor um Uh okay another question here about asking um the uh when we talk about encodings do we have to say how we encode numbers values no we don't have to we usually don't
get have to get into spelling out encodings as long as they're reasonable encodings so you don't have to Usually we're going to be talking about things at a higher enough level that the specific encodings are not going to matter um okay let's let's return to our uh the rest of our lecture when we talk about say this p versus np problem and How do you show that uh you know a problem might not be solvable in p like the hamiltonian path problem um you know many people you know who are not uh practitioners in the
field um you know are know about the p versus no know About the p versus np problems i i've over the years i've got many many emails and and and physical letters from from people um about that uh since people you know since uh since i've spent some time thinking i'm known as having spent some time thinking about it and um people claim to solve the problem uh solve p is np p the p versus np problem by basically saying you know Problems like hamiltonian path um uh you know or these or other similar problems
basically there's clearly no way to solve them without searching uh through a lot of possibilities and then they go through a big long analysis showing that there are exponentially many uh possibilities um the Um the uh you know and a lot of the proofs that claim to solve p versus np they all look like that you only you only have to look some somewhere in that paper there's going to be a statement uh along the lines to solve this problem clearly you have to do it this way Um and that's the flaw in the reasoning
because just like for the factoring problem just like for the uh compositeness testing problem you don't necessarily have to solve it use by by searching for factors there might be some other way to do it you might be able to solve a hamiltonian path problem without searching for hamiltonian paths there might be some Other process that you can use which would uh give you the answer um so i the class of polynomial time algorithms is very rich can do many many things and i and i i wanted to present to you um one of the
most important polynomial time algorithms in a sense it's kind of the you can make a certain argument that this is sort of the the the the in a sense the most fundamental Polynomial time algorithm some people might argue with me on that but um and that's a process called dynamic programming which i'm sure some of you have seen already in your algorithms classes and some of you may not have but um since you have a homework problem on it um i i want to spend a little time describing it to you and that's useful for
solving this acfg Problem um which you may remember from the first half of the course involving testing uh if a grammar generates a string okay so remember this acfg problem you're given a grammar context free grammar and a string and i want to know is it in the language of the grammar okay so that's going to turn out to be solvable in polynomial time But only with a kind of a clever algorithm remember it's decidable we decided it by converting by making sure that you are converting the grammar into chance chomsky normal form then all
the derivations have a certain length you just try all the possible derivations of that length and you set except if any of those derivations generate w okay Uh you may remember that um from the first half of the course um that immediately gives an np type algorithm for this language because basically you non-deterministically instead of trying the most sequentially all of these derivations you try them in parallel non-deterministically so non-deterministically you pick some Derivation of that length and you accept it if it generates the input this fits this is classically fits our model of of
uh of np you can think of it as guessing the derivation and checking that works or in parallel writing down all possible Derivations um but this acfg problem is a classically uh an np problem uh probably an in in np um uh and if you just imagine that uh so and and then what's going to be this the certificate if you found an input that's in the language um that's generated by the grammar the certificate Is the derivation so if you look at it that way you might think well that's the best you can do
this is going to be an np pro problem in np and there's not going to be a way of avoiding searching for the derivation but that's not true there is a way of avoiding searching for the derivation you can kind of build up the derivation um using dynamic program and so that's what i wanted to kind of just describe for you um how that works also partly Because it's a homework problem and i think dynamic programming isn't is a very important algorithm so um before we um uh describe what dynamic programming is which is very
simple by the way um let's try to work up to it by making an attempt to solve this problem just using ordinary recursion Um so how would we solve the acfg problem so you're given a grammar you're given an input let's assume the grammar's in conjunct in chomsky normal form so that's going to be useful so it's a chomsky normal form grammar and we want to see how to test if you can generate w um And it's going to be recursive algorithm recursive algorithm is going to actually sound something slightly more general i'm going to
give you the grammar i'm going to give you the string and i'm also going to allow you to start at some other variable variable besides the start variable so i'm going to give you a some variable r and i know i want to know can i generate w starting at r Okay so that's my slightly more general problem which is going to be useful in the recursion okay so the input now to this this algorithm is the grammar the input and the starting variable and now how is the algorithm going to work it's going to
try to test can i get to you know is there some derivation pictured here as the parse tree for w starting at r That's what the algorithm is trying to answer can i get w from r the way it's going to do that is it's going to try to divide w into the two strings in all possible ways which sounds like it might be exponential but it isn't There's only a polynomial number of ways to divide the string into two substrings um just order n just depending where you make that cut so there's you know
some uh so that's not too bad there's a polynomial there's only n ways of making that division and also i'm going to try every possible rule that comes from r um that generates two uh um That generates two variables so these are what's allowed in chomsky normal form r goes to st okay so i'm gonna for each possible way of cutting w into x and x y and for each possible rule r goes to st i'm gonna see recursively can i get uh from s to x and from t to y so i'm going to
use my recursion now now that i have smaller uh Um strings instead of w um i can apply my recursion and try to answer it that way and this algorithm will work so if they if i found a way to cut w into xy and i found a rule r goes to st such that s generates x and t generates y then i'm good i know i can generate uh w from r okay and if there's no way of cutting w Up um to satisfy that or uh you know uh if i can't find any
way to do to divide w into x y and the rule r goes to x t which makes this work if all possible ways fail then you can't get from r to w okay um and then you can decide the original acfg problem now by starting from the start variable instead of just some any Old r you plug in the start variable far okay so if this algorithm works and it can be used to solve ac of g but the question is is a polynomial and it's not because every time you're doing the recursion you're
essentially adding another factor of n because here as we pointed out this is a Factor of n but that's happening every time you're doing a recursive level and you can imagine you know i'm just doing a very crude analysis here depending upon how you divide divide w up but roughly speaking it's going to get divided in half each time so there's going to be log n levels and so that means you're going to be multiplying n by itself log n times or get give you an n to the log n algorithm that's not polynomial because
Polynomials enter the constant for some fixed constant n to the log n is not going to be polynomial so this is not a polynomial algorithm so instead you're going to have to do something just a little bit more clever it's going to be the same basic idea but just relying on one little observation and the little observation is that when This non-polynomial implementation that i just described is actually pretty pretty dumb because it's doing a lot of recomputation of things that it's already solved why is that because if you look at the number of possible
different sub-problems here once i fix once i give you g and and i give you w um How many different sub-problems uh g s and t are there um well the number of sub the number of uh of strings here all of those strings are going to be substrings of w there's only roughly n squared substrings of w i'm always going to be generating in a sub problem here some substring of w from some starting variable um in the Grammar so because there aren't that many different substrings and not that many different starting variables the
number the total number of possible problems that this algorithm is going to be called on to solve is going to be in total a polynomial number of different sub problems there aren't very many of them it's only something like border n squared so if the algorithm is running for Exponentially long long time it's solving the same sub problem over and over again that's dumb why don't you just remember when you solve the sub problem so you don't solve it again doing that enhanced recursion where you remember the problems you've already solved it's called dynamic programming
i don't know why it has such a A confusing name like that actually it's called by several different things but anyway that's known as dynamic programming it's a recursion plus the memory um and here is just repeating myself um there are not very many different substrings um so every time you're going to be in your recursion somewhere you're going to be working with a substring so there's not that many different sub problems that You can possibly solve and just remember when you solve the sub problem and not solve it again so let me just show
you that algorithm again here with them with a little modification so first of all let me give you this is the same algorithm from the previous slide i'm just repeating here without all the other stuff so we can just look at it directly right dividing It into x and y for each possible rule and just and then recurse now i'm going to add a little step 0 beforehand um which says if i have g w and r let me just check first with i've already solved that one before so i have to keep track of
the ones i've already solved That's not too bad because there's only n squared order n squared possible different ones that i could be called on to solve so i'm just going to have a little table where i'm going to remember those and then every time i get a new one i get i get one to solve i'll check is it on that table and what's the answer so i won't have to rerun those so now the total i'm going to be basically pruning that tree Um so that it has only a polynomial number of leaves
and so the total size of that tree now is going to be polynomial um uh and so that's going to yield a polynomial running time this by the way i only learned this myself i'm sure you guys all know this who have taken this as in the algorithms of course has a special name called memoization not memorization sort of this came from the same root i think but Memoization which is somehow remembering the results of a computation so you don't have to repeat it um so the total number of calls is going to be most
n squared to this algorithm uh because you're never going to be uh redoing a work that you've done already so and you know and when you actually have to go through it the the running time um uh Um you know so it's it's the the total amount of time that you're going to need is going to be polynomial altogether um so let's here i don't remember what my check-in is on this um oh yeah so i mean this is somehow related and feel free to ask questions too while you're thinking about uh this um This
check-in but the check-in says here we've solved the acdfg problem in polynomial time does that tell us that every context-free language itself is also solvable in polynomial time so just mull that over and please give me an answer to it i hope you do better on this check and then you did on the last one um but anyway why don't you go ahead and think about that And i can take some questions in the meantime someone is asking here um actually i'm getting several questions on this why isn't it order n cubed or something greater
than order n squared because of the variables the variables are not don't don't depend on n you know when you're given um oh well Actually that's not true no you you're you are right um uh because the grammar is part of the input so that you might have as many as n different variables in the given grammar so you you are right there is potentially um you know the grammar might be half the size of the input And the input to the grammar w might be half the size of the input so i didn't think
about that but you're correct there are uh potentially uh different numbers of variables in different grammars so you have to add an extra factor which would be at most the size of the input because that's as many variables as you could possibly have so it really should be i think um order n cubed um To take that into account as well plus all of the work that needs to happen in terms of dividing things up you know on a one tape turing machine there's going to be some extra work uh just to carry out some
of these individual steps um because with a single tape things are sometimes a little awkward i think the total running time is going to end up being something like enter the fourth or into the fifth on a one tape turing machine But but that's a good point so somebody's saying how can we be storing n squared strings in finite time i'm not saying what why finite time we have polynomial time every every state stage of this algorithm is allowed to run for polynomially many steps as long as it's clearly polynomial we can just write that
down as a single stage um part two should say oh There's a typo here so use d thank you that's that is a typo um uh i'm afraid if i change it on my original slide here things will break in some horrible way let's just see that i completely wrecked my slide no that's good yeah thank you good point um uh Oops okay how's how's our check-in doing okay i think you're just about all done um oh that's still spent a lot of time on this and polling um as you may remember from the first
half of the course so the answer is is a indeed um remember that we showed acfg is in is decidable and therefore each context free language itself is decidable just Because you can then plug you can plug in the specific grammar into the acfg problem the very same reasoning works here um you can for if if you have a context-free language it has a grammar you can plug that grammar into the acfg problem and then that's polynomial time you'll get you're going to get a polynomial time algorithm for that uh for that language um so
good to review that um It's it's the same thing from same argument we used before um okay also i i i want to spend a lot of time on this there are there's another way of looking at dynamic programming we'll talk about this again um maybe in a lecture probably next lecture just because i know you have a homework problem on it and um if if you've seen dynamic Programming before is going to be easy if you haven't seen it before it's going to be i think probably a little challenging but another way of looking
at dynamic programming is the so-called bottom-up version of dynamic programming and what that would mean is you solve all of the sub-problems first before you solve all the smaller sub-problems before you solve the larger sub-problems um let me Just you know i i it's here on the slide i'm not sure i want to talk it through but basically um you solve this the sub problems here where um the strings are start with strings of length one and then from that you build up to sub problems where the substrings are of Length two and then three
and so on and each of those only relies on the smaller previously solved sub problems so you can kind of in a systematic way solve all the larger and larger subproblems uh you know for larger and larger substrings and that gives a kind of a different perspective on dynamic programming and Kind of for different problems sometimes it's better to think about either the sort of top-down recursion-based process or the sort of the bottom-up uh process that i'm describing here um they're really completely equivalent um but uh you know the version that's described for this particular
algorithm which appears in the textbook is actually the bottom-up algorithm um so you shouldn't be Confused if you see something there which looks somewhat different and that's often you see you basically solve all possible sub problems filling up basically filling out a table um let me not say it anymore say anything more about that here since we're running a little short on time um but um there are really two perspectives on dynamic programming Okay so moving on from there um let's let's shift gears uh leave context free languages and uh dynamic programming behind um and
start moving toward understanding um p and np and for that we will um introduce a new problem called the satisfiability problem and that's one we're going to spend a lot of time on So important to if you tuned out a little bit during the dynamic programming discussion time to get back on board so the satisfiability problem is going to be a computational problem um uh that we're going to be working on and it has to do with boolean formula so these are like expressions like Arithmetical formula like x plus y times z but instead of
using um numerical variables we're going to be using boolean variables that take on boolean values true false or sometimes represented by one and zero and the operators that we're going to be using are going to be the and or and negation operations And or not okay um i'm going to say such a very such a formula such a boolean formula we're going to call it satisfiable we'll do an example in a second if that formula value evaluates to true um if you assign make some assignment the values to its variables so just like arithmetic their
formula will have some value if you plug in Values for the uh boolean formula is going to have some value if you plug in boolean values for its variables and i want to know is there some way to plug in values which makes the whole thing evaluate the true the the formula itself is going to evaluate to some either true or false and i want it to evaluate to true um so uh here is our example that let's take The formula fee which is x or y and x complement you know or not x or
not y so the notation x with a bar over it x x complement is just x bar not x is just the way if you're familiar with the other notation Uh for the the not operation we just inverts ones and zeros um we we would write it we're going to write it with a bar instead of the negation symbol so i i'm assuming that you've all seen boolean algebra boolean arithmetic before you know where you know the and operation is only true if both inputs are true so these are going to be binary and operations
in binary or operations um or is going to be true if either Input is true and not as true if it's single input is false so it just flips the answer okay so here i want to know for this formula for this boolean formula here is it satisfiable is there some way to assign values to the variables to make this formula evaluate to true so for example let's just try things let's make x and y both true so x is true and y is true So x or y well that's good that's true but now
we have to do an and so we need both sides to be true so now now we have x complement well we said we said x is true so x complement is false y complement is false false or false is false so now we have a true and false that's going to be false we did not find a satisfying assignment but maybe there's another one and in fact there is if you make x true and y false Then both of these parts will be evaluate to true and then you'll have true and true so we
found a satisfying assignment to this formula it is in fact uh satisfiable so if you say x is 1 and y 0 yes this is satisfiable now the problem of testing for a boolean formula if it is satisfiable is going to be the sat language it's a set it's a collection of satisfiable boolean formula And testing whether you're in sat or not is going to be a an important computational problem and there was an amazing theorem which really got this whole subject going uh discovered independently by uh steve cook in north america and lane 11
in the in The former soviet union almost exactly at the same time which made a connection between this one problem and all of the problems in np so by solving this one satisfiability problem in polynomial time you you there it give it it allows you to solve all of the problems in np in polynomial time so if you could solve this problem sat In in in p then hamiltonian path is also solvable in p it's kind of you know step back and think about that it's kind of amazing um [Music] and the method that we're
going to introduce is called polynomial time reducibility and let's do a quick checking on this this should be an easy one um i want to just think about is sat the Sat problem that we just described here is that in np three seconds you all there okay ending polling um yeah so the a hopefully you're getting the intuition for np uh that these are the problems to be an np means that when you're a member of the Language there's a short proof or a short certificate of membership in this case the short certificate that the
formula is satisfiable is the assignment which makes it true also called the satisfying assignment so yes this is an np language language that's in np um uh so there are a lot of things that we Don't know in this subject but this isn't one of them we do know that sat is an np so let's talk about our method um for showing this uh remarkable fact that if you can solve pollen if you can solve if you can solve sat in polynomial time then all of np is solvable in polynomial time and it uses this
notion of polynomial time reducibility which is just like mapping reducibility that i hope you've all grown to know and love In the first half of the course but now the reduction has to operate in polynomial time so it's the same picture that we had before mapping a to b transforming a questions to b questions but now the transformation has to operate quickly and we get that if a is polynomial time reducible to b and b is polynomial time Then a is also polynomial time same pattern as before if a is reducible to b and b
is easy then a is easy um and here is kind of the essence of the idea or at least the outline in a sense the idea of of this cook and levin theorem that if satisfiability is in p then everything in np can be done in p which is because we will show that all problems in np are polynomial Time reducible to set that's the amazing fact so therefore if you can bring sat down into p by using this reduction it brings everything else along with it because everything is reducible to the set so we
just have to show how to do that and there is an analogy that we had in the first half of the course in one of our homework problems if you may remember We showed that atm has the very special property that all touring recognizable languages are mapping reducible to it i think that was problem 2 on 2a either in this problem set 3 or problem set 2. i think problem said three um That every every entering recognizable language is polynomial time reducible to atm and so very similar picture and there's the a lot of analogies
here that you can draw between between uh the computability section and the complexity section anyway with that i know we're just about out of time so let's quick review um Of what we've done um here and um i will stick around for questions for a while is there a okay that's a good question is there kind of a you know a um sort of a regular reduction analogy version to mapping reducibility you know we had that we had the general reduction uh for the computability section and we had the mapping reduction for the computability section
here We're only going to be focusing now on the mapping reduction so polynomial time reduction is by assumption going to be a mapping reduction yes there is a general polynomial time reduction notion as well if you you know this is not required but if you are curious about the general reduction and how to precisely formulate that it Actually appears in uh chapter six under turing reducibility um that's the general notion of reducibility spelled out in a formal way and there's polynomial time touring reducibility as well we're not going to talk about it this in this
course um other questions does np correspond exactly to verification in polynomial time if well we have to For me to answer that as a precise question we have to have a precise definition of verification but with the right definition the answer is yes um so you can define a verifier as a polynomial time algorithm um that gives a certificate given that takes a certificate and an input to the language and will um you know Accept if they're if that certificate is a valid certificate for that input um this is actually discussed in chapter i think
nine um of the text uh but um uh you um no i'm forgetting already what's where in the book but yeah you can think of it you can think of np in terms of Verification as a definition uh is proving people np the same as proving that a problem you know actually i can even make the if you want you can post public comments too as well should have done that in other cases if is proving p equal np the same as proving that a polynomial time non-deterministic turing machine n Has a polynomial time deterministic
yeah so it doesn't mean that suppose we prove that p equals np which is the minority view i would say uh quite the small minority view that that there are some people who believe that that that that is uh entirely possible and might even be the case but that's a very small small group um But yeah if you proved p equal np that's the same as saying that every non-deterministic polynomial time turing machine is going to have a companion deterministic polynomial time turing machine which does the same language that's exactly what it bye means you