Over a decade ago, before I got a PhD in linguistics, when I was an aspiring polyglot and the only YouTube language learning content was basically Alexander Arguelles marching around talking to himself or Laoshu accosting people in the mall, I remember hearing that Noam Chomsky, the political scientist, had a crazy theory about language: that all languages are basically the same on some deep grammatical level. PREPOSTEROUS! I thought.
Anyone who has studied a foreign language knows that they do things differently! In some languages adjectives precede the noun they modify, in others they go after! In some languages you move question words to the front of a sentence, in other languages you don’t!
What kind of monolingual fool would propose something so absurd? Of course, now, I realize that I was the fool. But Chomsky’s actual idea is still not widely understood outside of academia.
Or within it, if we’re being honest. Instead, a lot of us, my past self included, argue with a frankly stupid straw man. And there’s this other really weird twist about UG, for historical reasons.
So today I want to get into what deep structure, and universal grammar, and generative syntax are all about — without getting into the academic flame war over which specific theories are correct — and I also want to talk about one REALLY WEIRD fact about academic linguists’ approach to syntax. I’m Doctor Taylor Jones, and this is… is the man who sees himself happy? Uh… this is Language Jones.
[INTRO] Ok so obviously different languages do things differently. And contrary to what every aspiring polyglot with an internet connection will tell you on YouTube, Chomsky’s idea is not that all languages are basically the same and just get transformed into one another. THAT is tin-foil hat level crazy.
But there’s a reason Chomsky’s work revolutionized linguistics. And computer science. Which actually plays a huge role here.
So here’s the deal. In the last century BC — Before Chomsky — linguists wrote a lot of descriptive grammars, like this one. A descriptive grammar kind of tells you all about the different structures of a given language.
And language acquisition was sort of humanities, and psychology, and all over the place. A leading theory of childhood language acquisition was just that kids mimic what they hear and basically repeat it back. Chomsky rose to prominence with an EPIC takedown of B. F. Skinner’s work on behaviorism, where he basically points out that, to put it in layman’s terms, the math on that just ain’t mathin’. Native speakers produce sentence structures they’ve never heard before, and they aren’t exposed to enough language to be able to just learn all the structures through rote memorization and repetition.
Chomsky was at the forefront of the fledgling field of computer science, and he basically was like listen. Kids have all sorts of developmental stuff just built in. They can’t roll over, and then boom, one day they can.
They can’t crawl, and then boom, one day they can. As a father I’ve even seen this with language, where my daughter was babbling with 3 specific consonants, took a hard two hour nap, and unlocked like 3 more. It’s like she just downloaded nasals in a sleep cycle.
Anyway, he proposed that language acquisition is similar, in that there’s a mental module for language. He fucked up and called it a “language organ” in a metaphor that has been a problem for people ever since. It’s not an organ that’s got a physical location, any more than being able to recognize yourself in the mirror is.
So his whole idea is that YES, languages are different, but honestly, they could be way MORE different and they aren’t. No language forms questions by making you say all the sounds of a declarative sentence backwards, or having you pronounce every other word first and then the odd words last. But in principle, this is possible.
It’s computable. And yet it’s totally outside the realm of possibility in real languages. Instead, questions rely on (1) declarative sentences with different intonation, where no words are in different places, and (2) sentences that look like the declarative sentence, but with words moved, always in basically the same patterns.
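Just to make “computable but unattested” concrete, here’s a toy Python sketch of my own (the function names and simplifications are mine, nothing from the literature): both rules below are equally easy to compute, but only the second resembles anything a real language actually does.

```python
# A toy sketch (mine, not Chomsky's): both "rules" are trivially
# computable, but only the second looks like anything real languages do.

def question_by_reversal(declarative: str) -> str:
    """Unattested: form a question by saying the words backwards."""
    return " ".join(reversed(declarative.split())) + "?"

def question_by_inversion(declarative: str) -> str:
    """Attested (simplified): move the auxiliary to the front.
    Toy assumptions: one-word subject, auxiliary in second position."""
    subject, aux, *rest = declarative.split()
    return " ".join([aux.capitalize(), subject] + rest) + "?"

print(question_by_reversal("the boy can see the dog"))  # "dog the see can boy the?" -- no language does this
print(question_by_inversion("you can see the dog"))     # "Can you see the dog?"
```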
So it looks like there’s a system of rules in each individual language whereby we can actually produce infinite sentences, including sentences we’ve never heard before, if we know how the rules fit together. This should make intuitive sense. Think about the last time you heard a truly surprising sentence.
It’s not all repetition of the same learned material, like “please take a moment to like, share, subscribe, and ring the bell for notifications.” And different languages seem to follow the same kinds of patterns, but with slight variations. Like a different parameter setting.
STORY TIME: I actually independently re-derived this trying to study foreign languages more effectively. So, what had happened was — I thought, let me get some basic words, like Gabriel Wyner’s 625 important words list. But it would be great if I could study them in sentences.
So why don’t I just dump them all into a dictionary in Python, and then have a program just make sentences for me to translate? But in order to do that, there are some considerations. I’d have to basically make a class for verbs, and instantiate different verbs with different settings.
Like, “give” would need up to three nouns to go with it (I gave my wife a book). But “saw” would have two (I saw the man). And then I could do things like add a relative clause (I saw a man who looked like a goat).
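Something like this minimal sketch, which is a hypothetical reconstruction of what I was imagining rather than any real tool; the class, names, and toy lexicon are all mine:

```python
import random

# Hypothetical sketch of the program I had in mind. Each verb knows its
# valence (how many noun phrases it needs), and we slot random nouns in
# to get practice sentences. Case and agreement are ignored, so expect
# some clunkers.

class Verb:
    def __init__(self, form: str, valence: int):
        self.form = form        # e.g. "gave"
        self.valence = valence  # "saw" takes 2 NPs, "gave" takes 3

NOUNS = ["my wife", "the man", "a book", "a goat"]
VERBS = [Verb("saw", 2), Verb("gave", 3)]

def make_sentence(verb: Verb) -> str:
    # pick one noun per argument slot: subject first, then objects
    args = random.sample(NOUNS, verb.valence)
    return " ".join([args[0], verb.form] + args[1:]) + "."

for verb in VERBS:
    print(make_sentence(verb))  # e.g. "my wife gave the man a book."
```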
And as I started thinking through how I would construct a Python class for verbs, and nouns, and so on, I thought: I’ve read something where somebody already did this. And then it hit me. It was ASPECTS OF THE THEORY OF SYNTAX, which Chomsky wrote in 1965, more than 25 years before Python was created. Check it out. He uses the example sentence “sincerity may frighten the boy.”
It’s worth quoting his statement of the problem. He says concerning this sentence, a traditional grammar might provide information of the following sort:

(i) the string (1) is a Sentence (S); frighten the boy is a Verb Phrase (VP) consisting of the Verb (V) frighten and the Noun Phrase (NP) the boy; sincerity is also an NP; the NP the boy consists of the Determiner (Det) the, followed by a Noun (N); the NP sincerity consists of just an N; the is, furthermore, an Article (Art); may is a Verbal Auxiliary (Aux) and, furthermore, a Modal (M).

(ii) the NP sincerity functions as the Subject of the sentence (1), whereas the VP frighten the boy functions as the Predicate of this sentence; the NP the boy functions as the Object of the VP, and the V frighten as its Main Verb; the grammatical relation Subject-Verb holds of the pair (sincerity, frighten), and the grammatical relation Verb-Object holds of the pair (frighten, the boy).

(iii) the N boy is a Count Noun (as distinct from the Mass Noun butter and the Abstract Noun sincerity) and a Common Noun (as distinct from the Proper Noun John and the Pronoun it); it is, furthermore, an Animate Noun (as distinct from book) and a Human Noun (as distinct from bee); frighten is a Transitive Verb (as distinct from occur), and one that does not freely permit Object deletion (as distinct from read, eat); it takes Progressive Aspect freely (as distinct from know, own); it allows Abstract Subjects (as distinct from eat, admire) and Human Objects (as distinct from read, wear).

Then he just straight up says “The main topic I should like to consider is how information of this sort can be formally presented in a structural description, and how such structural descriptions can be generated by a system of explicit rules.” So literally the whole idea is how you get sentences that “work” as sentences from understanding a set of procedures — and I use that word intentionally, because we’re doing computer science — and how a finite set of rules combined with a finite lexicon can GENERATE sentences.
His goal was to come up with a set of rules — a grammar — that can generate all the sentences of a language, and that won’t overgenerate weird stuff. And the real question is: can you do this with some parameters you can tweak, and generate the structures of all, and only, the sentences of ANY human language? It turns out this is a really fruitful avenue of investigation.
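To make that concrete, here’s a hedged little sketch: Chomsky’s phrase-structure rules and mini-lexicon for the “sincerity may frighten the boy” example, written as a context-free grammar in Python. The encoding is mine; the rules are his.

```python
import random

# A sketch, not a serious implementation: the phrase-structure rules and
# mini-lexicon from the Aspects example, written as a context-free grammar.

GRAMMAR = {
    "S":   [["NP", "Aux", "VP"]],
    "VP":  [["V", "NP"]],
    "NP":  [["Det", "N"], ["N"]],
    "Det": [["the"]],
    "Aux": [["may"]],
    "V":   [["frighten"]],
    "N":   [["sincerity"], ["boy"]],
}

def generate(symbol: str) -> list:
    """Rewrite a symbol until only words remain: this is what 'generate' means."""
    if symbol not in GRAMMAR:  # terminal symbol, i.e. an actual word
        return [symbol]
    expansion = random.choice(GRAMMAR[symbol])
    return [word for part in expansion for word in generate(part)]

print(" ".join(generate("S")))  # e.g. "sincerity may frighten the boy"
# Note it also happily produces "the sincerity may frighten boy" --
# overgeneration, which is exactly what the rest of Aspects worries about.
```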
Not only can you pretty obviously take care of a ton of observed phenomena with a handful of rules — like, you either can or can’t just drop pronouns, and you either do or don’t move wh-operators like who, what, and where from the place they’d normally sit in a declarative sentence with a regular noun there (I saw him vs. I saw who vs. who did I see). Not only can you do that, but where the problems get thornier, we end up making discoveries that explain a lot of cross-linguistic phenomena. I won’t go into the weeds here, but unergatives and unaccusatives are a relatively new “discovery,” and they explain things like why most verbs in French form their compound tenses with the auxiliary “have” while a specific handful use “be” instead.
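Here’s a toy illustration of the “parameter” idea, entirely my own and wildly oversimplified: one sentence recipe, two switches you can flip.

```python
# A toy "parameters" sketch (my own illustration, not real parameter
# theory): the same sentence recipe, with two switches flipped per language.

def say(subject: str, verb: str, obj: str, *, pro_drop: bool, wh_fronting: bool) -> str:
    words = [subject, verb, obj]
    if pro_drop:
        words.remove(subject)   # Spanish-style: the pronoun can go unsaid
    if wh_fronting and obj == "who":
        words.remove("who")     # English-style: the wh-word moves...
        words.insert(0, "who")  # ...to the front of the sentence
    return " ".join(words) + ("?" if "who" in words else ".")

print(say("I", "saw", "him", pro_drop=False, wh_fronting=True))   # "I saw him."
print(say("yo", "vi", "eso", pro_drop=True, wh_fronting=True))    # "vi eso."
print(say("I", "saw", "who", pro_drop=False, wh_fronting=True))   # "who I saw?" (do-support omitted)
print(say("I", "saw", "who", pro_drop=False, wh_fronting=False))  # "I saw who?" (wh-in-situ, like Mandarin)
```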
And like, a main takeaway from Chomsky’s work is hierarchical structure. That when we combine things, they take on the flavor of ONE of the things. Here’s an example.
I can refer to “a shoe.” That’s a noun. And if I add another noun, the whole phrase stays a noun: a clown shoe. But when I add an adjective, what then? Well, the WHOLE PHRASE stays whatever the “head” of the phrase is: in this case, a big red clown shoe. And you can do noun-y stuff to the whole phrase, like replace it with a pronoun (“I refused to wear IT”), or even have it license an anaphor, like “it crumpled in on itself.” You can also put the same category of thing INSIDE itself, so it’s recursive. But that’s a can of worms for another video.
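If you like code better than trees, here’s a sketch of headedness (the encoding is mine, and real syntacticians would want much more structure): the whole phrase just inherits its category from its head.

```python
# A sketch of headedness: when words combine, the whole phrase keeps the
# category of its HEAD, which is why "big red clown shoe" is still
# nominal and can be replaced by "it".

class Phrase:
    def __init__(self, head: tuple, modifiers=()):
        self.head = head                  # e.g. ("shoe", "N")
        self.modifiers = list(modifiers)  # e.g. [("big", "A"), ("red", "A")]

    @property
    def category(self) -> str:
        return self.head[1]  # the label of the whole phrase comes from the head

    def __str__(self) -> str:
        return " ".join(word for word, _ in self.modifiers + [self.head])

shoe = Phrase(("shoe", "N"), [("big", "A"), ("red", "A"), ("clown", "N")])
print(shoe, "is still an", shoe.category)  # big red clown shoe is still an N
```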
This is not going to be a full syntax class. Lord knows that’s not my specialty anyway. And if you want to learn more, I’ll probably include this in the “Intro to linguistics for regular people” class I’m putting together for YouTube — leave me a comment if you’re interested in that.
But I think this stuff is really important because it gets misrepresented, and frankly, it’s really useful for language learners, and just interesting in its own right. PLUS it helps clarify WHY linguists say that “African American English,” or any other nonstandard dialect, is actually “rule governed” and valid and legitimate. The rules aren’t “don’t end a sentence with a preposition” — that’s a stylistic preference that rules out the most common way to answer a phone call here in Harlem: “where you at?”
But there ARE rules, which is why nobody asks AT WHERE YOU. Now, my coding excursion brings me to the weirdest thing about linguistics, in my mind. That is, this is basically a computer science problem.
But linguists, ESPECIALLY syntax people, are practically computer phobic. They’re allergic to coding. So they do all of this with pencil and paper, and like, their thoughts and shit.
It feels like you could take a hypothesis about linguistic structure (say, that noun phrases should really be thought of as subordinate to determiner phrases, or that verb phrases need two verbal elements to capture all the cross-linguistic generalizations — what people are calling big V and little v), and you could implement it in Python in an afternoon, run a bunch of tests, and then debug and continue from there. It is completely mind-boggling to me that this is, in fact, NOT the standard practice. Especially since I couldn’t make heads or tails of minimalist syntax UNTIL I learned computer programming.
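Concretely, here’s the kind of afternoon experiment I mean. The rule sets and “judgments” below are toy stand-ins I made up, not a serious determiner-phrase analysis; the point is just the workflow: enumerate everything each hypothesis generates, then test it against what speakers actually accept.

```python
from itertools import product

# Toy experiment: two competing rule sets (stand-ins for NP- vs.
# DP-style analyses), one bounded enumerator, and a tiny test suite.

NP_RULES = {"NP": [["Det", "N"], ["N"]],
            "Det": [["the"]], "N": [["boy"], ["sincerity"]]}
DP_RULES = {"DP": [["Det", "NP"], ["NP"]], "NP": [["N"]],
            "Det": [["the"]], "N": [["boy"], ["sincerity"]]}

def language(grammar: dict, start: str, depth: int = 5) -> set:
    """Every string derivable from `start` in at most `depth` rewrites."""
    if start not in grammar:  # terminal: an actual word
        return {start}
    if depth == 0:
        return set()
    strings = set()
    for expansion in grammar[start]:
        choices = [language(grammar, symbol, depth - 1) for symbol in expansion]
        strings |= {" ".join(words) for words in product(*choices)}
    return strings

# Both hypotheses should generate the good strings and none of the bad ones:
for rules, start in [(NP_RULES, "NP"), (DP_RULES, "DP")]:
    generated = language(rules, start)
    assert "the boy" in generated and "boy the" not in generated
print("both hypotheses pass this (tiny) test suite")
```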
And when I went back to read Chomsky’s early work, he was explicitly doing computer science. To be fair, there are some who DO actually do that, and it seems to be the approach to generative syntax that makes the most sense to me… things like Probabilistic Context-Free Grammars, and my favorite, Stochastic Tree Parsing Grammars. But the dominant strand of syntactic inquiry [IMAGE: SI] is still pencil, paper, and thinking hard about how to add another epicycle — uh, binary branching projection — into the tree.
Lastly, Chomsky and his followers seem to be terrible at branding. I have colleagues who WORK IN A GENERATIVE PARADIGM, explicitly assuming that the brain can use finite rules to generate infinite linguistic structures constrained by those rules, who would say they are NOT “generativists,” or who would even call themselves “anti-Chomsky.” But in reality, they’re nitpicking one or another minor detail, or misunderstanding a word choice.
Here’s one: practically every sociolinguist I know can be lured into a frothing-at-the-mouth soapbox rant about Chomsky writing about an “ideal” speaker-hearer. But he doesn’t mean it like it’s what a language speaker SHOULD be like. In context, it’s clear he means “ideal” to mean “abstract”: an idealization, an “idea” of a speaker-hearer. The amount of butthurt whining from people with literal PhDs in linguistics because they legitimately misread the meaning of a word is just absolutely shocking. Or at least, it was when I first got into academia. It’s sort of sadly predictable now.
Anyway, next time you see a YouTube language enthusiast trash talk Universal Grammar, it’s worth asking if they’re really just arguing with a straw man. Different people disagree with various details, and some of us, myself included, prefer to focus our time on other linguistic questions, but UG is NOT the claim that all languages are the same — it’s a research program designed to get at whether all people have the same linguistic capacity. Which seems to be the case.
If you like what I’m doing with the channel, please like, subscribe, and above all, comment. I have a Patreon; if you’d like to become a patron, you can do so at www.patreon.com/languagejones. You can also support the channel directly with Super Thanks, and Super Chats on my livestreams.
And if you liked this video, YouTube thinks you’ll probably like this one. Until next time, happy learning.