Tuesday 23 August 2016

Philosophy, Psychology and Monkey Cognition

Hey, there’s this thing called the European Society for Philosophy and Psychology, which has a conference every year. The meeting this year was at the University of St Andrews, August 10-13, 2016 (see http://espp16.wp.st-andrews.ac.uk for the conference website).

In fact, I happen to be the Linguistics Program chair for this society/conference and so, for my sins, I have to read abstracts, think of keynote speakers and actually attend the conference every year. Every year, I wrench myself away from my linguistics-internal concerns and duties wondering whether I really have time for this. And every year I come away from the conference thinking “OMG, I am so glad I made time for this”. We linguists should make time to engage with our colleagues over in philosophy and psychology who are thinking about the very same issues but in radically different and yes, sometimes incompatible ways. We can make all the noises we like about building bridges with philosophy and psychology in the abstract, but unless we talk to them and go to their conferences those bridges won’t actually get built, and misunderstandings will proliferate. In particular, clicking on and reading the occasional hyped psychology article that catches your eye does not prepare you for the whole culture of concerns and assumptions and indeed heterogeneity of approach that you find when you are actually at one of these meetings.

This year, there was a common thread running through the conference on animal cognition, and primate cognition in particular, in part due to the invitation of Philippe Schlenker, CNRS-Institut Jean Nicod (http://www.institutnicod.org/membres/membres-permanents/schlenker-philippe/?lang=en ), as one of the keynote speakers, and in part due to the local expertise of our hosts at St Andrews in the area of primate cognition. (Check out the Centre for Social Learning and Cognitive Evolution here: https://risweb.st-andrews.ac.uk/portal/en/organisations/centre-for-social-learning--cognitive-evolution(aca65ea5-18be-4425-8477-a13cdbd890c9).html )

As we all probably know, the field of primate cognition today is largely insensitive to the overly simplistic historical dichotomies of `innate’ vs. `learned’ that inspired linguists’ early attempts to teach chimpanzees human language (and which still fill chapters in beginner textbooks on psychology of language). A linguist looking to sign up for the cheering gallery on one side or other of that debate will be cruelly disappointed. So it’s all the more important for the responsible linguist to get up to speed on the latest knowledge that has emerged from research in this area over the past couple of decades.

First of all, the research makes a quite convincing case that the great apes communicate intentionally, do social learning, use tools, have problem-solving abilities and even show some hallmarks of theory of mind. The whole issue of intention to communicate, as tested and proved by a number of researchers, requires at least some sort of recognition that others possess minds, and that the state of their knowledge can be affected by one’s own communicative actions (I am thinking here of papers I heard by Christine Sievers (https://philsem.unibas.ch/seminar/personen/sievers) and Thibaud Gruber (https://www2.unine.ch/compcog/thibaud_gruber), and also Katie Slocombe’s (https://www.york.ac.uk/psychology/staff/faculty/ks553/) contribution in the invited symposium). So the interesting question is whether there is some sort of basic ground-floor `theory of mind’ such that apes and very young children can have that, but not the full-blown version that would allow them to pass the standard false belief test. If there is an intermediate version, is it distinguished from the full version because:

- It ascribes knowledge to other minds rather than belief (as Jennifer Nagel was arguing http://individual.utoronto.ca/jnagel/Home_Page.html )?
- It is expressive rather than genuinely perspective shifting (as Dorit Bar-On suggested http://www.doritbar-on.com )?
- It is one-step rather than recursively specular in the way it takes other minds into account?

Are any of these distinctions themselves correlated with any of the others, or indeed with the special linguistic capacities for syntax?

In Schlenker’s keynote address that kicked off the conference, he described in detail the sign system employed by a number of groups of monkeys and attempted to describe the truth conditions for each individual sign in a systematic way. Schlenker was scrupulous in staying away from questions of whether we should call such systems `language’ or not, preferring rather to be specific about how it seems that this particular system is working. The interesting aspect of his proposal for the meanings deployed in the system is that they seem to involve a kind of blocking, where the informationally more specific sign blocks the use of the more general one. If this kind of informational pragmatic choice guides the deployment of signs in monkey communities, it seems to point to at least a limited theory of mind. Schlenker himself, in ascribing pragmatic competence to monkey groups, stopped short of claiming that they needed to possess full specular theory of mind of the kind that is assumed to lie behind our competences in standard forms of Gricean reasoning. But if this limited kind of pragmatic effect can be seen even in these simple systems, with rudimentary acknowledgement of audience, then it is important both for studies of the evolution of theory of mind, and for the architecture of the language faculty. (If you are interested in the details, see the recent issue of Theoretical Linguistics devoted to Formal Monkey Semantics for a target article written by Schlenker and his research group, with commentaries by various linguists and cognitive scientists: http://www.degruyter.com/view/j/thli.2016.42.issue-1-2/issue-files/thli.2016.42.issue-1-2.xml )
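
For readers who like to see a mechanism spelled out, here is a minimal sketch of what `blocking’ amounts to computationally. It is only my own toy illustration, not Schlenker’s formal system: the sign names, the situation features and the numeric specificity ranking are all hypothetical. Each sign has a truth condition, and among the signs that are true in a situation only the most informative ones are usable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A situation is a bundle of (hypothetical) features of the scene.
Situation = Dict[str, bool]


@dataclass(frozen=True)
class Sign:
    name: str                              # hypothetical label, not a real call
    applies: Callable[[Situation], bool]   # truth condition of the sign
    specificity: int                       # assumed ranking: higher = more informative


def usable_signs(signs: List[Sign], situation: Situation) -> List[Sign]:
    """Return the signs that are true in the situation and not blocked.

    A sign is blocked when a strictly more specific sign is also true.
    """
    true_signs = [s for s in signs if s.applies(situation)]
    if not true_signs:
        return []
    top = max(s.specificity for s in true_signs)
    return [s for s in true_signs if s.specificity == top]


# Toy inventory: a general alert and a more specific aerial alert.
SIGNS = [
    Sign("general-alert", lambda s: s.get("danger", False), 1),
    Sign("aerial-alert",
         lambda s: s.get("danger", False) and s.get("from_above", False), 2),
]

# Both signs are true here, but the specific one blocks the general one.
print([s.name for s in usable_signs(SIGNS, {"danger": True, "from_above": True})])
# Only the general sign is true here, so nothing blocks it.
print([s.name for s in usable_signs(SIGNS, {"danger": True})])
```

Note that nothing in this selection rule represents the hearer’s beliefs: it only compares the signs that are true of the scene, which is one way of seeing why blocking by itself need not require a full specular theory of mind.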

What does it mean that monkeys seem to have systems that are best described via informational blocking (maxim of quantity)? Full Gricean reasoning probably involves infinite specular regress, but certain facts with respect to informational computation do not seem to require that degree of sophistication. It can be shown that great apes exist in social groups and utilize their gestures and vocalizations intentionally with an aim to express or convey information. But surely one can intend to warn others without having full-blown theory of mind? Some acts of expression can be purely reflexive in the sense of not being under conscious control, but it can be shown that at least some ape gestures/vocalizations are not of this type. And pragmatic effects can only arise in the context of an audience. So there must be different levels of pragmatics corresponding to different levels of sophistication with respect to how we represent other minds.

Andrew Whiten from St Andrews gave the final keynote of the conference and he talked about the work that his group has been doing on cultural transmission within primate communities. With respect to being social animals with socially transmitted traditions, chimps once again seem to have some restricted version of what we see in humans. In fact, Whiten would argue that the differences we find here are slight indeed. Chimps use tools and pass on use of those tools, as well as certain non-necessary ways of doing things, to the group. Groups of chimps also seem to have a strong instinct for social conformity within the group. Chimps are good at imitating and are rational problem solvers/learners. Contrary to the claims of the Tomasello group, the Whiten group in St Andrews has been successful in showing that chimps do have imitative and social instincts: they are not just emulators. (Tomasello’s group had a hypothesis to the effect that chimps merely tried to emulate the `goals’ of actions they perceive others doing where that goal is attractive to them, but are sloppy about the detail when copying the means by which the emulated being is bringing about those goals. So for the Tomasello group apes were not genuine imitators in our sense.)

In fact, it can be shown that these cousins of ours, the great apes, are actually qualitatively better imitators than monkeys. (Ironically then, there is actually no generalized `monkey see, monkey do’, but a restricted version of that is found specifically within the great apes.) Chimps can be taught a version of the Simon Says game (for a reward) very quickly for example, but not capuchin monkeys, who are otherwise pretty smart. But the Tomasello group is partly right too. In experiments where a certain goal is achieved by a sequence of actions, a chimp will copy the actions to achieve the goal. But if it becomes manifest to the chimp that some of the steps are not practically necessary to the observed outcome, the chimp will miss out these steps. Young children, however, will systematically continue to repeat the useless elaborative steps, even after it has been made manifest that they do not contribute to the outcome. This is the phenomenon that is now known as `over-imitation’. It was initially discovered by the Whiten group, and has been robustly replicated.

But maybe children are less rational at this age than the chimps, or less able to calculate physical outcomes, so they are just playing it safe? Or maybe both children and human adults in an experimental situation see the task as a kind of game, which leads them to over-imitate? In an interesting new extension of the paradigm, the Whiten group ran a similar experiment with adults in a non-experimental situation. They set up the same primate-tested tasks as part of a hands-on installation at Edinburgh Zoo, inviting adults passing through the exhibit to `have a go’ at the tasks that had been tested on primates. The adults could watch the training video (the different conditions were cycled) and then attempt the same task on the actual equipment. Importantly, the human adults did not know they were even being observed, although they were in fact being filmed. Once they had done the task, an experimenter came up to them and explained, `candid camera’ style, asking if they were willing to sign a consent form for their data to be used (they usually were). The amazing upshot of the study was that the fully rational adults, unaware of being observed, also did the full-detail imitation even when they could judge that those extra bells and whistles had no effect on the outcome! Recall that the chimps in the same situation left out the extra bits and went straight for the prize.

Now who would have thought that there would be so much interesting difference in the realm of imitation?  Imitation, we are told in LING 101  is the thing that language acquisition is not  about.  For good reason, since imitation alone is totally inadequate to the task. In particular, overexuberant imitation of everything would be a hopelessly huge task and would actually inhibit pattern discovery (in the jargon, compulsive imitation is not the same as over-imitation).   There needs to be selective attendance to certain aspects of what the young human is exposed to. But that is not all, I would argue.  Over and above that, the evidence seems to be that humans in certain domains attend and imitate in an overly fine-grained way, in a way that does not need to be justified by immediate practical goals.

Another difference in the imitative capacity between us and chimps is called `ratcheting’, and I think it actually might be related to the first. Children can easily be taught to build one learned behaviour on top of another one. If you try to do this with chimps, they get stuck at the first stage. Now, if you try to teach them the same sequence of actions but to a single final goal, they are capable of that, so it’s not the memory or extended nature of the task that is hard. Things go wrong if you teach a chimp a behaviour that achieves a certain goal, and then, a while after they have learned that behaviour, try to teach them to suspend goal number one and use the first behaviour as a stepping stone, adding a new behaviour to achieve an even bigger payoff. Chimps should in principle be rational enough to see the advantage of this but in fact, according to Whiten, they get stuck. No ratcheting (this is apparently the technical term but I might be spelling it wrong). This for the Whiten team is the reason that chimp culture does not undergo cumulative advance, unlike our own. But the discussion of the no-ratcheting discovery started me thinking about how this could also be related to the acquisition of language.

Rational, goal-oriented emulation of the kind chimps generally do ends up as low-fidelity copying: you concentrate on the outcome and try to reproduce that. But arbitrary and causally more opaque tasks (or tasks with non-transparent or deferred payoffs) are hard to acquire if you are a goal-oriented learner.

But what if you just take pleasure from your success at high-fidelity imitation with no need of reward? What if we humans have an instinct for learning just for the fun of it (within certain targeted domains of course, that we are predisposed to pay attention to)? Learning for the sake of it, non-goal-oriented learning, is actually something that distinguishes us from our great ape cousins. That would explain both the ratcheting and the over-imitation. Learning is not a rational strategy for us the way it is for chimps. We get rewards just from the success of learning itself. Arguably both ratcheting and over-imitation are important for learning language. Over-imitation because very fine-grained motor imitations are necessary to start producing the differentiation in the produced code that language requires, even in the absence of immediate rewards. Ratcheting because we need to be able to build up our skills cumulatively and build hierarchical complexity in the symbolic system. Recursion at the symbolic level requires that one can use one result as a stepping stone to the next. The very thing that the chimps get stuck on. (Now there may be evidence that chimps use recursive reasoning in problem solving tasks, or in their vision systems like us, or whatever, but what we are talking about here is symbolic manipulation, which is crucially mediated by learning.) So one hypothesis might be that it is narrowly goal-oriented learning vs. an instinctive joy of imitation that is one of the crucial ingredients in what makes us special. (This latter aspect, on the other hand, we seem to share with some birds.)

Back to Tomasello. Tomasello’s big idea about what makes humans unique is the complexity and richness of our social structures (http://www.eva.mpg.de/psycho/staff/tomas/pdf/Tomasello_EJSP_2014.pdf ).
However, my own hunch from the research of the Whiten group and others  is that what we are seeing in chimps with respect to social structures is a difference in degree not in kind, and certainly not drastic enough to underpin the huge cognitive leap to language.  If so, then we need to look elsewhere for the crucial cognitive ingredient in my opinion. One could speculate about a kind of genetic switch that suddenly allowed recursion. But what would that be? Suppose the key is in the non-goal orientedness of the learning mechanism, which creates both high fidelity and ratcheting?

Thinking about it this way makes a nonsense of the old dichotomies of nature vs. nurture, by the way: the thing that is innate and distinctive is a way of learning. (The more we know about nature and nurture from the geneticists anyway, the more those two things get blurred.)

At any rate, that is just a flavour of the ideas floating around the conference and my own thoughts on hearing them. To be clear, none of the speculations and ramblings opined in this piece would be endorsed by those real psychologists and philosophers out there, but they certainly provoked me to think them. I can only hope my short description inspires other linguists to attend this kind of meeting in future. The next ESPP takes place at the University of Hertfordshire. Watch this space!


Monday 14 September 2015

Allosemy---- No thanks.

On Allosemy


It seems like I am always complaining about the status of semantics in the theory of grammar. I complain when it’s ignored, and then I complain when it’s done in a way I don’t like. I complain and complain. Today is not going to be any different.

At the ROOTS IV conference, we had a number of lexical semantics talks, which clearly engaged with meaning and generalizations about root meaning. Then we had the morphology talks. But I’m not convinced those two groups of people were actually talking to each other. Now, the thing about Distributed Morphology is that it doesn’t believe in a generative lexicon, so all of the meaning generalizations that are in the lexicon for the lexical semanticists have to be recouped (if at all) in the functional structure, for DM and its fellow travellers, me included. This is not a deep problem if we are focusing on the job of figuring out what the meaning generalizations actually are in the first place, which seems independent of arguing about the architecture. But there is also a danger that the generalizations that the lexical semanticists are concerned about are perceived as orthogonal to the system of sentence construction that morphosyntacticians are looking at. Within DM, the separation of the system into ROOT and functional structure already creates a sharp division whereby meaty conceptual content and grammatically relevant meanings are separated derivationally. This in turn can lead to a tendency to ignore lexical conceptual semantics if you are interested in functional morphemes, and to suspect that the generalizations of the lexical semanticists are simply not relevant to your life (i.e. that they are not part of the `generative system’). To the extent that there are generalizations and patterns that need to be accounted for, we need to look to the system of functional heads proposed to sit above the verbal root in the little vP. But more challengingly, we need to relate them via selectional frames to the sorts of ROOTS they combine with, in a non ad hoc manner. If, in addition, we require a constrained theory of polysemy, the problem becomes even more complex. I think we are nowhere close to being able to solve these problems.
Perhaps because of this, I think that standard morphological  and syntactic theories currently do not yet engage properly with the patterns in verb meaning, by which I mean both constraints on possible meanings, and the existence of constrained polysemies.  I contend that the architecture that strictly separates the conceptual content of the root from the functional structure in a derivational system must resort to crude templatic descriptive stipulations with which to handle selection.  This architecture also obscures the generalizations surrounding polysemy.  

One of the interesting talks at the conference, and one of the few that attempted to integrate worries about meaning into a system with DM-like assumptions, was the contribution by Neil Myler. Neil was interested in tackling the fact that the verb have in English is found in a wide variety of different constructions, and in giving a unified explanation of that basic phenomenon. To that extent, I thought Neil’s contribution was excellent, and I agreed with the motivation, but I found myself uncomfortable with some of the particular tools he used to put his story for have together. The issue in question involves the deployment of Allosemy.

Let me first complain about the word Allosemy. It’s pronounced aLOSSemi, right? That’s how we are supposed to pronounce it. Of course, doing so basically destroys all recognition of the morphemes that go into making it, and renders the word itself semantically opaque even though it is perfectly compositional. I hate it when stress shift does that.
Curiously, the problem with the pronunciation is similar to the problem I have with its existence in the theory, namely that it actually obscures the semantics of what is going on, if we are not careful with it.

Let’s have a look at how Allosemy is deployed in a series of recent works by Jim Wood, Alec Marantz and Neil Myler. (We could maybe call them The NYU Constructivists for short.) I am supposed to be a fellow traveller with this work, but then why do I feel like I want to reject most of what they are saying? Consider the recent paper by Jim Wood and Alec Marantz, which you can read here.

So to summarize briefly, the idea seems to be that instead of endowing functional heads with a semantics that has to remain constant across all  its instantiations, we give a particular functional head like little v  N possible semantic meanings, and then say that it is allosemic.   In other words it is N-ways ambiguous depending on the context.    This allows syntax to be pure and autonomous.  As a side effect this means that meaning can be potentially built up in  different ways, and the same structure can have different meanings. The cost?   

COST 1: In addition to  all the other listed frames for selection and allomorphy, we now have to list for every item a subcategorization frame that determines the allosemic variants of the functional items in the context of insertion. (Well, if you like construction grammar……)

COST 2:  Since the mapping between syntactic structure and meaning can no longer be relied upon, there is no chance of semantic and syntactic bootstrapping for the poor infant trying to learn their language.  I personally do not see how acquisition gets off the ground without bootstrapping of this kind.

COST 3: (This is the killer). Generalizations about hierarchy and meaning correspondences, like the (I think exceptionless) one that syntactic embedding never inverts causational structure, are completely mysterious and cannot fall out naturally from such a system (see this paper of mine for discussion).

PAYOFF:  Syntax gets to be autonomous again.
But wait. We want this exactly why? Because Chomsky showed us the generative semanticists were wrong back in the sixties?

And anyway,  isn’t syntax supposed to be quite small and minimal now, with a lot of the richness and structure coming from the constraints at the interface with other aspects of cognition? Doesn’t this lead us to expect that abstract syntactic structures are interpreted in universally reliable ways?

Allosemy says that the only generalities are syntactic ones. Like `I have an EPP feature’ or `I introduce an argument’. It denies that there are any generalities at the level of abstract semantics. I would argue rather that the challenge is to give these heads a general enough and underspecified semantics so that the normal compositional interaction with the rest of the structure these things compose with will give rise to the different polysemies seen on the surface. Allosemy is not the same as compositionally potent underspecification. The strategy of the Wood and Marantz paper is to go for a brute force semantic ambiguity which is controlled by listing selectional combinations. It is perfectly clear that this architecture can describe anything it wants to. And while one might be able to do it in a careful and sensible way so as to pave the way for explanation later on, it is also perfectly clear that this particular analytic tool allows you to describe loads of things that don’t actually exist! So, isn’t this going backwards, retreating from explanatory adequacy?


Of course, the rhetoric of the Wood and Marantz paper sounds lovely and high-minded. The head that introduces arguments (i*) is abstract and underspecified. The kind of thing a syntactician can love. (There is also another version of i* which is modulated by the fact that a ROOT is adjoined to it, and this version is the one that introduces adjuncts and is influenced by the semantics of the ROOT that adjoins to it.) However, core i* is nothing new; in fact it is a blast from the past (not in a bad way, in fact). It is just a notational variant of the original classical idea of specifier, where it was the locus for the subject of predication (as in the classic and insightful paper by Tim Stowell from 1982, Subjects across Categories here). And the i* with stuff adjoined to it is what happens when you have an argument introduced by a preposition. So i* is only needed now because we got rid of specifiers and the generality of what it means to be a specifier.

So. Allosemy. Can we just not do this?  


Wednesday 9 September 2015

THOUGHTS ON ROOTS IV, NYU

THOUGHTS AFTER ROOTS IV, NYU

It’s been a while since New York, but I was whisked away for vacation time immediately afterwards, from which I am only slowly recovering. Many of you will already know that I am also on sabbatical this term, hanging out in Edinburgh, loosely affiliated with the University, but trying to lie low. This has in turn made August a month of moving and organizational hecticness. But productivity is slowly picking up.

ROOTS IV took place in New York, June 29th-July 2nd, the 4th meeting of its kind, organized brilliantly by Itamar Kastner, Alec Marantz and the department at NYU and co-sponsored by NYU Abu Dhabi. Check out the website for the panel discussion here, including a YouTube video of all the panel presentations, including yours truly here.


Avid blog followers will recall that I expressed my fears in advance of this meeting that I might end up at the wrong party, i.e. that the workshop would largely be some kind of theory-internal Distributed Morphology discussion.  Alec debunked that notion forcefully and convincingly in his opening address. And indeed, one can see from the invited participants to this event, that we were not  all specifically classic DM-ers,  but came from a broad group made up of what Alec called `fellow-travellers’.  By this I think he meant those who broadly shared enough starting assumptions to actually get a meaningful and stimulating conversation going about details.  As a fellow-traveller, I offer some thoughts in this blog inspired and stimulated by being at this workshop and being part of the ROOTS IV event.   In the end, the conference split quite firmly into the morphologists (that group of fellow travellers) and the lexical semanticists who didn’t actually seem to be in the same conversation (but more about this in the next post).

MAJOR NEWS FLASH (FOR ME, ANYWAY)!
It seems to me that at this conference, Distributed Morphology officially acknowledged in a common and public forum that root suppletion exists. Heidi Harley’s poster child case of root suppletion in Hiaki has stood up to scrutiny and we have to just suck it up.
The DM-ers at the conference seemed to all reluctantly agree, including Alec (skepticism and vocal disagreement from Hagit Borer notwithstanding).

Since it is a little outside my world view, I took some time to reflect on the special status of roots within DM and what work it does in the theory. In DM, recall, Roots are the only listed thing there at the start of the syntactic derivation. Unlike vocabulary items, they are not late-inserted. They also have no syntactic features on them inherently, and they usually come in at the lowest part of the tree (more recent approaches also allow roots to be `adjoined’ to various syntactic heads, but we put this aside for now). Roots are the creatures that anchor the whole derivation within the theory of Distributed Morphology, and they are the basis for the enclosing identity within which competition for insertion can be calculated. They are also the identity that underlies allomorphy and allosemy in particular contexts. Here is what the fact of root suppletion does to this system: previously, an abstract phonological representation could be thought of as a stand-in for the identity represented by a particular root. But if there is root suppletion then that is no longer always the case, and the thing that is the same across all spell-outs of a ROOT in a context has to be much more abstract than that. (Heidi makes this point in the article I linked to above. In that work, she argues for a system of abstract indices to track the identities we need.) I guess this is also the reason that the paradigm people believe in paradigms. Paradigms are probably a notational variant of the abstract indices idea (a sub-list defined by features inside a single address).

To see how this affects the whole system, consider the nice *ABA generalization that Jonathan Bobaljik has famously proposed and discussed in his book on comparatives.  (Norbert discussed this work warmly in his blog earlier this summer here ). 

*ABA is a constraint that makes reference to a particular kind of situation where syntactic features are in a particular inclusion relation, ordered in a particular hierarchy. In this situation, if you have a vocabulary item that can spell out a lower position but a suppletive one that spells out an intermediate position, then you cannot revert to the first item to spell out the highest node. Thus the claim is about the correlation between possible polysemies and syntactic structures: polysemy must respect the contiguity of the inclusion relations in syntactic structure, as a constraint on the operation of the Elsewhere Principle. A very interesting proposal, if true. Now, what we need to understand about this pattern is that the statement of it also relies on correctly distinguishing cases of true suppletion from other kinds of phonological variants in the vocabulary items. We all understand and accept cases of phonologically conditioned allomorphy, where the phonological rules present and active in the language create variations on the ROOT’s abstract representation due to phonological context. But there are also cases of phonological readjustment rules that exist in DM, which are sensitive to morphosyntactic context (not phonology), and which are not the same as any actual phonological rule in the language (or sometimes even any possible rule). These abstract readjustment rules do not count as suppletion; crucially, they do not `count’ as creating a B out of an A. Essentially, you still have an A if you `phonologically readjust’. There are many of us who do not like ad hoc phonological readjustment rules, just to preserve the fiction of phonological ROOT identity. But according to Bobaljik (p.c.), readjustment rules were crucially taken into account in reaching the *ABA generalization in the first place. (Thanks to Peter Svenonius for pointing this out to me.)
Putting this together with the previous point, consider now the fact that root identity is no longer always underwritten by an abstract phonological representation, but by something MUCH more abstract, like an index. Now we need to make sure that we have an architecture of the kind that constructs ROOT identity across suppletive environments, while still maintaining an internal distinction between `related’ variants and suppletive variants of the same thing for the purpose of stating the deep Bobaljik generalization. So what gives? Are suppletive variants `the same’? Or are they `different’, i.e. Bs as opposed to As in Bobaljik’s generalization?
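
To make the dependence on `sameness’ concrete, here is a small sketch of an *ABA check. This is my own toy formulation for illustration, not Bobaljik’s formal statement: the function name, the labels A, A' and B, and the `readjusted` mapping are all hypothetical. The input is the sequence of root exponents along the containment hierarchy, lowest cell first, and the check looks for a form that reappears after being interrupted by a different one.

```python
from typing import Hashable, Sequence


def violates_aba(exponents: Sequence[Hashable]) -> bool:
    """True if some exponent recurs after an intervening different exponent.

    `exponents` lists the root's spell-outs along the hierarchy, lowest first.
    Whether two variants count as 'the same' is decided by the caller, via the
    labels it passes in.
    """
    n = len(exponents)
    for i in range(n):
        for k in range(i + 2, n):
            if exponents[i] == exponents[k] and any(
                exponents[j] != exponents[i] for j in range(i + 1, k)
            ):
                return True
    return False


# Hypothetical paradigm: the middle cell shows a variant A' of the root.
cells = ["A", "A'", "A"]

# Option 1: treat every variant as a distinct inserted item (A' is a 'B').
print(violates_aba(cells))

# Option 2: treat A' as a readjusted A, so it does not count as a 'B'.
readjusted = {"A'": "A"}  # assumed mapping, for illustration only
print(violates_aba([readjusted.get(c, c) for c in cells]))
```

The same paradigm comes out as a violation under Option 1 and as unproblematic under Option 2, which is exactly why the generalization cannot be evaluated until we settle which variants count as As and which as Bs.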

I for one would like to give up ad hoc phonological readjustment rules in favour of straight-up variant insertion, making these kinds of variations indistinguishable from cases of suppletion (which we can no longer run away from theoretically, if Heidi is right). But then I am in danger of losing *ABA. Or rather, I would have to make *ABA a bit of telling historical detritus, a morphological patterning that shows us something real, but indirectly and not synchronically. I would also expect in that case to see some evidence of pure *ABA, where one only needs to compare two distinct forms without the help of phonological readjustment rules. I don’t control the examples from Bobaljik’s book well enough to know how much the generalization relies on such rules.

But in any case, there is a real tension here I think.   If there really is a generalization concerning the mapping between insertion and syntactic structure that relies on suppletive forms being different  in an important sense, then how does that reconcile with ROOTs having an identity across suppletive variants?

Morphologists: Help?


This has gone on too long.  In my next post on ROOTS IV, I will muse on semantics and the existence of Allosemy (or not).

Thursday 18 June 2015

Anticipation: Roots

ROOTS

The recent meeting of syntacticians in Athens has whetted my appetite for big gatherings with lots of extremely intelligent linguists thinking about the same topic, because it was so much fun.

At the same time, it has also raised the bar for what I think we should hope to accomplish with such big workshops. I have become more focused and critical about what the field should be doing within its ranks as well as with respect to communication with the external sphere(s).

The workshop I am about to attend on Roots (the fourth such) to be held in New York from June 29th to July 3rd, offers a glittering array of participants (see the preliminary program here http://wp.nyu.edu/roots4/wp-content/uploads/sites/1403/2015/02/roots4_program.pdf ), organized by Alec Marantz and the team at NYU.   

Not all the participants share a Distributed Morphology (DM)-like view of `roots’,  but all are broadly engaged in the same kinds of research questions and share a generative approach to language. The programme also includes a public forum panel discussion to present and discuss ideas that should be more accessible to the interested general public. So Roots will be an experiment in having the internal conversation as well as the external conversation. 

One of the things I tend to like to do is fret about the worst case scenario. This way I cannot be disappointed. What do I think is at stake here, and what is there to fret over in advance, you ask? Morphosyntax is in great shape, right?

Are we going to communicate about the real questions, or will everyone talk about their own way of looking at things and simply talk past one another? Or will we bicker about small implementational issues, such as whether roots should be acategorial or not? Should there be a rich generative lexicon or not? Are these in fact, as I suspect, matters of implementation, or are they substantive matters that make actual different predictions? I need a mathematical linguist to help me out here. But my impression is that you can take any phenomenon that one linguist flaunts as evidence that their framework is best and, with a little motivation, creativity and tweaking here and there, give an analysis in the other framework’s terms as well. Because in the end these analyses are still higher level descriptions, and it may look a little different but you can still always describe the facts.

DM in particular equips itself with an impressive arsenal of tricks and magicks to get the job done. We have syntactic operations of course, because DM prides itself on being `syntax all the way down’. But in fact we also have a host of purely morphological operations to get things in shape for spellout (fission, fusion, impoverishment, lowering, what have you), which are not normal actions of syntax and sit purely in the morphological component. Insertion comes next, which is regulated by competition and the Elsewhere Principle, where the effects of local selectional frames can be felt (contextual allomorphy and subcategorization frames for functional context). After spellout, notice that you still get a chance to fix some stuff that hasn’t come out right so far, namely by using `phonological’ readjustment rules, which don’t exist anywhere else in the language’s natural phonology. And this is all before the actual phonology begins. So sandwiched in between independently understood syntactic processes and independently understood phonological processes, there’s a whole host of operations whose shape and inherent nature look quite unique. And there are lots of them. So by my reckoning, DM has a separate morphological generative component which is different from the syntactic one. With lots of tools in it.
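
For readers who have not seen the insertion step spelled out, here is a minimal sketch of competition under the Elsewhere Principle, in the common `most specific matching item wins’ formulation. The class `VocabItem`, the function `insert`, the feature labels and the two-item toy vocabulary are all assumptions made up for illustration; real DM analyses add contextual restrictions and further ways of ranking competitors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VocabItem:
    exponent: str
    features: frozenset  # features the item is specified for


def insert(node_features, vocabulary):
    """Pick the exponent for a terminal node.

    An item can compete only if its features are a subset of the node's
    features; among competitors, the one specified for the most features
    wins (ties go to the item listed first).
    """
    candidates = [v for v in vocabulary if v.features <= node_features]
    if not candidates:
        raise LookupError("no vocabulary item matches this node")
    return max(candidates, key=lambda v: len(v.features)).exponent


# Toy vocabulary: a specific 3sg present item and an elsewhere item.
VOCAB = [
    VocabItem("-s", frozenset({"3", "sg", "pres"})),
    VocabItem("", frozenset()),  # elsewhere case: matches any node
]

print(repr(insert(frozenset({"3", "sg", "pres"}), VOCAB)))
print(repr(insert(frozenset({"1", "sg", "pres"}), VOCAB)))
```

Note that this covers only the insertion step: the morphological operations before it and the readjustment rules after it would each need their own machinery, which is exactly the worry about the size of the toolbox.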

But I don’t really want to go down that road, because one woman’s Ugly is another woman’s Perfectly Reasonable, and I’m not going to win that battle. I suspect that these frameworks are intertranslatable and that we do not have, even in principle, the evidence from within purely syntactic theorising to choose between them.

However, there might be deep differences when it comes to deciding which operations are within the narrow computation and which ones are properties of the transducer that maps between the computation and the other modules of mind/brain. So it’s the substantive question of what that division of labour is, rather than the actual toolbox, that I would like to make progress on.

To be concrete, here are some mid-level questions that could come up at the ROOTs meeting.

Mid-Level Questions:
A. Should generative aspects of meaning be represented in the syntax or the lexicon? (DM says syntax)
B.  What syntactic information is borne by roots? (DM says none)
C. Should there be late insertion or  should lexical items drive projection? (DM says late insertion)

Going down a level, if one accepts a general DM architecture, one needs to ask a whole host of important lower level questions to achieve a proper degree of explicitness:

Low-Level Questions
DM1: What features can syntactic structures bear as the triggers for insertion?
DM2: What is the relationship between functional items and features? If it is not one-to-one, can we put constraints on the number of `flavours’ these functional heads can come in?
DM3: What morphological processes manipulate structure prior to insertion, and can any features be added at this stage?
DM4: How is competition regulated?
DM5: What phonological readjustment rules can apply after insertion?

There is some hope that there will be a discussion of the issues represented by A, B and C above. But the meeting may end up concentrating on DM1-5.

Now, my hunch is that in the end, even A vs. B vs. C are all NON-ISSUES. Therefore, we should not waste time and rhetoric trying to convince each other to switch `sides’. Having said that, there is good evidence that we want to be able to walk around a problem and see it from different framework-ian perspectives, so we don’t want homogeneity either. And we do not want an enforced shared vocabulary and set of assumptions. This is because a particular way of framing a general space of linguistic inquiry lends itself to noticing different issues or problems, and to seeing different kinds of solutions. I will argue in my own contribution to this workshop on Day 1 that analyses that adopt as axiomatic the principle of acategorial roots prejudge and obscure certain real and important issues that are urgent for us to solve. So I think A, B and C need an airing.

If we end up wallowing in DM1-5 the whole time, I am going to go to sleep. And this is not because I don’t appreciate explicitness and algorithmic discipline (as Gereon Mueller was imploring us to get more serious about at the Athens meeting), because I do. I think it is vital to work through the system, especially to detect when one has smuggled in unarticulated assumptions, and to make sure the analysis actually delivers and generates the output it claims to generate. The problem is that my answers to B differ from those of the DM framework, so when it comes to the nitty-gritty of DM2, 3 and 5 in particular, I often find it frustratingly hard to convert the questions into ones that transcend the implementation. But ok, it’s not all about me.

But here is some stuff that I would actually like to figure out, where I think the question transcends frameworks, although it requires a generative perspective. 

A Higher Level Question I Care About
Question Z.  If there is a narrow syntactic computation that manipulates syntactic primes and  has a regular relationship to the generation of meaning, what aspects of meaning are strictly a matter of syntactic form, and what aspects of meaning are filled in by more general cognitive processes and representations? 

Another way of asking this question is in terms of minimalist theorizing. FLN must generate complex syntactic representations and semantic skeletons that underwrite the productivity of meaning construction in human language. What parts of what we traditionally consider the `meaning of a verb’ are contributed by (i) the narrow syntactic computation itself, (ii) the transducer from FLN to the domain of concepts, and (iii) the conceptual flesh and fluff on the other side of the interface that the verb is conventionally associated with?

Certain aspects of the computational system for a particular language must surely be universal, but perhaps only rather abstract properties of it such as hierarchical structuring and the relationship between embedding and semantic composition. It remains an open question whether the labels of the syntactic primes are universal or language specific, or a combination of the two (as in Wiltschko’s recent proposals). This makes the question concerning the division of labour between the skeleton and the flesh of verbal meaning also a question about the locus of variation. But it also makes the question potentially much more difficult to answer. To answer it we need evidence from many languages, and we need to have diagnostics for which types of meaning we put on which side of the divide. In this discussion, narrow language-particular computation does not equate to universal computation. I think it is important to acknowledge that. So we need to make a distinction between negotiable meaning and non-negotiable meaning and be able to apply it more generally. (The DM version of this question would be: what meanings go into the roots and the encyclopedia as opposed to meaning that comes from the functional heads themselves?)

There is an important further question lurking in the background to all of this, which is how the mechanisms of storage and computation are configured in the brain, and what the role of the actual lexical item is in that complex architecture. I think we know enough about the underlying patterns of verbal meaning and verbal morphology to start trying to talk to the folks who have done experiments on priming and the timing of lexical access, both in isolation and integrated in sentence processing. I would have loved to see some interdisciplinary talks at this workshop, but it doesn’t look like it from the programme.

Still, I am going to be happy if we can start comparing notes and coming up with a consensus on what we can say at this stage about higher level question Z. (If you remember the old Dr Seuss story, Little Cat Z was the one with VOOM, the one who cleaned up the mess).


When it comes to the division of labour between the knowledge store that is represented by knowing the lexical items of one’s language, and the computational system that puts lexical items together, I am not sure we even know whether we are asking the question in the right way. What do we know of the psycholinguistics of lexical access and deployment that would bear on our theories? I would like to get more up to date on that. Because the minimalist agenda and the constructivist rhetoric essentially force us to ask the higher level question Z, and we are going to need some help from the psycholinguists to answer it. But that perhaps will be a topic for a different workshop.