Interview with Prof. S.R.S. Varadhan

Interview Editorial Consultant: Tai-Ping Liu
Interviewers: Tai-Ping Liu (TPL), Chii-Ruey Hwang (CRH), Tzuu-Shuh Chiang (TSC), Shih-Hsien Yu (SHY)
Interviewee: S.R.S. Varadhan (SRSV)
Date: December 19th, 2007
Venue: Institute of Mathematics, Academia Sinica, Taipei, Taiwan

Srinivasa S. R. Varadhan was born on January 2, 1940 in Madras (Chennai), India. He received his B.S. in 1959 and M.S. in 1960 from the University of Madras and his PhD in 1963 from the Indian Statistical Institute. Since 1963, he has been on the faculty of the Courant Institute of Mathematical Sciences, New York University. He is known for his fundamental contributions to probability theory and for creating a unified theory of large deviations. He is a member of the National Academy of Sciences (1995) and was awarded the Leroy Steele Prize (1996), the Abel Prize (2007), and the National Medal of Science (2010).

TPL: Thanks for coming. How did you get interested in mathematics while in India?

SRSV: I think you know, in elementary grades...and so on, mathematics is really a question of arithmetic…if you can add and subtract and multiply…quickly, then you were really good. That’s all that’s expected of you. You can remember the tables. So that’s fine. That I was able to do, and so I managed to do well in the exams and I was considered to be a good student in mathematics, compared to others who couldn’t remember multiplication tables.

TPL: (laughs) But in India, I was told that the multiplication tables are not nine by nine.

SRSV: No, it’s up to 16.

TPL: Sixteen by sixteen.

SRSV: Yeah. That’s because you see we had the British monetary system, in which one rupee is sixteen anna (Indian currency) and one anna is divided into twelve paisa (Indian currency). So, 12 and 16 are important if you’re doing any transaction. So you should be able to divide quickly by 12 and 16. If you buy so many apples etc. Tables of 16 and 12 are important. So some schools what they did was, you went up to 12 and then you jumped to 16. So for 13, 14 and 15, they didn’t force you to memorize, but up to 12 and 16, you definitely must know.

TPL: When did you memorize sixteen by sixteen, by what age?

SRSV: Probably fourth grade. Of course, once you go to high school it’s different and then actually you had quite a bit of mathematics. We had algebra, solving simultaneous equations and Euclidean geometry. And no calculus. Sort of calculus and all the trigonometry, all the sine and cosine. All this we had in high school. That was not hard.

TPL: So you did not jump ahead of the class, you did not learn calculus in high school?

SRSV: It was not taught.

TPL: Uh-huh, so you just go along with the rest of the class.

SRSV: There’s no special class, inevitably no electives, so you are all in the same class, you learn the same thing, but I had a good math teacher and what he would do was make us come to his house on weekends and give us problems, just to work some problems in geometry, solve some problems in triangles and things like that. So we had some familiarity with solving problems. If somebody gave us a problem, we knew how to attack it. You look at it from different angles, try to understand what the issue is in the problem and then see if you can solve it. He gave us problems in geometry; drawing triangles, things like that. That experience was good.

TPL: I see.

SRSV: Then after high school you go to junior college for two years. In junior college we studied mathematics with Physics and Chemistry as well as two languages.

TPL: So what two languages did you study?

SRSV: English and Tamil, which is my native language.

TPL: I see.

SRSV: Physics and Chemistry I had no problem. Math, I had no problem. But we didn’t do really too much math. What did we do? We did a bit of projective geometry, conic sections and things like that, and then a little bit of analytical geometry: straight lines, circles. A little bit of calculus, just differentiation and integration, sines and cosines, no exponentials or logs.

TPL: Let me jump ahead. Now you’re talking about everything as usual, it’s not difficult for you. But there was a story that Papanicolaou told me, which is that Kolmogorov came to India and people introduced you to him, but that was some years later, right?

SRSV: That was at the end of my graduate studies.

TPL: So that was several years later.

SRSV: Now I’m talking about ’56, when I was 15 or 16 years old. I was just finishing junior college.

TPL: So you were ahead for your age, no?

SRSV: Yeah…by a year or two. So up to then, mathematics was fairly unsophisticated. Mathematics was what you would call pre-calculus, I think our calculus lectures were just a few weeks, two or three weeks in the end. In junior college, you don’t have several courses, just a course called “Mathematics” with one professor and whatever math he teaches, he teaches. First semester he would teach projective geometry, second semester he would teach something else, third semester something else. And finally fourth semester you would come to calculus, towards the end, and maybe the last few classes. The calculus was a very thin book, just twenty pages or so, ten pages of differential calculus and ten pages of integral calculus.

TPL: Uh-huh.

SRSV: And I didn’t quite get what it was about, at that time. I mean, you know…I knew how to do it, but I didn’t see the point of it because there weren’t any real applications given, it was just at the end, the limit was defined, and somehow they canceled some h’s somewhere, and produced derivative functions, and then you scratched your head to see what the integral was. And that was basically it. But then, I went to an undergraduate program, a real undergraduate program which is three years, in pure mathematics, probability and statistics.

TPL: So this is the English system?

SRSV: And that’s where we learned different things. And we had nothing else, no physics and no chemistry. There’s a little bit of English once a week in the first year. So three years of nothing but mathematics, probability and statistics. And in mathematics we had several parallel lines. We had an analysis line, we had a geometry line and an algebra line. Three streams. In analysis we started with differential calculus and integral calculus the first year, and a little bit of multi-variable calculus and foundations of real analysis the second year, and the third year was multiple integrals, Jacobians, transformations and a little differential geometry. In algebra, we started with theory of equations, determinants, matrices and linear algebra. That was about it. And then geometry has a lot of analytical geometry, conic sections, solid geometry, some more projective geometry and…so these are the three main streams of pure mathematics.

TPL: And you took all of them?

SRSV: I took all of them. There’s no choice for anybody! There were 13 in our class and every class for three years we sat together. There’s no choice! And they tell you these are the courses you have to take at such a time and it’s like school. You go at 9 o’clock and stay in school till 4 PM. And parallel to this we had probability and statistics. In the first year we learned basic probability, limit theorems and descriptive statistics. Actually, it’s a little bit strange because there are a few pure math things you needed for probability which were not covered in pure math till later and they had to do it in probability. For instance, Fourier transforms. We covered a lot of theoretical statistics, and also some descriptive statistics like sampling. So that was for 3 years and I completed my undergraduate education.

TPL: So you were about 18 or 19 at that time?

SRSV: Eighteen…going on to nineteen.

TPL: Ok.

SRSV: So the three-year program gets you ready for a master’s degree. So at this stage, you know, you get a BSc (Hons) degree, but after six months it morphs automatically into a master’s degree, you don’t have to do anything. Then, I went to Calcutta for my graduate program. So it’s after my Master’s degree.

TPL: Ok, Calcutta for graduate, but undergraduate was at?

SRSV: Chennai, Madras then, which is now Chennai.

TPL: Ok, and then something must have happened at the graduate school. Something out of the ordinary.

SRSV: Well, I knew my probability quite well. But I didn’t really know any measure theory at that point. There was really no modern mathematics taught at that time. We were really well taught in classical analysis, we could estimate things and so on. But, we were never taught abstract thinking, or functional analysis, measure theory or point set topology. These things were not even touched upon. And so when I came to Calcutta, in the graduate program there were no courses. You do your fellowship. You are given a desk, a chair and asked to produce a thesis in three years. So there are just some seminars, some rigorous seminars on measure theory and point set topology. Ok, I sat through it.

TPL: No homework?

SRSV: No. It wasn’t a regular class, it was a seminar. So if you don’t use it, it just goes away! So, it more or less went away. And then one day somebody told me why measure theory was important. See, until that time I had no idea that there were singular measures; the notion of absolute continuity was not emphasized. Although we had gone through three years of probability, statistics and so on, we always dealt with probability distributions that were discrete or given by a density. Practically all problems involved one type or the other, so why does it matter if there are singular distributions, you never see them, so who cares! So somebody pointed out in graduate school that these objects exist, so I was surprised and then I realized the importance of measure theory. You know, and at that time I went back to study the book by Halmos again. Originally it was not my intention to do mathematics, but to do statistics. I wanted to work with applied statistics, become a statistician working in some kind of industry. The idea that you could do research in mathematics for a living was simply not conceivable. But slowly, after three or four months I was disillusioned. If I were here I would probably say I was depressed. But in India you are surrounded by people. I was lost in a way. I thought it was going to be exciting, I was doing some industrial statistics and it seemed awfully dull. I couldn’t see why. Sure you needed to do it, but you don’t have to study something to do it, you just used common sense mostly. And somebody suggested I should work on operations research. I worked a bit on operations research and it was mostly computing maxima and minima under a different name. So the applied areas sort of didn’t interest me that much. So I ended up playing a lot of bridge, reading mystery books. Since there were no classes and nothing was expected of you, you could go to the office, go to the library and read a mystery book. I did this maybe for four or five months.
Then I met with some other graduate students who were a year or two ahead of me, Ranga Rao and Parthasarathy. Varadarajan, who led the group, had left for the USA by then. They were working on limit theorems, the interplay between measure theory and topology and limit theorems in abstract spaces. So I got interested in that and it seemed like an exciting thing to do. And that forced me to learn, to put to use the measure theory and topology I had learnt. I integrated both into something concrete. For example I learnt the invariance principle of Donsker and the work of Prohorov and Skorohod on weak convergence. And then we ran seminars on our own and some regular courses on limit theorems. Just three or four graduate students, we sort of gave courses for ourselves, we all stayed together and we shared an apartment together. We would get up and go to the office at seven in the morning, have our seminar between seven and ten, a three-hour seminar that just went right through, and this was day in day out, every day…that’s how we learned! So in that period I learned a lot. Very quickly I learned some group theory, functional analysis and so on. So we decided to try to generalize some work of Gnedenko and Kolmogorov on limit theorems to topological groups. So that was our starting point. After we did that we generalized it to the context of Hilbert space, which adds a complication. When you are working in locally compact groups, the characteristic function or Fourier transform is a good tool, but in the Hilbert space context characteristic functions are not enough, so we had to develop tools to handle it, by using concentration functions. That was an exciting period, and three years ended that way and I had a thesis ready.

TPL: That sounds incredible.

SRSV: That was the time Kolmogorov came, I had just finished my thesis. He came to visit, so had some interaction with him at that time. Basically I gave a lecture on my thesis and he was in the audience. He was asked to be an examiner on my thesis.

TPL: I see. Can I just backtrack a little bit? You said you had a period of 5-6 months, not knowing what to do. Before that you were thinking of doing something concrete, something that can be applied, and you found it dull. But in the period afterward you went to something abstract. But the period before that was not wasted in a way, right? Because then you were trying to find some…

SRSV: Well…you know, let me put it this way. I think the semester there starts in August, so I went in August. So the first couple of weeks of the semester you’re just settling in to the new town, the language is new, everything is new. You’re in a foreign land for all practical purposes. The food is different, everything is different. The Institute is in a very isolated place. Although there were small communities where people coming from the south lived, these communities were far away from where the Institute was. It takes an hour and a half to get there. That was one thing; and so I think by January or February already I became interested in mathematics. You know, Christmas holidays in between…we are not talking about such a long time!

TPL: You worked on all these abstract things, but always have something concrete to express. It’s unlike a pure math student going into an abstract forest, to follow the dogma, continue the…

SRSV: You know, at least in my undergraduate and my first thesis work and so on, what we did was we took a problem that was solved for the real line, and tried to see if it’s possible to generalize this to the context of a locally compact group. Turns out, if you think about it, you see the proofs and so on, you see what the difficulties are. One of the difficulties is the notion of centering. It turns out in the work of Kolmogorov you had to be able to center things. So even if you had a really small distribution, and a really small mean, you still had to correct it. So you have to be able to define the mean. On the real line you can truncate and integrate. But how do you define it on a general locally compact group? How do you define means? Doesn’t have to be exact, but it must be approximate enough to work for centering. That was a problem we had to solve. It turns out for various reasons that if the group is disconnected then you don’t have to center because it’s not needed. So you only needed it for connected groups. A connected group is the direct sum of a vector space and a compact connected group. You know how to do it for a vector space, so what you need to do is to be able to define the mean for a compact connected group. How do you define the mean? That was the problem we had to solve. And what it means is basically you have the character of a group, you have to take the log of the character somehow, because the log of the character is something linear, and you integrate the log of the character; that’s the mean. The character is complex valued and when you take the log it becomes a multi-valued mess. But it turns out that somehow you can do it locally and piece it together and you had to go through a lot of mumbo jumbo to define it. Once you define it, you have the notion of centering. You can then more or less proceed along the lines of Gnedenko-Kolmogorov and with a little bit of modification it works.

TPL: You did not get help from people around you?

SRSV: Just the graduate students.

TPL: That’s incredible. Can you describe Kolmogorov as a person, in all the years that you knew him?

SRSV: I really didn’t know him that well. He was there for a month, and I saw him more or less every day. We traveled with him in India and so on, and he was very energetic, he always ran here and there and I think he was 59 when he came. And we had a birthday party for his 59th birthday. He always liked young people around him and liked to discuss things and I asked him some questions. Part of the problem was he claimed he did not speak English, only French, Russian, or German. And I knew none of them. So we had to sort of manage, but with mathematics, we could manage somehow.

TPL: Ok. And then you went to Courant?

SRSV: It was in April ’62 that Kolmogorov visited, and my thesis was ready. He didn’t want to give a report then and there, he wanted to take the thesis back to Moscow, show it to Prohorov and then give a report, and so he went away. And then in India you submit your thesis and then it may take up to six months or a year to get your report, so I joined the faculty as an instructor for a year, ’62-’63.

TPL: In Calcutta?

SRSV: In Calcutta. And that’s where I got interested in Markov processes. We worked on Markov processes, studied the work of Dynkin, the work of Ito, and we had a seminar on it. And then I was beginning to feel that I needed to learn a bit more differential equations to really understand these things. We had one course in ODE as undergraduates, we never really had a PDE course on elliptic theory, parabolic equations or similar things and I had no idea what they were. So when you start reading about Markov processes, you see the old work of Feller and others, they write down a diffusion equation, they assume certain things are differentiable and write it down, and then say that the equation has a solution and something happens, but it’s not clear why the solutions exist and are differentiable. It was clear that there were theorems because they refer to some papers and so on, but I didn’t know where to start. But of course if we had time, we would have probably learnt it, but by then, the group was already starting to disintegrate, in the sense that...

TPL: Who formed that group? Were they all people of your age?

SRSV: There were five people altogether, but we weren’t all there at the same time.

TPL: All young people…

SRSV: There were maybe…I was the youngest of the group and Varadarajan was the most senior. Varadarajan is now at UCLA. He was three years senior to me. We went to the same college and undergraduate program. Already as an undergraduate he knew what he wanted to do. So he prepared himself very well. So anyway, he was there for the first three or four months and then he left to come to the U.S. Then he came back and we overlapped during my last year. And at that time we actually worked on Lie groups for a year. By that time, Ranga Rao, Sethuraman and Parthasarathy had left. But during the first 3 years, Ranga Rao was there for two years, Sethuraman was there for three years and then he left and Parthasarathy was there for three years and then he left to go to Russia for a year. So we had various groups at various times. But at the end of the third year basically, our joint seminars and so on had tapered down because I was the only person left, and when Varadarajan came back we started to work on something different, Harish-Chandra’s work on group representations.

TPL: You were the one who pulled those people together.

SRSV: No, I wouldn’t say that. I think we all played a role.

TPL: But you were the most enthusiastic member?

SRSV: Well…no, no. I wouldn’t say that…

TPL: Ok.

SRSV: You know I learned a lot from them, for example Ranga Rao had studied group theory and he knew topological groups, etc., because he had studied these topics with Varadarajan earlier and so he gave lectures on that. Sethuraman worked hard to learn functional analysis and he was just reading from Dunford and Schwartz and giving lectures on that.

TPL: That’s pretty good (laughs)

SRSV: Parthasarathy was reading information theory and gave lectures on entropy and dynamical systems and so on, so the four of us said: “I’ll read this, you’ll read that and we’ll pass it out and then give talks.” But the nice thing about giving such talks is that if you don’t understand something that doesn’t stop you, you just come to that point and say: “I don’t understand,” and there were four of us there trying to figure it out. So you don’t have to be a master to be giving a talk.

CRH: Dr. C. R. Rao was your advisor. What was his influence on you?

SRSV: Moral support.

CRH: Moral support. So not on statistics.

SRSV: You know, I told him in the beginning that I was interested in doing statistics and he said go talk to somebody on quality control. And I said let me look at quality control for a little while and you know, all they do is, what you call power they call operating characteristics, and other than that I didn’t see what they were doing. So I was disappointed, and I went back to him, but then he found out I was getting involved in the mathematics group and he was quite happy with that. And I think toward the end at some point he made me come to his office and explain what my thesis was. So I said: “This is the problem; this is what I have done…” and that was it.

TPL: I see. So, what year was that? After the seminar, at the point people went to different directions, what year was that?

SRSV: At the end of ’62. The academic year ends in May or June.

TPL: Ok, but you stayed in Calcutta for another year?

SRSV: For one more year. Varadarajan came back that summer. So the two of us were there. And we were studying some group representations and I was studying some Markov processes on the side with the idea of working on it. And Varadarajan had spent the last year before his return at Courant. He said if I was interested in differential equations I should go to Courant. He wrote to Professor Peter Lax suggesting that I come to Courant as a visitor.

TPL: ’63?

SRSV: ’63.

TPL: And then you stayed for over forty years.

SRSV: (after some calculating) Yeah, my 45th year. Come this summer it will be 45 years.

TPL: There are many things I would like to know about during your years at Courant, but maybe we should turn to something more mathematical.

TSC: The French school in probability is very different from that of the United States; the styles are very different. Did the French school have any influence on you?

SRSV: The French school changed a lot over the years. When Neveu came, he was more interested in concrete problems; before him Fortet and Mourier were more interested in abstract theories that often didn’t have relevance for concrete problems. But there was a strong tradition of working on concrete problems. Paul Lévy for example, who inspired Ito to do what he did. But somehow the French mathematical community did not appreciate Paul Lévy that much. He was at École Polytechnique, so in their view he was really an engineer. What he was doing was really engineering and not mathematics. It was sort of a snobbish viewpoint of the French. And that has died down. Does not exist anymore. When Jacques-Louis Lions came on the scene he made differential equations and applied mathematics much more legitimate and respectable. In France, it has changed. Now the French probabilists are no different from the rest of us; instead of very abstract, it’s very concrete.

TSC: It’s more unified, all the world is. How would you comment on the Russian school? Is there such a thing as a Russian school in probability? (Mentions several Russian names: Dynkin, Kolmogorov and Prohorov)

SRSV: I think the integration of topology and measure theory to develop a point of view, I think the Russian school contributed a lot. In the study of stochastic processes Kolmogorov played a very important role and in the development of Markov processes, Dynkin played a fundamental role. Before that, Feller had done some work. If you read Feller’s work, it’s all quite confusing. There are theorems here and there, but there isn’t a coherent point of view, and I think it’s Dynkin who provides that. So I think…but then politics came, you see. The anti-Semitic feeling that developed in the ’70s essentially dispersed them. People like Freidlin, Dynkin, Skorohod, Ventcel, Krylov all left. They all had to leave, find places elsewhere. Of course, in the early ’90s, the Soviet empire collapsed and things became really difficult, and people couldn’t make a living, so a lot of mathematicians emigrated to the United States. But some of them still keep going back. There is the dynamical systems group and the Sinai school which is very strong. So in fact the Russian school is there, but I think it’s had its problems in the last twenty years or so. Now that they have the resources, I’m not sure if they’re going to…

TSC: Regroup, or…?

SRSV: Develop again or not…

TSC: What about Japan? How would you comment on the Japanese school of probability?

SRSV: Yeah, they do have a strong school.

TSC: How would you compare the styles of probability in Japan and, for example, France?

SRSV: I think (pause). I think the Japanese school is very good at taking a problem, analyzing it completely. Very quickly really, they can take it all apart. They are very competent. But I also think sometimes that they are somewhat rigid. So they don’t…

TSC: They usually follow some important person’s work, they follow it all the way.

SRSV: Well, Ito’s work does not follow anybody (laughs). But I think somehow for them, to be flexible and adventurous is difficult. Personally you see it, and also you see it in terms of their work. Excellent people doing excellent work, but I think they are afraid to explore many things.

TSC: In your work—I find that you seldom quote other people’s work before you. For example, if someone writes, large deviation, people quote your work. But it seems that it started from you, there’s no pioneer before you, you’re doing the pioneering job.

SRSV: That’s not very true. It’s my own fault. I don’t read very well (laughs). And I feel somehow that if you’re going to work on a problem, to read the previous work is a mistake.

TSC: Oh. We should not read yours (laughs).

SRSV: (laughs) No, no, no. What happens then, is that you sort of keep thinking in the same mold. And if you want to break out and think in a different mold, it is good to know what others did, but it’s better not to know the details, because the details, somehow if you think about it too much, you keep thinking on the same route. So if you want to break out in a different direction, I’m not suggesting this for everybody, it’s just the way I feel. So I hardly ever use anyone else’s result. I tend to use fairly simple tools, what I need I prove it for myself. I don’t like to use somebody else’s big theorem and say “this follows from this.”

TSC: That’s very different from what people usually do.

SRSV: Because they read a lot.

TSC: They read too much. (laughs).

SRSV: Has its ups and downs. Ups in the sense that you can have a problem that others cannot solve and you finally solve it, it’s a very different method from what people have done. So it creates a way of doing things. On the other hand it can be very frustrating and one can be stuck on a problem for a long time. If you had read something, maybe you would know what to do.

TSC: Do you talk to other people if you have a problem?

SRSV: Yeah, I talk to people. And I listen to people a lot. As long as I don’t see the details; I try to find out what they’ve done. If it’s not something that I have been working on immediately, if I have the time, I will read it and see what they’ve done, what the basic idea is. I don’t have to see all the details. What are the crucial steps, what’s the idea they have.

TSC: Have you ever made any mistakes when writing a paper? (laughs).

SRSV: As a graduate student, I did.

TSC: Who caught this mistake?

SRSV: I caught it myself.

TSC: You caught it yourself.

SRSV: It taught me something. Some theorem I was trying to prove and I was naïve, and I was just first or second year in my graduate course. I was trying to prove something and I thought I proved it. I wrote it up and I showed it to a colleague. He said it was fine and I sent it to a journal. But then one day I said: why did I prove it? You know, a proof requires an idea. You can’t prove something without an idea. It seemed that I had a proof without an idea, which seemed strange. And I went back and looked at it, to see what I had done wrong, and I realized what I had done wrong. There are two lemmas, lemma A and lemma B. I proved lemma B using lemma A, and proved lemma A using lemma B (laughs). I said: this is not good. You know, if you want your proof to work, before you prove it you have to have an idea on why it works. So I wrote immediately and withdrew the paper. Meanwhile the referee report had come. The referee had rejected the paper. On some other ground and not because the error was found. But it taught me a lesson, and now before I start writing a proof I want to know why it works. In a proof that just comes out of manipulations you can make a mistake. And in a proof where you know why it works, you’re not likely to make a mistake.

TSC: I remember Stephen Orey once said that you are very careful in your proofs. You don’t even have typos. Are you always that careful? Do you always have someone to proofread for you?

SRSV: No, it’s not that. It’s the style of writing I think; about typos I don’t know. It’s the journals, some journal editors go through the paper line by line, the editor actually, they catch the mistakes. Especially if the editor knows mathematics. But you know, when I was an undergraduate, there was an analysis exam I think, there was a question: the fractional part of n theta being dense in the unit interval if theta is irrational. So I sort of figured it out, wrote it out. I thought I did a good job, I thought it was correct. Anyway, I submitted it and I didn’t get any credit for it. So I went and complained to the teacher. She said: “You know, maybe your proof is correct, you know it, but I can’t see if it’s correct. And in mathematics, if you’re writing in such a way that the reader cannot see if it’s correct then it’s not.” And then she showed me another student’s answer sheet which she had kept. It was actually Varadarajan’s from three years back. So I looked at that and I could see the difference (laughs).

TSC: I see.

SRSV: So slowly I learned that part of doing mathematics was to learn to write, writing in such a way that to the reader it’s obvious. You don’t try to hide the complicated structures. Sometimes it’s just not possible. Some papers are just so technical. But so you do as good a job as you can. I think people often don’t take the care to write properly.

TPL: Can you describe, going back to large deviations, can you talk a bit about large deviations? How you got into it, why is it so important?

SRSV: Ok, so why is large deviations important? If you believe probability is important, then you have to calculate the probability of things, or else what’s the use of probability. So, you have a model, and you have an event and you want to calculate the probability of this event in this model, that’s the problem, ok. So if you’re a person working in industry, Wall Street or somewhere, you don’t care, you just want a number. And you put that number into your computer and maybe that number is correct, maybe that number is wrong, it’s your bank that’s going to take the hit, not you (laughs). On the other hand, if you’re an applied mathematician, you see the parameter of your problem, and see if you can calculate it as the parameter gets large, you see if you can estimate. Sometimes, when the parameter gets large this probability tends to a definite limit, because the model goes to a different model. And this becomes calculable under the new model. That’s what limit theorems are about. So the problem there becomes computational, because now you have to calculate the probability for the new model. But maybe the new model becomes simpler than the old model, and maybe you can devise numerical programs, or it is simple enough that exact calculations are sometimes possible. The third thing is maybe when the parameter gets really large, the probability goes to zero or one. Zero and one are the same because you can look at the complementary event. So let’s say it goes to zero, then you ask yourself: how rapidly is it going to zero? Very often it goes to zero, depending how you parameterize, let’s say it goes to zero exponentially in the parameter and you are interested in what the exponential constant is. How do you calculate this exponential constant? That’s the problem of large deviations. For various models one can calculate this exponential rate and there is a systematic way of doing it. And that’s what large deviation theory is about.
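
[Editorial illustration, not part of the interview.] The "exponential constant" described above can be seen numerically in the simplest case, sums of independent coin tosses, the setting of Cramér's theorem. The following is a minimal sketch under stated assumptions: the coin model, the threshold a = 0.7 and the function names `cramer_rate`, `log_tail_probability` and `empirical_rate` are all chosen here for illustration and do not come from the interview. It computes the exact tail probability P(S_n >= a n) for n fair coin tosses and compares -(1/n) log P with the known rate I(a) = a log(a/p) + (1-a) log((1-a)/(1-p)).

```python
import math


def cramer_rate(a: float, p: float = 0.5) -> float:
    """Rate function I(a) for the sample mean of Bernoulli(p) tosses, 0 < a < 1."""
    return a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))


def log_tail_probability(n: int, a: float, p: float = 0.5) -> float:
    """Exact log P(S_n >= a*n) for S_n ~ Binomial(n, p), summed in log space."""
    # Small tolerance guards against floating-point noise in a * n.
    k_min = math.ceil(a * n - 1e-9)
    log_terms = [
        math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(1 - p)
        for k in range(k_min, n + 1)
    ]
    # Log-sum-exp keeps the tiny probabilities from underflowing.
    largest = max(log_terms)
    return largest + math.log(sum(math.exp(t - largest) for t in log_terms))


def empirical_rate(n: int, a: float, p: float = 0.5) -> float:
    """Observed exponential decay rate -(1/n) log P(S_n >= a*n)."""
    return -log_tail_probability(n, a, p) / n


if __name__ == "__main__":
    a = 0.7
    print("limiting rate I(a):", cramer_rate(a))
    for n in (100, 500, 2000):
        print(n, empirical_rate(n, a))
```

In this sketch the observed rate stays above I(a) and moves toward it as n grows, which is the sense in which the probability "goes to zero exponentially in the parameter".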

TPL: That’s what you did.

SRSV: That’s what I did. No, but long before me the insurance industry in Scandinavia started this in the 1930s for a specific model. Cramér did it for sums of independent random variables. So various people have done it, and people in statistical mechanics have done it in a different context. In fact the entire theory of equilibrium statistical mechanics is nothing but large deviations in disguise. So it’s been done in bits and pieces in many places, some of which I didn’t know when I started and discovered later. Some of which I studied as an undergraduate.
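The exponential constant discussed above can be checked numerically in the simplest case, Cramér's setting of sums of independent random variables. The sketch below is an illustration only, assuming fair coin flips; the function names `rate_fair_coin`, `log_tail_probability` and `empirical_rate` are hypothetical and not from the interview. It compares the closed-form rate with the quantity -(1/n) log P(S_n >= a n) computed from the exact binomial tail.

```python
import math


def rate_fair_coin(a):
    """Cramér rate I(a) for the sample mean of fair coin flips, 0.5 < a < 1.

    I(a) = a*log(2a) + (1-a)*log(2(1-a)), the relative entropy of a
    coin with bias a with respect to a fair coin.
    """
    return a * math.log(2 * a) + (1 - a) * math.log(2 * (1 - a))


def log_tail_probability(n, a):
    """log P(S_n >= a*n) for S_n ~ Binomial(n, 1/2), summed exactly in log space."""
    k0 = math.ceil(a * n)
    logs = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        - n * math.log(2)
        for k in range(k0, n + 1)
    ]
    m = max(logs)  # log-sum-exp to avoid underflow
    return m + math.log(sum(math.exp(x - m) for x in logs))


def empirical_rate(n, a):
    """-(1/n) log P(S_n >= a*n); approaches rate_fair_coin(a) as n grows."""
    return -log_tail_probability(n, a) / n


if __name__ == "__main__":
    a = 0.7
    print("closed-form rate:", rate_fair_coin(a))
    for n in (100, 1000, 4000):
        print(n, empirical_rate(n, a))
```

The empirical value always sits slightly above the closed-form rate (this is the Chernoff bound) and the gap shrinks roughly like log(n)/n.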

TPL: So, there’s a sense of unified approach or, unified thinking?

SRSV: Yeah. It turns out actually that there’s only one formula involved, there’s a universal formula for large deviations. There is a model, usually with lots of parameters. Some parameter is getting large, there might be other parameters around, and so on, ok. When this parameter gets large, the probability of some event goes to zero. You can ask yourself: can I change the other parameters a little bit so it no longer goes to zero but goes to a positive limit and perhaps to one? So you’ll ask yourself, ok, suppose there’s a new set of parameters. There’s a parameter that’s going to infinity, in addition there are some parameters which are fixed. Now, the large parameter is still going to go to infinity, but the other parameters are going to change a little bit to push the probability away from zero. Then you have two different models, and you can compute the relative entropy of one model with respect to the other. That will have a certain behavior as the parameter gets large, usually linear in the parameter with a constant we can compute. Different perturbations can lead to the same result. You look for the optimal perturbation, the one with the least relative entropy that gets you away from zero. That is the constant that comes up in the exponential rate and that’s the basic philosophy of large deviations.
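One hedged way to write this principle in symbols follows. The notation is introduced here for illustration and does not appear in the interview: $P_n$ is the model with large parameter $n$, $A$ is the event whose probability tends to zero, $Q_n$ is a perturbed model under which $A$ keeps a non-vanishing probability, and $c(Q)$ is the linear growth constant of the relative entropy. This is a heuristic statement, valid only under regularity assumptions that depend on the model.

```latex
H(Q_n \mid P_n) = \int \log\frac{dQ_n}{dP_n}\, dQ_n ,
\qquad
c(Q) = \lim_{n\to\infty} \frac{1}{n}\, H(Q_n \mid P_n),
\qquad
-\lim_{n\to\infty} \frac{1}{n}\, \log P_n(A)
  \;=\; \inf_{Q \,:\, Q_n(A)\,\not\to\, 0} c(Q).
```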

TPL: Is it possible at all to say how you came to this realization?

SRSV: Well, working with lots of examples, you know. I tried various examples and eventually I was able to figure out what was common to all these things. Some of them you know, I would just do some calculations and be surprised by the answer. And it made sense because it’s a unifying principle.

SHY: May I ask a question: How do large deviations relate to interacting particle systems? How do you put probability and physics together…

SRSV: Most of the physics of course, unless you have some uncertainty or something that’s not deterministic, you can’t really use probability. If you give me a dynamical system, with given initial conditions, it is just a bunch of ODEs that I need to solve …that gets into chaos theory, and I really don’t know anything about that…that’s different, that has to do with “hyperbolicity” and topological dynamics, that’s a different thing. So I work with models where there’s some noise. I mean, the more noise the better. The class of problems I work with in interacting particle systems is called hydrodynamic scaling limits. And what it is, is a very complicated system of interacting particles of huge size but it has conserved quantities. So the system may have a global equilibrium, but because of conservation, and because the interaction is local, it will take a very long time to reach it. So in some time scale, we should be able to describe the state of the system in terms of local equilibrium expressed in some large spatial scale. So there’s an issue of scales here. So if your time scale is something and your space scale is something else, then you’re describing this with a new scale. There is a macroscopic location and a microscopic location. So macroscopically, your space could be just a compact space. Continuum. But if you look very closely, you can see a lattice inside. And in this lattice there’s an object fluctuating nastily. But if you look from very far away, you don’t see the fluctuations, you only see something smooth, because it’s averaged out. The fluctuating thing will be in some kind of local equilibrium, and what you’ll see from very far away is an average of that equilibrium. That local equilibrium could vary from point to point. So what you describe, the state of the system, is a function of location, the range of the function being the parameter that describes your equilibrium. That’s the fluid picture so to speak.
And then what you’d like to do is, you move time fast enough, then this fluid picture will change and you eventually reach a constant, which will be the global equilibrium. So you want to describe what the equation is that describes how this changes, and we have a PDE that describes this. And there was originally a Markov process which described it, the interacting particle system. So from that, you want to come to the PDE. That’s the problem of hydrodynamic scaling limits. And if you take the probability out of it, you will have a Hamiltonian system that describes a gas and you want to derive the Euler equation and express the pressure as a function of density and temperature through an equation of state that depends on the interaction potential. That’s the equation you end up with. There will be five equations for the five unknown quantities: the density, the velocities in the three spatial directions, and the temperature. So that’s the deterministic case which no one can do. But if you have noise, ok, what the noise does for you, it really forces you into local equilibrium. So to begin with, when you know that you have noise, at least under certain conditions, you can really establish in some sense that you have local equilibrium. Then the problem really is to show that the local equilibrium is not fluctuating in scales below the scales you’re interested in, because it’s nonlinear and if things fluctuate, then you can’t see what the limiting equation is. So you’ve got to make sure that there aren’t any unnecessary fluctuations and that’s part of the problem. You have to establish the local equilibria, then you have to show that there aren’t any unnecessary fluctuations, and then you derive the equation, which usually will be in some weak form. These are the three steps in doing it. And there is a systematic way of doing it.
You have a reversible system, for example, a particle system in equilibrium. Far away from dynamical systems, which are totally irreversible. If you write down the operator one of them is self-adjoint while the other is skew-adjoint. So they’re extremes. There are particle system models that have reversible dynamics in equilibrium. Then, you can bring in Dirichlet form techniques. See, part of the problem is that your initial states in these problems are far away from equilibrium. Because if you start in equilibrium, you stay in equilibrium and the PDE is that the derivative of the quantity is zero. Because in equilibrium all parameters are constant functions and so you plug it in and zero equals zero. So constants are trivial solutions. So if you start in equilibrium, you get the trivial solution which is no good. So you want to start in non-equilibrium and see how the approach to equilibrium takes place. That’s the whole game. So if you start in non-equilibrium, that means your initial state is some mess. You can’t really solve the Kolmogorov forward equation to say that at time t, my state is this, and for this state I try to prove the various properties. So you have to have an indirect proof, you cannot have a direct proof.

SHY: So you need to prepare the initial state.

SRSV: No! You have to handle it as it is! Even if you prepare the initial state, it’s lost immediately. Unless you prepare it to be constant equilibrium, it doesn’t propagate. Remember, the time scale is huge. So you need a way of controlling it. And you use large deviations to control it. The reason you use large deviations to control it is, no matter what the initial state is, its entropy relative to the equilibrium will be proportional to the volume. Because entropy adds up from site to site. So no matter what the initial state is, you can always assume without loss of generality, that the relative entropy with respect to equilibrium is proportional to the volume. Boltzmann’s H-theorem states that the relative entropy is non-increasing. You can calculate the rate of decrease of the relative entropy. You express it in terms of the state, differentiate and use the Kolmogorov equation. Do an integration by parts, which you can do because it’s reversible, and you end up with the Dirichlet form. In fact a slight modification of the Dirichlet form. The Dirichlet form involves the square of the gradient. Now the density is a positive function and you have the square of the gradient divided by the density function. So it’s the gradient squared of the square root of the density. So that’s the Dirichlet form. And the entropy is also positive, you know where you start and you’re always going down, and you have to stay positive. So that gives you a control on the Dirichlet form. Then if you have a control on the Dirichlet form, you look at the Feynman-Kac formula for the expectation of the exponential of the integral of some function. Feynman-Kac gives you a variational bound for these things in terms of the principal eigenvalue. Because if you look at the rate of growth of the exponential of the integral from zero to t of an additive functional of a Markov process, in the reversible case you can actually control it in terms of the principal eigenvalue.
The principal eigenvalue has a variational formula. The variational formula is in terms of the Dirichlet form. So if you have a control on the Dirichlet form, you can control these things. But, this is all in equilibrium. So you get very good control in equilibrium first. So bad things can happen. You prove that the probability of bad things happening is very small. How small? It has to be exponential of minus a large constant times the volume. So the error rate is even bigger than the volume.
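The two ingredients just described, entropy dissipation and the Feynman-Kac bound, can be sketched in formulas. What follows is an illustration under simplifying assumptions, not the general statement: a reversible diffusion with invariant measure $\mu$ and generator $L$ normalized so that the Dirichlet form is $D(g) = -\int g\,Lg\,d\mu = \int |\nabla g|^2\,d\mu$; $f_t$ is the density of the state at time $t$ with respect to $\mu$, and $V$ is a bounded function. All symbols are introduced here, and the constants differ for other normalizations and for jump processes.

```latex
% Entropy dissipation: differentiate, use the Kolmogorov equation, integrate by parts
\frac{d}{dt}\int f_t \log f_t \, d\mu
  = \int (L f_t)\,\log f_t \, d\mu
  = -\int \frac{|\nabla f_t|^2}{f_t}\, d\mu
  = -4\, D\!\left(\sqrt{f_t}\right)

% Since the entropy stays nonnegative, the time-integrated Dirichlet form is controlled
4\int_0^t D\!\left(\sqrt{f_s}\right) ds \;\le\; \int f_0 \log f_0 \, d\mu

% Feynman-Kac bound in equilibrium via the principal eigenvalue of L + V
E^{\mu}\!\left[\exp\!\int_0^t V(x_s)\, ds\right] \;\le\; e^{t\,\lambda(V)},
\qquad
\lambda(V) = \sup_{\int g^2 d\mu = 1}\left\{ \int V g^2\, d\mu - D(g) \right\}
```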

SHY: But the error is…?

SRSV: The log of the reciprocal of the probability of error is larger than the volume. Much larger, an order of magnitude larger than the volume. They are called the super-exponential estimates. And you can show this by using the calculus of variations, using the Dirichlet form. That’s just a calculus of variations issue, you can do that. Once you do that, then the error probability in equilibrium is very small. Then Jensen’s inequality tells you that, because the relative entropy of the non-equilibrium is only proportional to the volume and the error in equilibrium is super-exponentially small, the error in non-equilibrium is small. That’s just Jensen’s inequality, a simple inequality that can be used. So that’s how you control the probability of error in non-equilibrium and once you can control the probability of error, the problem is done (pause). Because I never said what the error was! The error by definition is: what you want minus what you have (laughs).
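One standard way to write the step attributed to Jensen’s inequality is the entropy inequality below. The notation is introduced here for illustration: $P$ is the equilibrium measure, $Q$ the non-equilibrium one, $A$ the error event, $N^d$ the volume, and $C$, $K$ constants. The assumptions are that the relative entropy is at most $C N^d$ and that the equilibrium error probability is at most $e^{-K N^d}$, where $K$ can be taken arbitrarily large (the super-exponential estimate).

```latex
Q(A) \;\le\; \frac{\log 2 + H(Q \mid P)}{\log\!\left(1 + \dfrac{1}{P(A)}\right)}
\;\le\; \frac{\log 2 + C N^d}{K N^d}
\;\xrightarrow[N\to\infty]{}\; \frac{C}{K}
```

Because $K$ is arbitrary, the right-hand side can be made as small as desired, which is the sense in which the error in non-equilibrium is small.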

TPL: What you want is the local equilibrium?

SRSV: Local equilibrium, lack of fluctuations, you have to use a different estimate for each one, but it’s the same principle. You control everything in terms of the Dirichlet form. So this works very well for reversible things, or diffusive things generally. But there are other problems. This is in fact the one I like most. See, in diffusive problems for spatial size n, your time is speeded up by n squared. So when you want to compute the hydrodynamic limit it’s basically a transport problem. So you get the current, if you calculate the current because of the conservation law, you get the current as the difference between two things. So there are two powers of n that you have to eat. Because it’s a conservation law and the current is the difference, you can do integration by parts or summation by parts once, and that gets rid of one n. The other n doesn’t want to go away, unless, in some models, it turns out by accident that what you have is the difference of two things. What you have is mean zero, because by definition it has to have mean zero in equilibrium. An object that has mean zero under every equilibrium doesn’t have to be the difference between two things. What do I mean by it? See, you’re on a lattice, you have something that is translation invariant and you have an invariant measure. If you have an object like f(x) - f(T(x)), its mean is always zero, because by translation invariance f(x) and f(T(x)) have the same mean. On the other hand this is very good for summation by parts, because you test it against a function, and your rescaling means that your test functions are going to be slowly varying in the atomic scale. So if in the atomic scale it is a difference of two things, and you smooth it out against your test functions, you get a free summation by parts, which brings out an extra factor of 1/n. Models with this property are called gradient models.
So if you have a gradient model, the method that I just described works fine. But sometimes you get a function that doesn’t want to be written as the difference between two things, and then you have to do something very exotic. You have to do essentially a Hodge lemma in infinite dimensions. So you define an abstract Hilbert space, your objects are functions having mean zero with respect to every equilibrium, an example of that is the current. Because in equilibrium the current has no net flow. Our system is reversible in equilibrium. And then the difference in density between two sites is also mean zero. So if you want to know the transport, then that’s what you want to deal with: the difference of density between two points, because when you plug it in, you get d rho. That’s the gradient of densities so to speak. So what you need to understand is how you can replace a function with mean zero by just a density gradient. You have to show that the difference is somehow negligible in some sense. It’s a little complicated to describe without the chalk.
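The "free summation by parts" for gradient models can be illustrated with a small numerical check. This is a sketch under stated assumptions, not a model from the interview: a periodic lattice of n sites, an arbitrary bounded local quantity `h` (a hypothetical deterministic sequence), a current of gradient form w(x) = h(x) - h(x+1), and a smooth periodic test function `G`. The function name `summation_by_parts_demo` is also hypothetical.

```python
import math


def summation_by_parts_demo(n):
    """Compare sum_x G(x/n) * w(x) with its summed-by-parts form.

    w(x) = h(x) - h(x+1) is a current of gradient form on the periodic
    lattice {0, ..., n-1}; G is a smooth test function of period 1.
    Returns (lhs, rhs, bound), where bound = 2*pi*max|h| does not grow with n.
    """
    # Hypothetical rough but bounded local quantity, chosen only for illustration.
    h = [math.sin(3 * x) + 0.5 * math.cos(7 * x * x) for x in range(n)]

    def G(u):
        return math.sin(2 * math.pi * u)

    # Direct sum of the test function against the gradient current.
    lhs = sum(G(x / n) * (h[x] - h[(x + 1) % n]) for x in range(n))

    # Same sum after moving the difference onto the slowly varying G.
    rhs = sum((G(x / n) - G(((x - 1) % n) / n)) * h[x] for x in range(n))

    # Each increment of G is at most 2*pi/n, so |rhs| <= 2*pi*max|h|.
    bound = 2 * math.pi * max(abs(v) for v in h)
    return lhs, rhs, bound


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, summation_by_parts_demo(n))
```

A sum of n terms of order one would naively be of order n; after the summation by parts each term carries a factor of order 1/n, so the total stays bounded. That is the extra factor of 1/n a gradient model provides.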

TPL: That’s fine (laughs). That is so much that you have described. Let me go into something lighter. You have been at the Courant Institute for many years. You want to say something about the people there, the place, whatever?

SRSV: Oh, I think…well, you’ve been there, so you know how it is.

TPL: (laughs). But not nearly as much as you…maybe ten percent of the time when you were there.

SRSV: Well, it’s a nice place because it’s like a family in some sense. They take care and nurture their young people very well. The senior faculty always bend over backwards. Another thing is that people are always there and are accessible. Anytime I have a question to ask I can call and ask anybody and they are there every day. Some places, I know if you want to meet somebody and you have to make an appointment a month in advance (laughs). Here people are there every day, they’re at tea, they’re at lunch, you ask the things you want in a conversation. And another thing, the Institute has been functioning fairly independently of the university for a long time. So they have been able to function without much bureaucracy…that’s changing now, however. I think like most American universities, NYU is also becoming more like a corporation. The presidents like to think of themselves as CEOs. It’s become a business. I don’t blame them, because the government support for education is dwindling. So they have to run it like a business in order to raise money to support their activities.

TPL: Also the emphasis of Courant Institute has changed over the years—the field of study.

SRSV: That’s because the mathematical scene is evolving. 30-40 years ago it was important to study elliptic equations, pseudo-differential operators and things like that. Now it is more nonlinear; nonlinear hyperbolic equations are what people are looking at. I think what’s current in mathematics also changes and hopefully we change with it, otherwise we end up behind. And also mathematics has become more computational these days, large scale computation has become an essential tool in mathematics. With larger and larger computers you can do more and more, who knows, when quantum computing comes, you can do even more (laughs).

TPL: I was told that this will not happen very soon (laughs).

SRSV: That’s what they said about going to the Moon (laughs).

TPL: Alright, I see (laughs). You have a connection with India much these days?

SRSV: Yes, I try to go once a year or so. Not to actually go and stay and visit one place, but spend a holiday, just to be in touch with the various places. This year in February I’ll probably spend a month in Chennai at the institute there. I think they were interested in having some probability program. Stroock was there last month for a month, and I’m going in February to continue from where he left off. Hopefully he’ll go back next fall and I’ll go back the spring after, so, there will be four 1-month courses which will be a continuous stream.

TPL: With India’s economy improving rather substantially, naturally Indian science will also go up.

SRSV: I hope so, I hope so, but that’s not given, because unfortunately Indian education is very poor in science. University education is controlled by the states, and the states have their own agenda. Although there are a few people who are interested, they can’t do everything, you need to have people at all levels cooperating and that’s very difficult. So what will happen is that the central government will start new institutions in a few places, and they’ll be good, because they’ll keep the state governments out of it. But once the state governments get involved it will be a mess.

TPL: But when you were young there…

SRSV: It was not politicized.

TPL: I see.

SRSV: You know, it turns out if you’re from one state, you can’t be hired as a professor in another state. If you don’t get a PhD from a professor at the same University, you can’t be hired. Inbreeding is enormous.

TPL: But in Indian culture, people like very much intellectual pursuits.

SRSV: That’s what we keep saying (laughs). Here is the point. It’s true, but people are…see, most people come from a middle class background. This kind of wealth you see now is very recent. So when you come from a middle class background, your emphasis from day one is that education is a way to get rich, or to get up the ladder of wealth. So economic benefit is one thing you are looking for in education. So, you look around and you see that IIT is your best bet. So the top students take the competitive exams, pass them and get admitted to the IIT. And they get their degree and they do very well and the top amongst them go to the United States to get a PhD, masters or whatever, and settle down in Silicon Valley. Some reach high levels in corporate management. And some of them go back, you know, I’m not saying they don’t. But the cream of the crop does not stay in India and do basic science, is what I want to say. There are of course exceptions, there are a few people. But you cannot staff the next generation of college faculty with a few exceptions. So if the government decides tomorrow to start wonderful new universities, twenty different universities in India, where is the faculty coming from?

TPL: It’s a slow process.

SRSV: Takes two or three generations to build it up.

TPL: It seems to have begun to happen.

SRSV: It will happen! I’m sure it will happen, because the U.S. may not be able to hold on to these people (laughs). The U.S. economy may be on its way down, unless the government changes its ways and acts more responsibly. The national debt keeps going up by trillions. The current leadership will not repay it. Their children and grandchildren are going to have to pay. What I’m saying is that there are forces that are making it more attractive for people to return to India, that’s what I’m saying.

CRH: I was thinking about because you received the Abel prize, with that clout, could you talk to the prime minister of India, try to influence certain things?

SRSV: No, I don’t.

TSC: Maybe in Chennai.

CRH: You can do something.

SRSV: No, no, I can do something in the sense that I can do something locally. They may ask me to be a member of some committee or whatever, and I can put in my two cents worth. If they want, they can follow it, otherwise they can ignore it (laughs). Take the simple issue of talking to the United States about the nuclear plan. It’s clear that energy is going to be important for India, and you can’t expect to grow at the rate we are growing. They’re talking about putting a million cars on the road every year (laughs). Of course there are a lot of people with cars and a lot of people without. People who don’t, want to have the same privilege as those who do. And economically if they can afford it, who can stop them? Should the government decide that it is good to have nuclear reactors, develop nuclear energy? You have to be careful. But I think that it is an alternative that every country has to face. France has been doing it for years. Don’t know if Taiwan is?

TPL: Yeah, there’s a lot of debate on that.

SRSV: You can’t afford not to, otherwise you’re working very hard to send the money to somewhere else!

TSC: Dubai.

SRSV: Russia even in these days. But on the other hand, the political system is such that the party in power is a coalition and the leftists are there. The Leftists are the Communist Party of India, and they want to out-do Mao (laughs).

TPL: It just can’t be done (laughs).

SRSV: So they’re still clinging to the idea that India’s agreement with the United States is a no. So they’re refusing to allow the government to negotiate, so I think the deal will fall through. Do they care what’s going to happen to the energy problem in India? No, they don’t. They only care about their ideology, and they want to maintain their ideology, so they will go and tell their constituents that they defeated the infidel United States from coming in and they’ll get some votes and get reelected, so that’s what they care about. That mentality is there throughout and unless that changes, mere economic gain is not going to radically change the outcome. Some people will get rich and live well, but that doesn’t really solve the problem.

TSC: That’s very different from Taiwan. You could get the Abel prize, talk to the president directly.

CRH: You mean the prime minister, because…

SRSV: No, well, people are nice. The prime minister sent me a congratulatory telegram immediately. That’s…you know, they tell you nice things: “The next time you come to India…” and so on, but that’s the end of it (laughs).

TPL: In some sense they have a point. Running a country is a complex thing.

SRSV: Sure, sure! In the end they have to do what they have to do.

TPL: But still, respect for the intellectual, in whatever the consequence is, or whatever way it is carried out that’s another matter, but respect for the intellectual is there, that’s valuable I think.

SRSV: You know one thought was, now that the middle class is getting richer and can afford it, maybe you can ask some of the foreign countries to start private universities in India. But I think there’s a lot of opposition to it. They don’t want cultural imperialism.

TPL: I see. There are always possibilities…

SRSV: Well, I think eventually the problem will be solved, right? There will be enough demand for…I think what will happen is, the Indian industries, the IT industry and so on, all of them, still depend on the West for their research; and they’re very good at taking off from the prototype, and doing things. But of course that’s not where the money is. Real money is in having patents, and that requires one’s own research, otherwise you’re just a sweatshop for somebody else. I think people understand the importance of that. So some of the industrialists are starting their own research labs. As this culture develops, they will demand people to work for them and then government will respond: “You need more biologists, ok, so we’ll train more biologists.” So at the moment, you can’t just train four excellent people to do research, because four excellent people to do research will come if you develop four hundred scientists. But for four hundred scientists there are no jobs! Then they won’t go in. So that’s the stage we are in right now. There must first be a demand for scientists from the general industry and then the progress in basic sciences will happen.

TPL: Sounds like how the research is developing in the first place, right? Ok, I guess…thanks very much. I can see very much that you’re very concerned about many aspects of the society. That’s great! Come back often to Taiwan.

SRSV: Thank you.

  • Tai-Ping Liu, Chii-Ruey Hwang and Tzuu-Shuh Chiang are faculty members at the Institute of Mathematics, Academia Sinica.
  • Shih-Hsien Yu was a faculty member at the City University of Hong Kong and is a faculty member at Academia Sinica starting July 2021.