Archive for December, 2005

The Great Divide

A friend of mine, who studied Computer Science in IIT Madras [while I was in Electrical Engineering, consciously avoiding programming as much as I could] and later got a PhD in the subject told me, “You lucked out. If you had done Computer Science then, you would be in academia writing papers, not creating software.”

It was funny he said that, because once upon a time I really, badly wanted to be in academic research, proving theorems, writing papers. That is what I did for my PhD in Information Theory. Then towards the end of my PhD I had an epiphany, a religious conversion experience, if you will [for a guy who is not religious, I sure have had a lot of religious experiences!] and moved to industry. That path eventually led to software.

I agree with my friend. There is this great divide between these two worlds. I saw that when I worked on CDMA in Qualcomm. See my earlier post Two Philosophies in CDMA: A Stroll Down Memory Lane for some conclusions I reached. The interesting part of it was that I never worked on CDMA for my PhD research, even though my advisor was the leading figure in the field. It wasn’t part of any plan, CDMA seemed too practical, so I naturally gravitated to more esoteric theory [I told you I was like that]. It fit well with my world-view at that time.

And when I finally moved to industry, my not having worked on CDMA in my PhD proved to be a blessing. I could learn the practice of it unfettered by any preconceived ideas, then go back and read the academic literature, and make my own judgement. It was just coincidental that my conclusions happened to be unfavorable to academic research, and even more coincidental that my advisor was the leading figure on the academic side of it. I just called it as I saw it.

Anyway, that is just so much history, not relevant to my current passion, software. Yet, as I have gotten deeper into software, I am starting to read more of the academic Computer Science literature. I have tried hard not to let the previous history [in a different field] bias me, but to be honest, I have not been very impressed with much of the work going on in academia.

In software, abstractions and design patterns are very important. They let you solve a problem once, and use that solution in related contexts. Abstractions promote more expressive, high level software, making the job of the human programmer easier. Ultimately the purpose of programming languages is to make the human programmer more productive, allowing him or her to deliver software faster, with higher quality.

So it is natural that I find myself spending more and more of my time with programming languages. At least that’s my official excuse. The last 10 years have seen a veritable explosion of languages. Many of them have gained popularity, and the field is vibrant, with new developments like Ruby on Rails [technically a framework, not a new language, but frameworks and languages are closely related]. The productivity impact of these developments on the practicing software engineer has been enormous.

Java alone has saved untold hours to countless engineers in testing, debugging and porting software. It has made programming easier and more accessible to more people than its predecessors. Languages like Python and Ruby have continued innovating, and it is possible Ruby & Rails may deliver even more productivity.

Yet, what is the academic programming language literature talk about? Type systems, monads, category theory, … Not that there is anything wrong with studying theory, but even for someone with as much mathematical background as me, the stuff is just mind-numbing to read. In other words, a lot of it reads like my PhD thesis [there, I said it, I have let my history bias me!]

I think I know the reason: even when something can be stated simply, in plain words, academics have a preference to state it in estoeric terms. It feels good to talk a language that most mortals won’t understand, we get to feel special. It is not because the stuff cannot be stated any simpler. A lot of that comes from insecurity too. If ordinary folks can understand the stuff, may be they will realize it is not as impressive as it sounds. Frankly, most of the academic literature is not all that impressive, if you strip the jargon away and state it in plain language. And papers won’t be accepted in pretigious journals if they don’t have the requisite amount of esoteric but often useless jargon.

I know the typical paper-publishing pattern: take a simple idea, clothe it up in a lot of mathematics, make it sound as impressive as you possibly can, to the point that the simple idea is now lost in a whole mass of mathematical jargon. It is often the case that the core idea has no merit at all, in which case it is all the more important to pretty it up mathematically - that pig really needs all the lipstick you can put on it. By whatever means, the paper has to be published. How else do we get grants and PhDs and that all important tenure? I remember thinking if only the taxpayers who fund a lot of these paper-mill adventures knew where their tax dollars [and Rupees and Euros …] were going …

Religion offers interesing parallels. The old Brahminical religion emphasized Sanskrit and it was confined to the priestly class. Buddha taught his message in Pali, the language of the layman in his day. Jesus was a revolutionary. Their evangelical, universalist ideas changed the world. The academic priesthood will do well to remember that religious history. Then again, it is not exactly in the interest of the priesthood to let the layman to find out what is going on, is it?

To quote Richard Feynman

Quote:
If you’re representing yourself as a scientist, then you should explain to the layman what you’re doing …

And this from Albert Einstein

Quote:
You do not really understand something unless you can explain it to your grandmother.

I have practiced this. I have held forth on software concepts to poor unsuspecting relatives and friends, usually young cousins who won’t make a quick get away. My poor wife has been on the receiving end of more database concepts than she would care to remember. But the experience has been valuable [to me, that is]. When I find that I am not able to explain something well, that is almost always because I haven’t understood it myself.

Programming languages are not rocket science - heck, rocket science aint rocket science. The vast majority of programming languages in regular use have been created by programmers, not Computer Science Professors.

Two Philosophies in CDMA: A Stroll Down Memory Lane

I have now worked a little over 10 years in the industry, after getting my PhD. In my very first year of work at Qualcomm, I noticed how even when speaking about the same subject, namely CDMA, academia and industry were on totally different planets. When I was in Qualcomm, I co-authored a paper with Dr. Viterbi, titled Two Different Philosophies in CDMA, A Comparison. I still stand by the conclusions we reached, though there was a fairly strong rebuttal from my PhD advisor Prof. Sergio Verdu. His paper is listed in http://www.princeton.edu/~verdu/mud.html and I had located a scanned copy somewhere that I don’t remember now. I will post a link soon. A somewhat related paper by Verdu is http://web.mit.edu/6.933/www/Fall2001/Shannon2.pdf

A very quick summary of the conclusions we reached 10 years ago:

1. Multi-user interference is a major problem in CDMA. Your first line of defence against it is a powerful error-correcting code. You can potentially supplement it with non-linear techniques like successive decoding and successive interference cancellation, but they suffer from parameter estimation errors, so are not [as of 1995] yet anywhere close to being practical. Instead of non-linear techniques, our focus in the paper was on linear techniques common in academia.

2. A second line of defence had been very common in the literature, indeed the exclusive focus of study in the academic literature [true in 1995], namely linear dimensional techniques. Many people superficially claimed these techniques were applicable to commercial Qualcomm CDMA systems then already in the market. They were not. The fundamental system model the academic community used to study their interference suppression techniques differs markedly from Qualcomm CDMA. The systems were philosophically different, and the academic CDMA model was clearly inferior from an engineering point of view.

In spite of Verdu’s rebuttal, these conclusions have held up remarkably well. Today, the error-control code has gotten even more powerful: turbo-codes, which have caused a revolution. The dimensional techniques have been progressively abandoned. It vindicates my conviction that the academic community was barking up the wrong tree, and what they were calling CDMA was not the same CDMA the world has come to know and love and it was clearly inferior as an engineering design. Strong words, right?

A recent overview by Jeff Andrews at UT Austin Interference Cancellation for Cellular Systems: A Contemporary Overview caught my eye, and it referred to that 10 year old controvery (the reference [18] in the quote below is my paper and [19] is my advisor’s rebuttal)


Quote:

A controversial paper from QUALCOMM [18] identified at an early stage many of the fundamental problems with academic research on MUD, particularly with the widely researched linear projection techniques of the early to mid-1990s. Although this article made a number of mistakes (as partially documented in [19]), it may have helped spur more intensive research on interference cancellation from the late 1990s to the present. We now summarize the key challenges and historical shortcomings in multi-user receiver implementation, some of which were identified in [18], but will revisit some of these issues later when forecasting a bright future for multi-user reception.

Certainly, the “controversial” part is conceded. I was basically committing academic patricide, criticizing the very work that made my advisor famous. But I do stand by the essence of my critique. After all, Jeff Andrews states that my paper “identified many of the fundamental problems with academic research on MUD” - and the rebuttal by my advisor doesn’t really address the central thrust of my paper: namely that there is a fundamental difference in the mathematical system model studied as CDMA by the academia and the one built by Qualcomm and the academic model leads to misleading conclusions. My advisor has been “Dr.CDMA” in academia while what I consider the “real” CDMA (a fair characterization, because Qualcomm-is-CDMA, to this day) has been practiced in the industry. It is not an academic (!) debate: CDMA is now the dominant standard in cellular communications, and barring a challenge-from-below from Wi-Fi/Wi-Max, it should rule the roost for a long time.

We called the academic CDMA D-CDMA (for “dimensional CDMA”, and you can also consider it “delusional CDMA” - just kidding!) and the industry CDMA as R-CDMA (for “random CDMA” or “real CDMA” if you will). D-CDMA system model assigns signature waveforms (dimensions) to users, and deals with their correlations at the receiver using a variety of linear and non-linear algorithms . If the signature waveforms are orthogonal (the ideal situation), then D-CDMA is theoretically equivalent to TDMA. D-CDMA system model simply did not assign any role to error-correcting codes, a significant omission [see below]. Certainly, no paper by my advisor or collaborators until that point mentioned error-correcting codes, and it is fair to say that the D-CDMA model popularized by Dr.CDMA remained by far the most popular model in academia. I contended then, and stand by it now, that this is a significant problem.

In contrast to D-CDMA, R-CDMA is a deceptively simple, even simplistic idea: simply randomize all the users with very long spreading codes, so that every other user looks like Gaussian noise, then use powerful error-correcting codes against that noise + noise-like interference. It can be improved by the well known technique of successive interference cancellation (which my advisor called a proof technique, and suboptimal in practice, in his rebuttal but it is certainly more practical - but not very practical yet! - than many of the algorithms common in the D-CDMA literature).

Qualcomm’s CDMA system, designed based on a very different system model than what was considered in academia, was already in the market when we wrote the paper, so the fact that the academics were still researching their clearly inferior system model that treated demodulation/decoding separately, illustrates how ostrich-like academia often becomes, and why I decided to leave. Papers were published, tenures granted, reputations and careers made, and yet, the basic system model just didn’t make sense, at least to me.

In essence, our critique was not about multi-user demodulation per se but the incorrect system model adopted by the academic community to analyze “CDMA”. The D-CDMA model simply took a very statistical detection theoretic view of the world, while ignoring the fundamental role played by coding techniques. In any practical (or theoretical!) communication receiver, statistical detection (matched filtering and other statistical signal processing like parameter estimation) and decoding work together. Any system model and analysis technique that does not consider the system holistically is prone to reach erroneous or at the very least, misleading conclusions. Yet, the vast D-CDMA academic literature was full of such statistical-detection-based analysis (they call it “demodulation”, to separate it from “decoding”). A telling sign is the emphasis on SNR in the D-CDMA literature, a notion useful in analog parameter estimation, rather than the more appropriate Eb/No, a notion that is only relevant to digital communication. In fact, D-CDMA literature, including almost every paper from my advisor, made the cardinal sin of evaluating digital communications systems based on the uncoded bit error rate. The effect of that is that these systems do not display the characteristic “cliff” whereby the bit error falls steeply when a threshold Eb/No is crossed - that cliff makes analytical methods intractable, so simulation is all but required. All in all, my advisor’s rebuttal essentially missed the central point of my paper, which I believe was reasonably well articulated.

Jeff Andrews provides further confirmation of my critique. To quote from his overview again (emphasis mine)


Quote:

For asynchronous systems like the cellular uplink with random user timing offsets, mutually orthogonal codes are not feasible. But again, multi-user receivers that attempt to project the received users into orthogonal dimensions do not stand up to careful scrutiny. In particular, most of these techniques require the construction of a code cross-correlation matrix. But for the long PN codes generally desired for multi-path properties, this matrix must usually be recomputed every symbol, which is impractical. Even more important, it is now well established that strong error correction codes (ECCs) should be used in the uplink (and the downlink, for that matter). The spreading gain of these codes is lost on dimensional multi-user detectors. Furthermore, while the goal of these codes is to lower the required received signal-to-interference-plus-noise ratio (SINR), this complicates channel estimation and other aspects of dimensional MUD. Since linear projection receivers require spreading to be performed by a linear operation, there is direct competition between ECCs and dimensional (linear) MUD [24]. However, as discussed next, interference cancelling multi-user receivers not only do not suffer from this trade-off, but in fact need ECCs to attain their high performance, a result supported by information theory [25–27].

Most researchers have come to recognize that in a highly frequency-selective and rapidly changing channel with many users, dimensional separation techniques at the receiver are not viable. Further, it has been realized that effective MUD designs must integrate error correction coding/decoding into the receiver structure, rather than treating coding as a separately concatenated block. For this reason, successive [30–32], parallel [33, 34], and iterative interference cancellation (turbo MUD) [35–38] appear quite attractive.

A one-line summary of our paper would be precisely that dimensional-separation techniques at the receiver are not viable. Thank you Jeff, for making that point! I wonder if Prof. Verdu accepts your conclusion, because his rebuttal was still optimistic about those techniques. If he does accept this conclusion, it would be nice to receive an acknowledgement from him, that, yes, his student got that one right.

The reason for the love affair with statistical detection, to the exclusion of decoding, is the ivy-league academic quest for “closed form” mathematical results, at the expense of any kind of real world relevance. If your PhD doesn’t have the requisite number of Lemmas and Theorems, forget it - that seems to be the attitude. Turbo codes left leading coding theorists in the dust, with a radically simple and practical approach, which essentially achieves Shannon capacity (and that was only “proved” through simulations, no theorems, leading to enormous scepticism about the work initially). My advisor, a closet mathematician, always loved the closed form, theorem proof approach himself, and essentially prohibited me from doing any kind of “simulation work” to get a PhD. To be perfectly fair to him, that fit very well (too well!) with my own IIT Brahmin attitude about not getting hands dirty with something as mundane as writing code, an attitude I got out of after I had an epiphany that my entire PhD was useless abstractions piled on top of useless abstractions. That was what led me to abrubtly end my academic career. Looking back after 10 years, that was the single best career decision I made in my life.

When I entered Qualcomm I realized what a valuable tool the computer is to study real world systems, and how limiting the quest for “closed form” can be. In fact, my introduction to the world of software arose from modeling and simulating communication receivers, and I ended up embracing software as a career eventually. I read somewhere that Wolfram quit academia and went into software for what sounded to me (may be I imagined it!) like similar reasons: the Wolfram thesis asserts that the universe is equivalent to a Turing Machine and computer programs are the best tools to study the universe. Wolfram seems to deemphasize the theorem-proof methodology himself - in fact, he asserts in his book that most interesting questions are logically undecidable (i.e unprovable). I agree.

This is not to criticize mathematics; indeed far from it. Simple analytically tractable models (i.e the closed form again!) are illuminating, but they cannot substitute for more realistic models that can so often only be simulated. In particular, the holistic demodulation/decoding system cannot really be handled in any closed form solution I know of. Often, simulation can and does lead to insights into theory and can open theoretical doors that we may not suspect existed.

The reason I am writing this is to set the record straight. I have long ago moved on, and have transferred my affections to software. CDMA is like a long-lost old flame for me - it still excites some passion, but I am happily married to software now ;-)

With the benefit of hindsight, read our paper, and read my advisor’s rebuttal. Then read Andrews’ recent survey. See if the points we made about the flawed system model that is D-CDMA resonate, 10 years later. As for the “mistakes” pointed out by my advisor? Just one sample would suffice. His rebuttal [see section 2.5 in his paper] quotes the “misconception 2.20″ (not specifically attributed to our paper, though the implication is there) that “Multi-user Detection is Successive Cancellation”. We didn’t say that - what we actually said was that if you want real multi-user detection, there is always successive cancellation, a technique well-known and something Dr. Viterbi, my co-author was well-aware of, and had written about long ago in the CDMA (R-CDMA) context. There are still numerous practical difficulties with it, having to do with the fact that various parameters have to be estimated in a noisy changing environment. Guess what? Andrews optimism for MUD is now based on this same technique, dubbed as “misconception” by my advisor. Deja vu all over again …

Here is a pure speculation/suggested PhD thesis: in statistical parameter estimation, there is the Cramer-Rao bound (it is a sort of the analog-equivalent of Shannon capacity, which is a digital construct). It basically says that your accuracy of estimation is limited by the variance of the noise. I have a strong suspicion [I am willing to stand corrected if I see some good results] that the regime of signal to noise you operate in CDMA makes any kind of accurate parameter estimation, which is needed for almost all of these multi-user techniques, including successive cancellation, quite difficult. There is an interplay and tension between bit rate maximization (digital communication) and accurate parameter estimation (analog communication, you assume that the channel is “communicating” those parameters to you!) that I have long wondered about, but done absolutely nothing more than wonder about. Note that if the parameters to be estimated don’t change much with time, you can average over a long interval, so you can do OK. But the real world is not so nice.

Basically after a long detour into inapplicable techniques arising from a faulty system model, academia is finally getting to a point that some of us in the industry reached 10 years ago. Welcome to the real world, guys.

What is Declarative Programming - Part 2

The previous thread What is Declarative Programming got too big, so I am splitting it up. An Old Acquaintance said

Quote:
My belief is that HTML + interpreter (i.e. browser; no pun intended there) isnt a meaningful model of computation at all.

As an aside: declarative progamming means just stating the truths or the invariants and the interpreter computes logical consequences thereof. This computation can be channeled into solving problems (how the interpreter goes about this search is not important to the person writing the declarative programs; they just declare).

The HTML + browser combination does satisfy what you say in the second para there. I simply declare I want a title, and it figures out how to do it. Think about the million details it abstracted from me. The display device could have varying resolutions. The default font could differ. The operating systems are very different. Yet just a simple declaration of title does the job.

It is a very meaningful model of computation, when you consider the sheer number of details it abstracted away from you, to be performed by the browser. The reason it doesn’t appear so meaningful to you may be that its output of the computation goes to the visual display device.

Quote:
Actually you are confusing two things:

1) Levels of abstraction (high-level, assembly, gates, transistor, string theory..)

2) Models of computation (declarative, imperative etc.)

I disagree. I would say what you call “models of computation” is a very imprecise notion. Fundamentally it resolves to a syntactic notion, in my opinion. Take a well designed library that insulates the user from the details of the underlying implementation, so the library user simply declares what he wants done, and the library figures out how to do it. If you restrict your computational power, so that you are restricted to using just those library primitives, and the rest of the language isn’t Turing-complete, then your code is, at some level “declarative”. Note that the implementation of the libary itself may need a Turing-complete language, but the library user is restricted from that.

That is what HTML is doing here. The contract is the Document Model (DOM). You are allowed to use those DOM primitives (aka HTML tags) but nothing else. It definitely is easy, a lot easier than creating the implementation that interprets those tags. It is technically possible (indeed that is what Gecko layout engine from Mozilla does) to provide the DOM itself as a C++ API. This allows you to embed the DOM API in your code. In this case, you have “imperative” C++ code calling “declarative” DOM C++ APIs. The reason why this is harder than HTML design is that we would have mixed up the purely “declarative” DOM-API-calling C++ code (i.e the equivalent of HTML code) in all the other general C++ code. The divide-and-conquer strategy to master complexity has been lost.

Note that Javascript + HTML is _exactly_ analogous to that C++ with DOM API, except that the Javascript gets access to the DOM through the HTML tags, and not directly with the C++ API. They live in distinct syntactic spaces, enforcing the separation between the “declarative” and “imperative”.

Now, what is a library and what is a language feature is a distinction without a difference. In languages like Lisp, there is not even a syntactic difference.

I venture that what is traditionally known as “functional style” (again an imprecise notion, mostly overlapping with the “declarative style̶ ;) really amounts to writing well-designed libraries and macros, and then gluing them together in a “declarative” fashion. It would be possible to change the small parts sitting in the libraries (the implementation) without changing the glue that spans across them. It is bottom up in terms of development (libraries/macros and then the glue) but you can understand it top-down. In general, library/macro writers (equivalent to browser writers) need to be more skilled in programming than the declarative glue writers, who would need to be skilled mainly in the “domain” (User interface design in the HTML case).

But since Lisp encourages the whole code to sit in a common syntactic space, we can get into “logically undecidable” disputes. Two Lisp gurus can select differing ways of partitioning of what is library and what is declarative glue, and both would be “right”. Selection of what primitives to offer in a library is akin to selecting which set of axioms are more fundamental, which ones should be derived. In practice, good libraries give a redundant set of primitives (so there is more than one way to do something in the glue layer above), for syntactic convenience. HTML and LaTex do similar work, and both are declarative, but they don’t standardize the same primitives.

Frameworks (another fancy word for libraries) like J2EE or Ruby-on-Rails, and languages like HTML, standardize the library layer. And with time, more and more of what was previously imperative glue migrates into the library, making more and more of the resulting glue declarative.

What is Declarative Programming?

The Wikipedia entry Declarative Programming is a good place to start. To quote from that entry

Quote:
… declarative programming is interested in the what, but leaves the how up to interpretation. This is why declarative solutions are said to have two independent phases: declaration and interpretation.Declarative languages describe relationships between variables in terms of functions, inference rules, or term-rewriting rules. The language executor (an interpreter or compiler) applies a fixed algorithm to these relations to produce a result.

Among the languages listed as being declarative in the Wikipedia are Haskell, SQL, WSDL and XML style-sheet transformation. HTML would qualify too. As we can see from that list, there is nothing obviously in common across these languages (consider SQL vs XSLT for example) except that they are all Turing-incomplete. From the Wikipedia entry again

Quote:
The major disadvantage of declarative programming is that it is not Turing-complete � an imperative language can solve any computable problem, but a declarative language can solve only a narrow subset of special problems (those that the very non-universal engine was designed to solve). For example, among imperative languages, Cobol or assembly language can, however awkwardly, solve any problem solvable by Java or Ruby; but any particular declarative language can only solve a problem of optimization, of business rules, of data storage, of data access, of formal logic, or of some small and miscellaneous category.

But I don’t view the Turing incompleteness as a disadvantage at all. In my earlier post HTML + Javascript: The Benfits of Turing Incompleteness I discuss how combining a Turing incomplete language (HTML) with a Turing complete language (Javascript) results in a better solution than doing everything in Javascript alone (which is technically possible, of course). The HTML interpreter in the browser lets us specify the UI more easily and at a higher level of abstraction. In effect, the HTML is acting like a declarative JFC or MFC API (APIs are declarative sub-languages, a subject for another day!). But unlike the situation with JFC or MFC APIs, the HTML sits in a separatic syntactic space from the Javascript, and that separation proves crucial. Typically a visual aesthetics person designs the HTML while a programmer codes up the Javascript and both are happy.

It is the same story with SQL and Java/C#/PHP/Ruby - the SQL sits in a separate syntactic space, so queries can be optimized after-the-fact by a trained database administrator.

OK coming to some questions I have: Is there a mathematically rigorous definition of declarative programming or is it an imprecise human notion whose only real mathematical property is that it is Turing-incomplete? But that is not a very satisfying definition. And to add to the difficulty, would any Turing-incomplete language qualify as being declarative? Very likely not, yet, any given Turing-incomplete language may be equivalent to or be transformable into a declarative language.

If there is indeed a mathematical definition, is there an algorithm for deciding when a language is declarative? I suspect not.

Apart from their relative ease, one interesting property of declarative languages is that they are provably halting (and therefore provably secure). HTML and SQL are both secure (provably secure?). Could this be the basis of an operational definition of declarative - that every program written in a declarative language provably halts?

HTML + Javascript: Benefits of Turing-Incompleteness

What is the most popular language today? HTML. Spreadsheets are another very popular style of programming. SQL is enduringly popular, in spite of the cognoscenti writing off relational databases as “legacy” - something I vehemently disagree with, by the way.

What is in common across these programming systems? They are each optimized for what they do, and more importantly, they don’t attempt to do everything. In programming language terms, they are Turing-incomplete i.e there are many kinds of programs you cannot write in those languages. Languages like C or Java or Lisp or Ruby are general purpose “universal” languages, while HTML (without Javascript) is limited to the purpose of sending a document for display on a browser. SQL (without the vendor proprietary extension languages like PL-SQL) serves to query relational databases. Spreadsheets (without macros) allow you to do rapidly build a variety of number crunching applications.

When we need the full power, we combine HTML with a Turing complete language Javascript. Technically HTML + Javascript is equivalent in computing power to Javascript alone, but that is no reason to discard HTML. Same way we embed SQL in a Turing complete language like C or Java or equivalently, use something like PL-SQL or T-SQL to build applications.

What the success of these ideas shows is the power of combining a Turing incomplete “domain specific” language like HTML or SQL, and embedding or augmenting it with a universal language. The resulting solution turns out to be far simpler (for the humans) than writing everything in a universal language. It is the old divide-and-conquer strategy at work.

HTML and SQL are known as declarative languages - we just tell the computer what we want, without specifying how it gets done. By adopting the HTML + Javascript or SQL + Java solution, in effect we extract a declarative core of the application and separate it from the rest of the program. Declarative programs are easier for the human to write as well as understand - it is always easy to declare “I want a cup of coffee with milk and sugar” rather than specify the steps involved with “Grind the coffee beans, put it in the filter, ….” One problem with specifying all the steps is that the coffee-maker now has no flexibility - for example, it cannot pre-grind the beans in anticipation.

Keeping it declarative is not only simpler for the programmer, but it also allows the run-time system to optimize it. SQL databases optimize queries, and browsers cache HTML pages.

A good part of what is known as “functional style” is really about developing such declarative sub-languages. Lisp macros enable that style. Relational queries follow the same declarative style.

Eric Janszen’s Article in Always On

I read a brilliant article written by Eric Janszen, CEO of Autocell, titled The Big Bet Bet in the Always On site. It captures the damage being done by the easy-credit, “Greenspan Put” (I guess soon to be the Bernanke Put) policies of the Federal reserve to the long term health of the American economy. To summarize: financial activities are prospering, while real productive activities are struggling. Inequality is rising. None of this is the result of capitalism or the free market, as some foolish economists will have us believe. To get a sample of mainstream macro-economics thinking, see http://blogs.adventnet.com/svembu/2005/11/26/joining-the-quotreal-worldquot/

The libertarian Austrian School economists have long warned of the consequences of inflationist monetary policy. What we are witnessing in America is a Federal Reserve-induced perversion of capitalism and free markets. The mainstream completely misses the debt-induced nature of America’s “growth” and the long term erosion of competitiveness. It falsely attributes the loss of productive capacity, the stagnation of incomes at the lower levels and the rising inequality, to globalization, when in reality it is the financialization of the economy that is causing this (Germany and Japan, even with numerous other problems, have retained their engineering productive edge).

I am afraid that the Federal Reserve now poses a risk to capitalism itself, because if this goes on unchecked, populists of all stripes will emerge to sell their usual socialist quack remedies, repackaging them under different names.

Here is an excerpt from Janszen’s article (highlights mine):

Quote:
All these [business] leaders understand, but never admit, that the motivation and incentive for Americans to resolve these critical problems�to improve our education, healthcare, and energy systems; to control our debts, live within our means, and so on�have been gradually reduced by the U.S.-dominated global speculative financial system that they themselves have helped create.Welch of GE, in his book Winning, describes how pleasantly surprised he was to learn (in the late 1970s and early 1980s) how easy it was to make money in financial services as he sold or shut down GE’s industrial businesses. He not only became very wealthy by doing so, but also came to symbolize an entire era.

The reality is: why should Gerstner really care to do anything about education, when his private equity firm can buy a public company, “fix” it via financial leverage and layoffs, and then sell it for a quick profit to individuals and pension funds made both desperate for yield (by years of accommodative Fed rate policy) and oblivious to risk (from decades of Fed bailouts every time a poorly conceived, high-risk investment failure threatens the so-called “real economy”)?

For that matter, why should ambitious American students study physics, engineering, and math (except perhaps to become Wall Street “quants”), when it is so much easier to make money speculating in real estate, stocks, and everything else, including untenable IPOs for companies that are unlikely to ever be profitable?

Why should corporate CEOs worry about American education, science, and technology, when they can simply “downsize” and “outsource” to Asia, greatly enhancing the value of their stock options and, at least in the short term, shareholder value?

Why should any corporate leader be surprised that, 25 years after the magic of making masses of money from government-protected leverage and guaranteed liquidity, the process has come full circle? The financial corporations now dominate the private sector’s ability to generate profits. At the same time, we are witnessing the slow death of industrial giants like General Motors, which for decades stood as the world’s premier industrial corporation.

For obvious reasons, no one in power ever makes this simple link between the U.S.-dominated global speculative financial system and the diminishing motive for Americans to innovate and create new goods and services. This is the reason why clearly needed economic, social, and political changes are not forthcoming. Instead, the so-called solutions to the problem have become a shell game of retraining and social safety nets.

While the global speculative financial system shifts U.S. assets and productive capacity (and thus wealth-generation capabilities) overseas, the average U.S. household is losing its ability to generate real income in the domestic economy. About 80% of the U.S. labor force has seen real weekly earnings decline 16% over the past 33 years, according to government statistics.

Meanwhile, the global speculative financial system has pushed income inequality to all-time highs, with over 50 percent of 2004 income going to the top fifth of U.S. households, and the biggest gains going to the top one percent.

But printing money to inflate asset values creates no new real wealth. Econ 101 taught us that true wealth creation can only come from real income generated by the real production of innovative, new, real goods and services that improve productivity and generate domestic savings, which are in turn re-invested into the economy.

Now that the U.S. economy has lost much of its ability to generate high-paying jobs outside the business of asset speculation, U.S. workers inside the goods and services sectors (the making and doing part of the economy) are being asked to “share the burden” via more cuts in retirement and health plans and essential services, all the while leaving untouched the huge gains of the one percent at the top of the wealth and income pyramid, already enhanced by Bush’s tax cuts.

After more than 25 years of this, is it any wonder that the entire U.S. economy has organized itself to conform to this financial model: a gigantic, risky, one-way bet that uses trillions of dollars of credit and massive leverage, relying on the savings of foreign workers to fund the bet and the foreign central banks to cover the risks?

I couldn’t have said it better. Thank you, thank you, thank you, Eric, for speaking out.

Vembu’s Tenth Rule of Programming

Greenspun’s famous rule goes


Quote:

Greenspun’s Tenth Rule of Programming: “Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified bug-ridden slow implementation of half of Common Lisp”

Vassili Bykov explained the meaning of this: “that complex systems implemented in low-level languages cannot avoid reinventing and/or reimplementing (poorly on both counts) the facilities built into higher-level languages. Like garbage collection in OLE or keyword arguments in X”

I am not a Lisper. But I have witnessed something similar, which I will state below, borrowing Greenspun’s beautiful way of putting it. I have a feeling that some kind of mathematical equivalence can be established between Greenspun’s rule and mine, but that will be left as a research exercise ;-)

Vembu’s Tenth Rule of Programming: “Any sufficiently complicated C or Java program contains an ad-hoc, informally-specified bug-ridden slow implementation of half of a relational query interpreter”

In fact, Paul Graham’s On Lisp book has an entire Chapter (Chapter 19) devoted to building a query compiler, which provides about half the functionality of what a half-way decent relational system will give you for free - as in Postgres, Firebird, SQLite, MySQL (partly free) - what size do you want?

Indeed, the traditional Map/Reduce functional paradigm in Lisp and other functional languages is partly doing the job of part of what a relational database would do. Microsoft Language Integrated Query (LinQ) is further evidence.

Why is Education so costly?

I read this article http://moneycentral.msn.com/content/CollegeandFamily/Moneyinyour20s/P136091.asp

It is just painful reading. Students getting into $25-50k, even $100K in debt to get an education is not right. I am sure any number of academics and economists would argue it is “worth” it. But is it worth 30% more in real inflation-adjusted terms than 10 years ago?

I believe part of the reason is that the education establishment just doesn’t operate with a sufficient degree of cost-consciousness, like any normal business subject to market forces should. And unlike private businesses they have no incentive to exploit technology to lower the cost of doing business.

Just a few examples I can think of to lower costs to students:

1. Student self-service model, with pre-recorded lectures, while students interact with professors via the internet to clarify the material

2. Internet based collaboration among students, with discussion groups and such, moderated by faculty

3. Dramatically reduce the administrative overhead - there are way too many administrators to actual classroom instructors in most universities today

4. Encourage more part time students - I just don’t see why an MBA has to be a full time program costing $100K or more

Finished (functional) code is like a finished theorem …

I read from Paul Graham’s “On Lisp”


Quote:

Finished code, like a finished theorem, often covers up a lot of trial and error. So don’t worry if you have to write several versions of a macro.

That pretty much captures the difficulty that functional programming causes practising software engineers. My PhD thesis in Information Theory is full of definitions, theorems and proofs. It has been over 10 years since I did that work, and I believe it would take me quite a while to understand it now. A finished theorem does cover up a lot of trial and error.

In fact, one of the debates I remember having with my PhD advisor was that I wanted to put in the trial and error reasoning that led to the results in the published work because it would help the reader understand where these came from. He advised against it, because that is just not the mathematical way. I guess that is one reason I came to industry :-)

Even in academia, best teachers motivate a theorem with trial-and-error reasoning. I love the cold, minimalist, abstract beauty of a well-written mathematical proof as much as the next guy, but the human process that led to it is often messy trial and error. By totally hiding it, we are covering the steps that can help the next guy trace the reasoning.

The academic popularity of functional programming style reflects this preference for the minimalist mathematical proof. In the world of team software development, the priority is not minimalist beauty but on understandable code. A mathematical proof, once finished, is finished - strictly that is not true, but at least that is the usual assumption. But software is an organic evolving system; it is never finished.

My preferred programming style is to make it blindingly obvious to the next reader of the program what is going on. If the person has to spend hours with paper and pencil (as often happens in figuring out a mathematical proof), that portion of the software will probably never be touched.

If we don’t want write-only code, we have to recognize the needs of the next programmer who reads a piece of code. If Paul Graham’s comment about macro design reflect the functional programming mindset (I believe it does), it explains why FP not widely embraced by the mainstream. I recognize that there are a lot of good ideas in there, and I recognize some of those ideas have made it to popular programming languages. But FP as a programming paradigm needs to get out of the “finished theorem” mindset, in my opinion, if it has to be embraced by the broader software industry.