Skip to content

Latest commit

 

History

History
743 lines (533 loc) · 21.6 KB

talk.org

File metadata and controls

743 lines (533 loc) · 21.6 KB

Everyone has a 1h slot, apparently, so that’s about 45min for the talk plus 15min for questions.

There are 5minutes in the plan below, but I suspect that the XEmacs and post-XEmacs period could be cut down significantly, and or partly merged into the previous parts

Intro

  • Why are we here?
  • Why is Emacs’s language a Lisp?
  • How did it started?
  • What did TECO code look like?
  • What’s the relation between Emacs and the GNU project?
  • So Emacs’s purpose is to empower the user?
  • What does Emacs Lisp look like?

Scoping

FIXME: Mention buffer-local scoping and interactions with `let`? [ Why oh why dynamic scoping? ]

Why dynamic scoping

[ Unaware of lexical scoping? ]

  • Scheme guys next door!
  • Familiarity with efficient implementation via shallow binding
  • Experience with Multics Emacs (Maclisp) suggested API styles relying on implicit arguments passed by state or dynamic scoping
  • Dynamic scoping’s “transparency” increases extensibility.
  • In hindsight this can be read as “Gives more power to end users, in line with Free Software ideals!” “Free Software ideals” arguably did not exist at the time, at least not under that name, but the underlying idea of empowering users was clearly part of Richard’s general approach to program design.
  • dynamic scoping gives the user more freedom to take existing code and change its behavior without touching it (tho not enough yet, hence the addition of hooks)

=> Dynamic scoping was considered necessary!

Why not lexical scoping?

[ Fair enough, but why not also lexical scoping? ]

  • The desire to keep the language simple
  • The need to fit the limited resources available at the time
  • Dynamic scoping was sufficient!

Living with dynamic scoping

[ So, what’s it like to live with dynamic scoping? ]

  • You avoid conflicts with a naming convention
  • You build “closures” by hand as `(lambda (arg) (+ ‘,arg 1)) Occasionally it’s quite handy since you can plug in actual code instead of values in there (in both senses) ;-)
  • GOOD: mouse-save-then-kill-delete-region:trunk/lisp/mouse.el (uses inhibit-modification-hooks and protects buffer-undo-list) but BAD: mouse-save-then-kill-delete-region:old/lisp/mouse.el (binds after-change-functions)

Difficulties retro-fitting lexical scoping

[ This was then, but those limitations disappeared at least 20 years ago! ]

  • Early ELisp code relied heavily (and silently) on dynamic scoping. Converting all that code to lexical scoping seemed daunting
  • A lot of that code was “out there”, so can’t be fixed easily
  • Need some “automatic way” to keep supporting that old code
  • Eli Zaretskii and Richard remind developers to stop working on language infrastructure and improve the end user experience instead
  • Dynamic scoping was sufficient!

First path to lexical scoping

[ Is it really so hard to just add lexical scoping constructs? ]

The CL package offered a lexical-let macro, back in 1993, but it never really caught on:

  • Richard did not like the CL package (nor Common-Lisp for that matter) and discouraged its use (and hence the use of lexical-let)
  • It was a bit verbose
  • It did not include a similar facility for lexically scoped formal arguments
  • More painful to debug
  • Somewhat inefficient
  • Did I mention that dynamic scoping is sufficient?

Second path to lexical scoping

[ If a purely additive lexical scoping won’t fly, how about auto-conversion? ]

Guys over in XEmacs land tried that.

  • In 2001, Matthias Neubauer implemented a code analysis to discover all the places where lexical scoping could be used without affecting the semantics
  • This opened the way to automatically convert old ELisp file to start using lexical scoping; the plan was to convert ELisp code to Scheme eventually
  • But nobody took up the challenge [Mike says it’s “research”]
  • Do I hear someone in the audience say that dynamic scoping is sufficient?

Third path to lexical scoping

[ Surely, there’s a way to add enough annotations to solve this problem, no? ]

Meanwhile, in Emacs land, I managed to tell Miles Bader that it was possible to just use lexical scoping by default and still preserve the old semantics in most cases (by relying on enough dynamic ad-hoc tricks and heuristics)

  • I was wrong, of course, but he didn’t know, so he started working on it
  • His approach was to add a new ELisp dialect Mixes lexical and dynamic scoping in a similarly was to Common-Lisp
  • Every ELisp file now has to declare which dialect it uses
  • Every function remembers which dialect it’s using internally
  • The two dialects are (mostly) indistinguishable
  • Easily preserves full backward compatibility, without ad-hoc tricks
  • Bringing full support for the new dialect proved hard (arguably for lack of expertise in implementation of closures)
  • The branch lingered unloved
  • After all, dynamic scoping is sufficient!

Final path to lexical scoping

[ Really? It’s all designed and working and it’s left there because no one can be bothered to learn how to do a simple closure conversion? ]

Finally, in 2010, I got a student to implement a proper closure conversion phase as a summer project

  • Scrapped Miles’s attempt to adjust the byte-compiler
  • Changed the compiler from single-pass to multiple-passes
  • In turn forced macro-expansion as a separate pass
  • It took 2 more years to fully integrate it
  • Ended up affecting a lot of other code (sometimes simplifying it)
  • Luckily I was head maintainer so I had a freer hand to take such risks

Is dynamic scoping sufficient?

[ Wait! If it’s sufficient, then why does Emacs now offer lexical scoping? ]

  • Peer pressure
    • Developers with backgrounds in programming language design and theory
    • Even a head maintainer doing research in type theory
  • Little annoyances that accumulate
    • Not the need to follow naming conventions (we need that for lack of modules anyway)
    • Not the lack of closures (can be approximated with `(lambda ..))
    • The need for obfuscated names in local variables in select places like higher-order functions and in the byte-compiler
    • BAD: let-binding of minibuffer-completion-table in completing-read-default:emacs-27/lisp/minibuffer.el
    • BAD: from cvs-every:old/lisp/pcvs-util.el:
    • BAD: batch-byte-compile:old/lisp/emacs-lisp/bytecomp.el
  • The belief that dynamic scoping will be a hindrance in the long term: incompatible with optimization, poor interaction with concurrency; exclusive reliance on it for all bindings imposes tight constraints on the implementation of dynamic and buffer-local scoping

Popularity of lexical scoping

[ So if it’s not that important, will it end like lexical-let? ]

By 2010, most ELisp code was agnostic, because coding styles had changed, partly because of conscious effort in terms of compiler warnings and in terms of changes to Emacs’s own code (which people tend to copy).

  • The new dialect with lexical scoping is not more verbose
  • It’s not less efficient
  • It’s not harder to debug
  • It’s more like what people are used to
  • There’s no reason to use the old dynamically scoped dialect for new code
  • For most well-maintained code, the conversion is easy
  • Emacs’s own 60MB of ELisp code has now been fully converted, by hand

=

Tail calls

Why no proper tail calls?

  • ELisp function calls use the C stack, and the C compilers did not do TCO, so it would have required extra work.
  • ELisp implementation makes function calls relatively expensive, so a `while` loop is a lot more efficient than a recursion
  • TCO is incompatible with dynamic scoping
  • ELisp coding style was very imperative, with a lot of global state. Not sure which is the cause and the consequence.

But … even after 30 years?

  • Indeed, now that we have lexical scoping TCO is an option
  • Coding style is less imperative
  • But the other points still hold
  • 30 years of dynamic scoping and no-TCO have shaped the coding style, so on existing code TCO brings no benefit
  • “non-self-TCO” would negatively affect the debugger’s backtraces
  • If it’s only an opportunistic optimization, code cannot rely on it, so style won’t change.
  • But if it’s guaranteed, then it imposes an extra burden on current and future implementations. How do you define “self-TCO”?

So … never?

  • Patches have been submitted several time for the byte-compiler.
  • Long discussions, several times.
  • It’s never been completely ruled out, but it never convinced either: The immediate gain is small at best, and noone has bothered yet to provide TCO for the ELisp interpreter, so it always falls into the “opportunistic optimization” trap.
  • In Emacs-28 `cl-labels` circumvents this by doing self-TCO during macro-expansion. The main purpose is to provide a reasonably efficient implementation of Scheme’s `named-let`. Doing it via macroexpansion ensures it applies to all backends (interpreter, byte-compiler, you nameit), so source code can rely on it. “Classical” TCO would only make `named-let` work with a finite stack, but it would still run slow because of the overhead of a function call.

So you don’t care about safe for space?

  • I’ve completely let down my thesis advisor, indeed.
  • But not to worry: it’s much worse than that. The conservative stack scanning already breaks space safety, and to add insult to injury, the construction of closures in the interpreter captures the complete environment rather than only the free variables!

=

Modules

So, XXI century and no modules, eh?

  • Of course, they were absent at first, to keep the system simple
  • Richard does not like Common-Lisp’s package system
  • Dynamic-scoping imposes a naming convention, so in practice we do have some poor man’s namespace mechanism in place
  • More than 100MB of ELisp code says this works

EXAMPLE: isearch-mb

OK, it works, but there are other ways, no?

  • Indeed, and there have been many attempts to add such things to ELisp: names.el nameless.el …
  • Lots of long discussions

And?

  • The benefit is unclear.
  • There are some downsides:
    • Of course the added complexity Not so much in the language implementation but in the documentation, and the surrounding tooling. Currently you can use `grep` or even duckduckgo to find uses or definitions of ELisp functions and variables!
    • The cost of retrofitting a namespace mechanism onto those 100MB is fairly high.
  • The most workable solution appears to be a very lightweight and simplistic/lexical one which is compatible with the current naming convention, so we can start using with existing code without any change.

=

Stefan

Important things:

  • opaque (XEmacs-vs-Emacs approach to GC/pdump and redisplay)
  • user-power (Free Software, language powerful, language accessible, hooks)

Mike prepares:

  • an implementation language rather than an extension language
  • macros

Stef prepares:

  • dynamic scoping
  • tail calls & modules (what would bob harper say?)

Background 10min

Emacs 8min

  • TECO Emacs => Multics Emacs => Gosling Emacs => GNU Emacs
  • ELisp designed as the implementation language of Emacs
  • Free Software philosophy embodied in the design
  • Development model
    • During TECO Emacs => early GNU Emacs
    • Around Emacs-19 (schism and such)
    • in XEmacs
    • in Emacs≥21

Early language design 2min

  • maclisp
  • mocklisp

Base ELisp 10min

“Normal” language 5min

  • scoping
  • macros
  • absence of records/structs/…
  • non-local exits

Emacs-specific 5min

  • hooks
  • docstrings
  • interactive functions
  • strings
  • buffer-local vars
  • I/O

Implementation 10min

code

  • byte-code
  • bootstrap
  • debugging
  • profiling
  • jit
  • tail calls

data

  • overview
  • stack scanning
  • tag bits
  • vectors
  • weak pointers
  • incremental GC
  • dumping

XEmacs period 10min

  • events&keymaps
  • characters
  • ffi
  • aliases
  • unicode
  • bignums
  • specifiers
  • performance improvements
  • Custom

post-XEmacs 10min

  • lexical scoping
  • cl-lib
  • pattern matching
  • gv
  • OO
  • generators
  • concurrency
  • inline functions
  • modules or lack thereof

Conclusion 5min

  • macros allow “wild” experimentation and domain-specific extensions
  • simple core, with conservative evolution

Mike

Issues:

  • When to do “Stefan = Emacs, Mike = XEmacs”?
  • How much of the Emacs vs. XEmacs history?
  • Where to put free software ideals?

Steele + Stallman: TECO => Emacs keybindings

Reenact the original scene.

Lisp overview

Come from TECO language.

Then have a conversation.

“Obscure PL” vs. Scheme (academic) / Lua (didn’t exist) / Python (didn’t exist) / Tcl (didn’t exist) / JavaScript (haha)

Dynamic binding: What is it good for?

Stefan can show what dynamic binding is good for.

Mike’s a Schemer, so likes static binding -> early work in XEmacs.

Stefan: “So where is it now?”

Mike: “Research!”

Stefan: “Static scoping in Emacs implemented.”

Mike: “Boring! What do you do for a day job?”

Macros

XEmacs vs. Emacs

Mike: It’s 1990, Emacs is dead. …

Mike: “We did this 20 years ago.” Stefan: “But look where we got without it.”

  • opaque datatypes
  • portable dumper
  • incremental GC vs. “we turn off the message”

What’s Next

“But isn’t Emacs dead, given how old it is?”

Stefan can tell the native-codee story, Mike can complain.

Free Software

Empowerment / Accessibility / Free Software

M: So Stefan, I noticed when we were working on the paper, you use only free software, right?

S: Right.

M: Isn’t that pretty extreme?

S: …

M: What does Emacs Lisp have to do with these goals?

S: Well, the central goal of Free Software is to free users from the restrictions of commercial software.

M: What do you mean restrictions? Can’t I just buy any piece of software I want?

S: As long as that piece of software is Microsoft Office or SAP.

M: That seems to be what a lot of people want.

S: But it seems a waste of computers, the most adaptable machine in human history, to only be used in a couple of ways.

M: Yeah, well, I work on a lot of digital-transformation projects, and boy are people not served by stock software when that happens. So I understand how Emacs Lisp, being the language of Emacs, empowers programmers to configure their IDE.

S: What do you use for e-mail, Mike?

M: Gnus in Emacs.

S: How do you organize your daily work?

M: org-mode in Emacs.

S: How do you write letters?

M: AUC-TeX in Emacs.

S: How does your administrative assistant write letters?

M: AUC-TeX in Emacs.

S: Now, other people in your office use Emacs, right? Are their setups all like yours?

M: No, they’re completely different.

S: So Emacs - and Emacs Lisp - has empowered all of these people to have a productivity tool that fits their needs. Could that have been achieved by just setting a bunch of configuration options?

M: No, there’s a bunch of code in my init.el.

S: Get the point?

M: Yeah. But it’s still a restricted set of applications, right? “Things that require an editor.”

S: Well, at one point, German air-traffic control was running in Emacs Lisp.

M: Was that a good thing?

S: Well, if you read it up, it really enabled folks on a clunky VMS system to do high-level programming. In one of the better-debugged runtimes available.

M: But what if there’d been a bug in that runtime - with commercial software, you can call support, and they’ll fix the bug.

S: Ever tried that?

M: I was being sarcastic. So with free software, you have what I guess you’d call a fighting chance: If you can’t fix a problem yourself, you can hire somebody to do it.

I’ll still point out that Emacs is mostly a tool for experts. I’m running a software company, we’re all computer people. When Stallman founded the GNU project, he did it to empower users, but when he said “user” he really meant “people who hang out at MIT lab”.

S: If you look at the Emacs Lisp manual, you’ll see that it’s written for programmer newbies.

M: Yeah, well´, I guess he was envisioning a future where more people would routinely program. That never happened.

S: None of Stallman’s fault. At least he tried, and if you’re a user, you at least have that chance you mentioned.

XEmacs and Emacs

M: Hi, I’m XEmacs. I’m glad you noobs finally figured out how to do portable image dumping. We did that 20 years ago.

S: But then you fell asleep at some point.

M: Yeah, well I fell asleep waiting for the GC in Emacs. We have an incremental GC. You just turned off the message that tells users Emacs Lisp is GCing.

S: You just had one of your students do the work on that one for free. You could have told him to do it in Emacs instead.

M: No, I couldn’t. Because you didn’t have portable image dumping figured out.

S: What does that have to do with it?

M: The portable dumper works from a uniform description of the object layout. The XEmacs GC uses the same uniform description to do pointer tracing. You have that now, right?

S: FIXME Stefan

M: Why didn’t you implement that earlier?

S: FIXME Stefan

M: So what you’re saying is that Emacs’s internal architecture still sucks, after all these years.

S: At least we’re still around. And speaking of architecture, we have static scoping now, which you failed at.

M: At least we had data abstraction.

S: What is that supposed to mean?

M: You still use lists for keymaps, right?

S: Yeah, you have a problem with that?

M: Well, you invite people to just use list functions on keymaps, breaking any abstractions you might have in place. That means you can never change the format.

S: Well, we did change the format a bunch of times.

M: And how did that go?

S: So how did you fix this problem in XEmacs?

M: XEmacs, from the beginning, had more opaque datatypes which you could only manipulate using specific functions for that datatype. Keymaps are an example. This way, we can change the underlying implementation without changing the interface.

S: Is that the reason you forked from Emacs?

M: No, that was because a company named Lucid needed a more advanced version of Emacs for one of their products. That version was supposed to be Emacs 19, but it didn’t come out for like forever. So Lucid made their own version. It was called Lucid Emacs originally.

S: So it seems you did a bunch of things right technologically but couldn’t hold it together long enough.

M: Yup. Fun times, though.

S: By the way, Mike - the Clojure folks seem to be doing the same - expressing everything in terms of the built-in data types like lists and vectors and maps, instead of defining new ones.

M: Yup. Same mistake there.

Outline 2021-05-20

personal intro

Why are we the authors of this paper?

Why Lisp?

  • Stallman’s favorite language
  • not first Emacs
  • first one written in TECO

Emacs as founding stone of the GNU project

  • users have right to modify
  • Emacs lisp directed at non-programmers

Emacs Lisp as a classic Lisp

[Mike]

  • parenthesized syntax
  • expressions

  • indentation to enable readability
  • bindings and scoping

All this essentially MacLisp

Dynamic Scoping

(defun f (x)
  (+ x y))

(let ((y 17))
  (f 5))
  • why: Stefan demos isearch-mb–message
  • [also: string properties]
  • buffer-local variables

Evolution

Static scoping

  • static scoping: example does not work in Emacs

Questions:

  • Why dynamic? Wasn’t RMS aware of lexical scoping?
    • Scheme guys next door
    • Efficient implementation
    • Experience with Multics Emacs
    • “Transparency” (empower the user)

    => dynamic scoping was necessary

  • OK, but why not both like in CL?
    • keep language simple
    • resource constraints
    • dynamic scoping is sufficient
  • How do you live with dynamic scoping?
    • naming convention
    • hand-made closures
    • Examples in mouse-save-then-kill-delete-region:lisp/mouse.el
  • Resource constraints don’t exist any more, so?
    • more than 20 years of code relies on dynscope
    • code “out of reach”, so have to preserve compatibility
    • changing ELisp does not directly benefit the user
    • dynamic scoping is sufficient
  • Is it so hard to add lexical scoping? CL offered lexical-let in 1993, but it never caught on
    • Richard didn’t like CL
    • Verbose
    • Not available for lambda/defun
    • Ugly in debugger
    • Inefficient
    • dynamic scoping is sufficient
  • How ‘bout auto-conversion? [ Mike ] …
    • Do I hear someone in the audience say that dynamic scoping is sufficient?
  • Surely we can solve this problem? describe Miles’s solution
  • So it was left there to rot?
  • Stefan: story of static scoping
  • Mike: You know, I also worked on static scoping once.

Macros

  • other than that, not much evolution in core language because of macros
  • syntax as data
  • macro example
(let ((array-size 11)
      (init-val 'init)
      (array-name 'a))
  `(do ((i 0 (+ 1 i)))
       ((>= i ,array-size))
     (aset ,array-name i ,init-val)))

(defmacro fill-array (array-name array-size init-val)
  `(do ((i 0 (+ 1 i)))
       ((>= i ,array-size))
     (aset ,array-name i ,init-val)))

(defvar s "foobar")

(fill-array s 6 65)

(macroexpand '(fill-array s 6 65))
(macroexpand-1 '(fill-array s 6 65))
  • lambda expressions as functions
(mapcar '(lambda (n) (1+ n)) '(1 2 3))

(macroexpand '(lambda (n) (1+ n)))
#'(lambda (n) (1+ n))
  • -> compilation

Emacs vs. XEmacs

  • Stefan: Why do you have this XEmacs thing open?
  • What are notable differences?
  • portable dumper
  • GC
  • opaque datatypes
(defstruct kons kar kdr)

(make-kons :kar 1 :kdr 2)
  • module system