Lexical Scope

Lexical scope means that scope is defined by author-time decisions of
where functions are declared. The lexing phase of compilation is essentially
able to know where and how all identifiers are declared, and
predict how they will be looked up during execution.

Before getting into more detail, let’s look at how a JavaScript engine handles compilation.

Here is a breakdown of the parts:

Engine

Responsible for start-to-finish compilation and execution of our
JavaScript program.

Compiler

One of Engine’s friends; handles all the dirty work of parsing and
code generation.

Scope

Another friend of Engine; collects and maintains a look-up list of
all the declared identifiers (variables), and enforces a strict set of
rules as to how these are accessible to currently executing code.

Take the following statement:

var a = 2;

  1. Encountering var a, Compiler asks Scope to see if a variable a
    already exists for that particular scope collection. If so, Compiler
    ignores this declaration and moves on. Otherwise, Compiler asks
    Scope to declare a new variable called a for that scope collection.
  2. Compiler then produces code for Engine to later execute, to handle
    the a = 2 assignment. The code Engine runs will first ask Scope if there is a variable called a accessible in the current scope collection.
    If so, Engine uses that variable. If not, Engine looks elsewhere.
    If Engine eventually finds a variable, it assigns the value 2 to it.
    If not, Engine will raise its hand and yell out an error!
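The two steps above can be observed, roughly, in a short sketch. The function and variable names are made up for illustration. The second function opts into strict mode, which is an assumption the text above does not state: without it, a failed look-up for an assignment quietly creates a global variable instead of raising an error.

```javascript
// Step 1 happens at compile time: `a` is declared for this function's scope
// before any code runs, so reading it on the first line is not an error.
function declaredBeforeExecution() {
	var before = a; // undefined: declared already, but not yet assigned
	var a = 2;      // step 2: the assignment happens at execution time
	return [ before, a ];
}

// No scope has a variable with this name, so the look-up fails.
// In strict mode the failed assignment raises a ReferenceError.
function assignToMissingVariable() {
	"use strict";
	try {
		neverDeclared = 2;
		return "no error";
	} catch (err) {
		return err.name;
	}
}

console.log( declaredBeforeExecution() ); // first entry undefined, second entry 2
console.log( assignToMissingVariable() ); // ReferenceError
```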

An LHS look-up is done when a variable appears on
the lefthand side of an assignment operation, and an RHS look-up is
done when a variable appears on the righthand side of an assignment
operation.
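As a small illustration, the comments below mark which kind of look-up each reference needs. The snippet is only a sketch, and the names are arbitrary.

```javascript
function foo(a) {     // LHS: the parameter a is implicitly assigned the argument 2
	var b = a;        // RHS look-up of a, then LHS look-up of b
	return a + b;     // RHS look-ups of a and b
}

var c = foo( 2 );     // RHS look-up of foo, LHS look-up of c
console.log( c );     // 4
```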

Two mechanisms in JavaScript can “cheat” lexical scope: eval(..) and
with. The former can modify existing lexical scope (at runtime) by
evaluating a string of “code” that has one or more declarations in it.
The latter essentially creates a whole new lexical scope (again, at runtime)
by treating an object reference as a scope and that object’s properties
as scoped identifiers.

There are two predominant models for how scope works. The first of
these is by far the most common, used by the vast majority of programming
languages. It’s called lexical scope, and we will examine it in
depth. The other model, which is still used by some languages (such
as Bash scripting, some modes in Perl, etc.), is called dynamic scope.

Lex-time

The first traditional phase of a standard
language compiler is called lexing (a.k.a., tokenizing). If you recall, the
lexing process examines a string of source code characters and assigns
semantic meaning to the tokens as a result of some stateful parsing.
It is this concept that provides the foundation to understand what
lexical scope is and where the name comes from.
To define it somewhat circularly, lexical scope is scope that is defined
at lexing time. In other words, lexical scope is based on where variables
and blocks of scope are authored, by you, at write time, and thus is
(mostly) set in stone by the time the lexer processes your code.

We will see in a little bit that there are some ways to cheat lexical
scope, thereby modifying it after the lexer has passed by, but
these are frowned upon. It is considered best practice to treat
lexical scope as, in fact, lexical-only, and thus entirely author-time
in nature.

Let’s consider this block of code:

function foo(a) {
	var b = a * 2;
	function bar(c) {
		console.log( a, b, c );
	}
	bar( b * 3 );
}
foo( 2 ); // 2, 4, 12

There are three nested scopes inherent in this code example. It may be
helpful to think about these scopes as bubbles inside of each other.

Bubble 1 encompasses the global scope and has just one identifier in
it: foo.

Bubble 2 encompasses the scope of foo, which includes the three
identifiers: a, bar, and b.

Bubble 3 encompasses the scope of bar, and it includes just one identifier:
c.

Scope bubbles are defined by where the blocks of scope are written,
which one is nested inside the other, etc. The bubble for bar is entirely contained within the bubble for foo, because (and only because) that’s where we chose to define the function
bar.

Notice that these nested bubbles are strictly nested. We’re not talking
about Venn diagrams where the bubbles can cross boundaries. In other
words, no bubble for some function can simultaneously exist (partially)
inside two other outer scope bubbles, just as no function can
partially be inside each of two parent functions.

Lookups

The structure and relative placement of these scope bubbles fully explains
to the engine all the places it needs to look to find an identifier.

In the previous code snippet, the engine executes the
console.log(..) statement and goes looking for the three referenced
variables a, b, and c. It first starts with the innermost scope bubble, the
scope of the bar(..) function. It won’t find a there, so it goes up one
level, out to the next nearest scope bubble, the scope of foo(..). It
finds a there, and so it uses that a. Same thing for b. But c, it does find
inside of bar(..).

Had there been a c both inside of bar(..) and inside of foo(..), the
console.log(..) statement would have found and used the one in
bar(..), never getting to the one in foo(..).

Scope look-up stops once it finds the first match. The same identifier
name can be specified at multiple layers of nested scope, which is called
“shadowing” (the inner identifier “shadows” the outer identifier). Regardless
of shadowing, scope look-up always starts at the innermost
scope being executed at the time, and works its way outward/upward
until the first match, and stops.
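A minimal sketch of shadowing across three levels follows. The names label, outer, and inner are invented for the illustration.

```javascript
var label = "global";

function outer() {
	var label = "outer";        // shadows the outermost label
	function inner() {
		var label = "inner";    // shadows outer's label
		return label;           // look-up stops here, at the first match
	}
	return [ inner(), label ];  // this label is found in outer's own scope
}

console.log( outer() ); // first entry "inner", second entry "outer"
console.log( label );   // global
```

Each reference resolves to the nearest declaration, and none of the inner assignments touch the outer variables.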

Global variables declared with var (and global function declarations)
are automatically also properties of the global object (window in
browsers, etc.), so it is possible to reference such a global variable
not directly by its lexical name, but instead indirectly as a property
reference of the global object.

window.a

This technique gives access to a global variable that would
otherwise be inaccessible due to it being shadowed. However,
non-global shadowed variables cannot be accessed.
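Here is a hedged sketch of that technique. It uses globalThis, the standard name for the global object (in browsers it is the same object as window), and it sets the property explicitly as a stand-in for a top-level var a in a browser script, because module systems such as Node's do not turn top-level variables into global properties.

```javascript
// Stand-in for `var a = "global a";` at the top level of a browser script.
globalThis.a = "global a";

function foo() {
	var a = "local a";            // shadows the global a inside foo
	return [ a, globalThis.a ];   // lexical name first, then property reference
}

console.log( foo() ); // first entry "local a", second entry "global a"
```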

No matter where a function is invoked from, or even how it is invoked,
its lexical scope is only defined by where the function was declared.
The lexical scope look-up process only applies to first-class identifiers,
such as the a, b, and c. If you had a reference to foo.bar.baz in a piece
of code, the lexical scope look-up would apply to finding the foo
identifier, but once it locates that variable, object property-access rules
take over to resolve the bar and baz properties, respectively.
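Both points can be sketched in a few lines. The function names getA and callFromElsewhere are hypothetical, and the foo object is made up to match the foo.bar.baz reference mentioned above.

```javascript
var a = 2;

function getA() {
	return a;        // resolved from where getA was declared
}

function callFromElsewhere() {
	var a = 3;       // never consulted by getA, despite being at the call site
	return getA();
}

console.log( callFromElsewhere() ); // 2

// Lexical look-up finds the identifier foo; bar and baz are then
// resolved by ordinary property access on that object.
var foo = { bar: { baz: 42 } };
console.log( foo.bar.baz ); // 42
```

Under dynamic scope the first call would have produced 3, because the look-up would follow the call stack rather than the nesting of the source code.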

Cheating Lexical

If lexical scope is defined only by where a function is declared, which
is entirely an author-time decision, how could there possibly be a way
to “modify” (a.k.a., cheat) lexical scope at runtime?

JavaScript has two such mechanisms. Both of them are equally
frowned upon in the wider community as bad practices to use in your
code. But the typical arguments against them are often missing the
most important point: cheating lexical scope leads to poorer performance.

eval

The eval(..) function in JavaScript takes a string as an argument and
treats the contents of the string as if it had actually been authored code
at that point in the program. In other words, you can programmatically
generate code inside of your authored code, and run the generated
code as if it had been there at author time.

Evaluating eval(..) in that light, it should be clear
how eval(..) allows you to modify the lexical scope environment by cheating and pretending that author-time (a.k.a., lexical) code was
there all along.

On subsequent lines of code after an eval(..) has executed, the engine
will not “know” or “care” that the previous code in question was dynamically
interpreted and thus modified the lexical scope
environment. The engine will simply perform its lexical scope lookups
as it always does.

Consider the following code:

function foo(str, a) {
	eval( str ); // cheating!
	console.log( a, b );
}
var b = 2;
foo( "var b = 3;", 1 ); // 1, 3

The string “var b = 3;” is treated, at the point of the eval(..) call,
as code that was there all along. Because that code happens to declare
a new variable b, it modifies the existing lexical scope of foo(..). In
fact, as mentioned earlier, this code actually creates variable b inside
of foo(..) that shadows the b that was declared in the outer (global)
scope.

When the console.log(..) call occurs, it finds both a and b in the
scope of foo(..), and never finds the outer b. Thus, we print out “1,
3” instead of “1, 2” as would have normally been the case.

In this example, for simplicity’s sake, the string of “code” we pass
in was a fixed literal. But it could easily have been programmatically
created by adding characters together based on your
program’s logic. eval(..) is usually used to execute dynamically
created code, as dynamically evaluating essentially static
code from a string literal would provide no real benefit over just
authoring the code directly.

By default, if a string of code that eval(..) executes contains one or
more declarations (either variables or functions), this action modifies
the existing lexical scope in which the eval(..) resides. Technically,
eval(..) can be invoked indirectly, through various tricks (beyond
our discussion here), which causes it to instead execute in the context of the global scope, thus modifying it. But in either case, eval(..) can
at runtime modify an author-time lexical scope.
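One common way to make the call indirect is to invoke eval(..) through another reference. The sketch below uses that trick only as an illustration; the alias name indirectEval is arbitrary.

```javascript
function foo(str) {
	var b = 1;
	var indirectEval = eval;   // calling eval through another name makes the call indirect
	indirectEval( str );       // the string runs in the global scope, not in foo's
	return b;                  // foo's own b is untouched
}

console.log( foo( "var b = 3;" ) ); // 1
console.log( globalThis.b );        // 3
```

The declaration lands on the global object instead of shadowing anything inside foo(..), which is the reverse of what the direct call in the earlier example did.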

eval(..), when used in a strict-mode program, operates in its
own lexical scope, which means declarations made inside of
the eval(..) do not actually modify the enclosing scope.

function foo(str) {
	"use strict";
	eval( str );
	console.log( a ); // ReferenceError: a is not defined
}
foo( "var a = 2" );

There are other facilities in JavaScript that amount to a very similar
effect to eval(..). setTimeout(..) and setInterval(..) can take a
string for their respective first argument, the contents of which are
evaluated as the code of a dynamically generated function. This is old,
legacy behavior and long-since deprecated. Don’t do it!

The new Function(..) function constructor similarly takes a string
of code in its last argument to turn into a dynamically generated function
(the first argument(s), if any, are the named parameters for the
new function). This function-constructor syntax is slightly safer than
eval(..), but it should still be avoided in your code.
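A brief sketch of the constructor follows, with made-up names. It also shows one likely reason for the "slightly safer" label: a function built this way sees only the global scope, not the local scope where the constructor was called.

```javascript
function makeFunctions() {
	var localOffset = 10; // not visible to the generated functions below

	// The last argument is the body; the earlier ones are parameter names.
	var add = new Function( "a", "b", "return a + b;" );

	// The generated body cannot see localOffset, so typeof reports "undefined".
	var probe = new Function( "return typeof localOffset;" );

	return [ add( 1, 2 ), probe() ];
}

console.log( makeFunctions() ); // first entry 3, second entry "undefined"
```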

The use-cases for dynamically generating code inside your program
are incredibly rare, as the performance degradations are almost never
worth the capability.

with

The other frowned-upon (and now deprecated!) feature in JavaScript
that cheats lexical scope is the with keyword. There are multiple valid
ways that with can be explained, but I will choose here to explain it
from the perspective of how it interacts with and affects lexical scope.

with is typically explained as a shorthand for making multiple property
references against an object without repeating the object reference
itself each time.

var obj = {
	a: 1,
	b: 2,
	c: 3
};
// more "tedious" to repeat "obj"
obj.a = 2;
obj.b = 3;
obj.c = 4;
// "easier" short-hand
with (obj) {
	a = 3;
	b = 4;
	c = 5;
}

However, there’s much more going on here than just a convenient
shorthand for object property access. Consider:

function foo(obj) {
	with (obj) {
		a = 2;
	}
}
var o1 = {
	a: 3
};
var o2 = {
	b: 3
};
foo( o1 );
console.log( o1.a ); // 2
foo( o2 );
console.log( o2.a ); // undefined
console.log( a ); // 2 (oops, leaked global!)

In this code example, two objects o1 and o2 are created. One has an a
property, and the other does not. The foo(..) function takes an object
reference obj as an argument, and calls with (obj) { .. } on the
reference. Inside the with block, we make what appears to be a normal
lexical reference to a variable a, an LHS reference in fact (as
described earlier), to assign to it the value of 2.

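When we pass in o1, the a = 2 assignment finds the property o1.a
and assigns it the value 2, as reflected in the subsequent
console.log(o1.a) statement. However, when we pass in o2, since it
does not have an a property, no such property is created, and o2.a
remains undefined. Because no a is found in o2 or in any enclosing
scope, the assignment (in non-strict mode) creates a global variable a
instead, which is what the final console.log(a) statement shows.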
