GVSU CIS 343
The Main Tradeoff
- Languages make many tradeoffs
- In my opinion, the main tradeoff is between performance and expressiveness
- The best performing languages require you to frame the problem from the computer’s perspective
- Machine code
- assembly
- C
- At every step you are (more or less) specifying exactly what the machine should do (this is called imperative programming)
- Because you have such fine-grained control over the low-level operation, it is more straightforward to optimize.
- However, it would be much easier if we could simply describe what we want the computer to do using the techniques we use to describe problems to each other.
- When doing this, we then need to translate the program into machine instructions. The further we are from the machine language, the less efficient the translation is.
- We initially started with machine language, then slowly increased abstraction:
- First languages were machine languages: 1s and 0s.
- Created assembly languages to represent the 1s and 0s in human-readable form.
- Each assembly language statement corresponded to one machine language statement.
- Still had to rephrase the problem in a hardware-centric way, but it was easier to type.
- The next steps were languages that automatically converted formulas and more complex operations into assembly:
- Fortran (FORmula TRANslation)
- C
- At first only moved away from machine language as far as existing compiler technology would allow. Hardware was expensive, which limited how complex compilers could get.
- Also, consider the following
for (int i = 0; i < size; ++i) {
c[i] = a[i] + b[i];
}
vs.
for (int i = 0; i < size; i += 2) {
    c[i] = a[i] + b[i];
    c[i+1] = a[i+1] + b[i+1];
}
vs.
int* end = c + size;  /* pointer arithmetic already scales by sizeof(int) */
while (c < end) {     /* unrolled by 6; assumes size is a multiple of 6   */
    *c = *a + *b;
    ++a; ++b; ++c;
    *c = *a + *b;
    ++a; ++b; ++c;
    *c = *a + *b;
    ++a; ++b; ++c;
    *c = *a + *b;
    ++a; ++b; ++c;
    *c = *a + *b;
    ++a; ++b; ++c;
    *c = *a + *b;
    ++a; ++b; ++c;
}
The top code is the easiest to read, but the bottom code is the fastest. Over time, compilers have improved to the point that they can deliver the “bottom” performance from the “top” code.
- Then, in the late 50s and early 60s, we had enough computing power to re-think languages from the perspective of the programmer:
- What types of things do programmers want that machine code / assembly language doesn’t provide?
- variables with meaningful names. (Early languages limited the names you could choose)
- loop constructs. (Initially done with branch / goto)
- scope / encapsulation (The ability to re-use variable names and/or treat functions independently)
- data structures (structs, records, unions, objects, etc.)
- In what ways do you conceive of problem solutions differently from a computer?
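A small C sketch of those conveniences in action; the comments note what an assembly-era programmer would have had to manage by hand:
#include <stdio.h>

/* data structure (struct): in assembly this is just raw memory offsets */
struct Point {
    double x;
    double y;
};

double sum_x(struct Point pts[], int count) {
    double total = 0.0;                  /* meaningful name instead of a register    */
    for (int i = 0; i < count; ++i) {    /* loop construct instead of compare + goto */
        total += pts[i].x;
    }
    return total;                        /* total and i are scoped to this function  */
}

int main(void) {
    struct Point pts[] = { {1.0, 2.0}, {3.0, 4.0} };
    printf("%f\n", sum_x(pts, 2));
    return 0;
}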
- Natural language is, of course, still science fiction, but people looked at ideas “in the middle” of the tradeoff spectrum
- Early innovations (scope, data structures) helped programmers better organize and keep track of what they were doing.
- Object-Oriented Programming (e.g., Smalltalk)
- Allowed programmers to switch primary focus from algorithms to data.
- Earliest programs were primarily scientific calculations and, therefore, very algorithm-centric
- As computers became more mainstream, they were used more and more for business, which is more data-centric.
- Inheritance is an important type of re-use.
- Functional Programming (e.g., LISP)
- Logic Programming (e.g., Prolog)
- In my opinion, the #2 tradeoff is between writeability and reliability.
- Languages that are easier to write (think Perl, JavaScript) tend to have less strict syntax and/or type checking. However, this means the compiler can detect fewer errors, potentially leading to more run-time errors.
- From here on, most of what you see is
- different approaches for increasing expressiveness while maintaining reasonable performance and, to a lesser extent,
- approaches for writing code more succinctly without making the code hard to maintain/debug.
Other key features / tradeoffs
(In other words, what makes a language “good”?)
- Limited / careful use of operator overloading
- How many different ways is * used in C? (See the sketch below.)
  - Multiplication
  - Pointer declaration
  - Pointer dereferencing
- How many different ways is & used in C++?
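A minimal C sketch of how one symbol takes on several meanings depending on context:
#include <stdio.h>

int main(void) {
    int x = 6, y = 7;
    int product = x * y;   /* 1: multiplication                          */
    int *p = &x;           /* 2: pointer declaration (& is "address of") */
    int value = *p;        /* 3: pointer dereferencing                   */
    printf("%d %d\n", product, value);
    return 0;
}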
- Names have a clear, obvious meaning
- What does static mean in C/C++? (See the sketch below.)
- What does grep mean in the UNIX environment? Is there any way to guess this based on general computing knowledge?
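A quick C sketch of two of static's unrelated meanings (C++ adds more, such as static data members):
#include <stdio.h>

/* At file scope, static means internal linkage: the function is
 * invisible to other .c files.                                   */
static int call_count(void) {
    /* Inside a function, static means the variable lives for the
     * whole program run instead of one call.                     */
    static int count = 0;
    return ++count;
}

int main(void) {
    call_count();
    call_count();
    printf("called %d times\n", call_count());  /* prints 3 */
    return 0;
}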
- Orthogonality vs. obscure behavior
- Ideally, keywords and constructs have the same behavior regardless of context
- The C pointer can be applied to any data type
- Any data type can be a return value in Java, but you can’t return an array in C (see the sketch below).
- However, you don’t want to do this at the cost of having obscure, strangely-defined behaviors.
- Poor overloading is one example of poor orthogonality.
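The sketch below shows the C inconsistency: a function cannot return an array directly, but it can return a struct that wraps one.
#include <stdio.h>

/* Not legal C: an array cannot be a return type.
 *   int make_triple(void)[3] { ... }
 * But a struct containing an array can be returned by value: */
struct Triple { int values[3]; };

struct Triple make_triple(void) {
    struct Triple t = { {1, 2, 3} };
    return t;
}

int main(void) {
    struct Triple t = make_triple();
    printf("%d %d %d\n", t.values[0], t.values[1], t.values[2]);
    return 0;
}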
- Expressiveness vs. Simplicity
- In C/Java, how many ways can you think of to add 1 to count?
  - count = count + 1
  - count += 1
  - ++count
  - count++
- What is the tradeoff?
- Allowing multiple options makes it easy / more concise to write, but potentially harder to read. (Person reading the code needs to know all the “tricks”)
- Taken to the extreme, you can have “write only” code (like Perl)
- Balance: Can easily write what you want, but only need to know a few constructs to both read and write the code.
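One of those “tricks” in a small C sketch: as statements the four forms are interchangeable, but as expressions ++count and count++ are not.
#include <stdio.h>

int main(void) {
    int count = 5;
    printf("%d\n", ++count);  /* pre-increment: increments first, prints 6 */

    count = 5;
    printf("%d\n", count++);  /* post-increment: prints 5, then increments */
    printf("%d\n", count);    /* 6 either way once the statement finishes  */
    return 0;
}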
- “Writeability” vs. Reliability
- Reliability refers to how likely you are to catch bugs.
- In general, the more errors the compiler / environment can catch, the sooner and more easily you can fix them.
- Many of these “catches” require type checking.
- Better type-checking generally requires being more explicit when writing code (think Java). Less type-checking allows for more “short-cuts” (think Ruby, JavaScript, and other scripting languages), but many mistakes then can’t be detected until run-time, where they are harder to precisely identify and fix.
- In general, the less you type, the harder it is for the compiler to catch mistakes for you.
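A C sketch of the idea (the function names are made up for illustration): the less type information the compiler is given, the fewer mistakes it can catch before the program runs.
#include <stdio.h>

double first_loose(void *data) {           /* accepts any pointer: nothing to check        */
    double *d = data;
    return d[0];
}

double first_strict(const double *data) {  /* wrong pointer type is flagged when compiling */
    return data[0];
}

int main(void) {
    int ints[3] = {1, 2, 3};

    /* Compiles cleanly, but prints a nonsense value: the mistake
     * only shows up at run time.                                  */
    printf("%f\n", first_loose(ints));

    /* The compiler rejects (or at least warns about) this call, so
     * the mistake never reaches a running program:
     * printf("%f\n", first_strict(ints));                          */
    return 0;
}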
- Readability
- Limited number of operators and reserved words (COBOL has hundreds of reserved words)
- Even if you don’t use them, you need to know to avoid them when choosing variables.
- Enough data types so you don’t have to do awkward things (like use ints for bools in C)
- Prohibits variable names that match special words (e.g., no variables named “while”)
- Which is better: braces for blocks ({ and }), or more distinct block endings like end if, end while, etc.?
- What about the nice, readable end if vs. the concise but goofy-looking fi and od?
- Readability / Writability tradeoff:
use v5.10;
while (<STDIN>) {
    chomp;
    say if (/z/);
}
vs.
use v5.10;
while ($_ = <STDIN>) {
    chomp $_;
    if ($_ =~ /z/) {
        say $_;
    }
}
- Support for Abstraction
- DRY: Don’t repeat yourself.
- How easily do languages let you re-use code and/or data structures?
- How can you re-use code / data structures across types without complex syntax for type parameters (e.g., Java generics or C++ templates)? See the sketch below for one of C’s workarounds.
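One sketch of that workaround (the macro name is made up for illustration): the C preprocessor can stamp out a copy of the algorithm per type, which is exactly the kind of complex syntax that generics and templates were designed to replace.
#include <stdio.h>

/* A crude "template": expand the same algorithm once per type. */
#define DEFINE_MAX(T)              \
    static T max_##T(T a, T b) {   \
        return a > b ? a : b;      \
    }

DEFINE_MAX(int)     /* generates max_int    */
DEFINE_MAX(double)  /* generates max_double */

int main(void) {
    printf("%d\n", max_int(3, 7));
    printf("%f\n", max_double(2.5, 1.5));
    return 0;
}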
Static vs. Dynamic typing
- Static typing
- Compiler makes sure all operations are legal before the program begins.
- e.g., complains if you call a bark method on a Cat object.
- Compiler can catch more errors, but
- Can make code more verbose.
- Think about interfaces / generics in Java
- Dynamic typing
- Validity of operations not checked until program is running.
- Cuts out annoying syntax, but
- Leads to more run-time errors. (e.g., pass an object to a “sort” method that doesn’t have a “compare” operation)
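A C sketch of the contrast (the types are made up for illustration): the static checker rejects the bad call before the program even exists, while a dynamically typed language would only notice when the call actually executes.
#include <stdio.h>

struct Dog { const char *name; };
struct Cat { const char *name; };

void bark(struct Dog *d) {
    printf("%s says woof\n", d->name);
}

int main(void) {
    struct Dog d = { "Rex" };
    struct Cat c = { "Whiskers" };

    bark(&d);       /* fine: the types match                             */
    /* bark(&c); */ /* static typing: rejected at compile time           */
                    /* dynamic typing: the equivalent mistake would only */
                    /* surface (if at all) while the program is running  */
    return 0;
}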
Compile vs. Interpret
- Compile
- Main advantage: speed
- Tend to be statically-typed (thereby promoting reliability)
- Pure interpreted
- Main disadvantage: 10x to 100x slower.
- Tend to be dynamically typed (thereby promoting writeability)
- Hybrid
- Key example: Java, which is compiled to bytecode that the JVM then interprets and/or JIT-compiles.