Dynamic Web Servers

The simple mental model of a web server is that the path names a static file (HTML, image, text, etc.) that is returned.
However, the web server can interpret the path however it wants.
Typical web servers (e.g., the CIS web server) already treat some paths specially:
- Paths that begin with ~kurmasz are mapped to /home/users/kurmasz/public_html
- When requesting a directory, either index.html or a directory listing is returned
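To make "the server can interpret the path however it wants" concrete, here is a minimal sketch of the two special cases above. The document root /var/www/html and the function name resolve are assumptions made for illustration; this is not how the CIS web server is actually implemented, and a real server must also reject paths containing "..".

```python
from pathlib import PurePosixPath

USER_ROOT = PurePosixPath("/home/users")        # from the ~kurmasz example
DOCUMENT_ROOT = PurePosixPath("/var/www/html")  # assumed document root


def resolve(url_path, is_directory=False):
    """Map a URL path to the file system path the server would read."""
    parts = [p for p in url_path.split("/") if p]
    if parts and parts[0].startswith("~"):
        # /~name/rest  ->  /home/users/name/public_html/rest
        user = parts[0][1:]
        path = USER_ROOT.joinpath(user, "public_html", *parts[1:])
    else:
        path = DOCUMENT_ROOT.joinpath(*parts)
    if is_directory:
        # A directory request falls back to its index.html
        path = path / "index.html"
    return path
```

For example, resolve("/~kurmasz/notes.html") yields /home/users/kurmasz/public_html/notes.html.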
We can add “special” commands to our sample web server
- See ServerWithWhoCommand.java
- If the path happens to be who, whoText, or exit, the server does something special (as opposed to simply returning the file)
We would like to generalize this.
- Our current model requires adding code to the server (and re-compiling the server)
- That is certainly not efficient (and probably not secure)
- We want a way to have a server run code that isn’t part of the server (preferably a secure way)
- The high level concept is called “fork and exec”: One program can launch another
- Web server launches another program that generates the web page on the fly.
- The web server and the “child” program communicate through a kind of stream called a “pipe”.
- Idea: Have web server assume files in a special place (historically cgi-bin) are programs to be executed.
- Look at DynamicServer1.java
  - cgi-bin\Who.java generates a web page
  - cgi-bin\Now.java is a simple clock.
- Notice that we can add new commands without recompiling or restarting the server.
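The idea of a server launching a child program and reading its output through a pipe can be sketched in a few lines. This is a Python analogue written for illustration, not the code in DynamicServer1.java; CHILD_SOURCE stands in for a program that would live in cgi-bin, and the function names are hypothetical.

```python
import subprocess
import sys

# Stand-in for a cgi-bin program: it writes an HTML page to standard output.
CHILD_SOURCE = (
    "import datetime\n"
    "now = datetime.datetime.now().strftime('%H:%M:%S')\n"
    "print('<html><body><p>The time is ' + now + '</p></body></html>')\n"
)


def run_child(argv):
    """Launch a child process and collect what it writes to its stdout pipe."""
    result = subprocess.run(argv, stdout=subprocess.PIPE, check=True)
    return result.stdout


def build_response(body):
    """Wrap the child's output in a minimal HTTP response."""
    headers = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return headers.encode("ascii") + body


page = run_child([sys.executable, "-c", CHILD_SOURCE])
response = build_response(page)
```

Because the server only needs the name of the program to run, new programs can be dropped into the special directory without touching the server itself.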
We want a way of writing code that generates HTML that isn’t tedious.
- Look at data/cgi-bin/Who.java. Having to surround all your HTML with string.append would get tiresome quickly.
- Perhaps Java, C, C++ aren’t the best languages for generating HTML (or, rather, long strings in general)
  - https://media.allauthor.com/images/quotes/img/charles-v-quote-i-speak-spanish-to-god-italian-to-women.jpg
- Scripting languages (Perl, Python, Ruby, etc.) tend to be a bit more flexible.
- Examine data/cgi-bin/findSomeone.pl.
  - Notice most of the HTML is in the “here document” at the bottom.
  - This is better; but, it still buries the HTML inside another document.
- How could we do better?
- Examine data/cgi-bin/findSomeone.thtml and findSome.rb
  - thtml file focuses on html with a couple of placeholders
  - rb file does the computation.
  - placeholders substituted in at the last minute.
- Does this sound like anything you know / know about?
- This is how PHP got started.
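The "HTML with placeholders plus a separate computation" split can be sketched with Python's standard string.Template. The template text, the compute function, and the user list are all hypothetical stand-ins; they are not the contents of the .thtml and .rb files mentioned above. The sketch also skips HTML escaping, which a real page would need.

```python
from string import Template

# Mostly-complete HTML with a couple of placeholders.
PAGE = Template("""<html>
<body>
  <h1>Results for $query</h1>
  <p>$count users matched.</p>
</body>
</html>""")


def compute(query, users):
    """Do the computation separately and return only the key variables."""
    matches = [u for u in users if query in u]
    return {"query": query, "count": len(matches)}


# Placeholders are substituted at the last minute.
html = PAGE.substitute(compute("ku", ["kurmasz", "smith", "kumar"]))
```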
PHP was the start of many attempts to simplify web programming
- Began as some macros (similar to what I did above) and expanded into a full programming language.
- PHP by default just spits out the text of the file.
  - Run php chadh01.html
- Anything inside a <?php ... ?> tag (e.g., <?php echo "This is PHP" ?>) is interpreted as PHP code.
  - In other words, you can mix the actual code into the HTML
  - as much or as little as makes sense for your particular application.
  - Run cgi-bin\helloWorld.php
  - This is similar to the Ruby behavior; but, it lets you do more than just drop variables in.
- PHP is more tightly integrated into web servers. Therefore, it has many built-in functions for tasks that are common in web programming:
  - Fetching query string parameters
  - Reading/Setting headers (including cookies)
  - Handling form data.
- You are welcome to use PHP for your project; but, lecture will focus on more recent developments.
  - PHP was originally designed for small amounts of code.
  - PHP was not originally OO; OO was “bolted” on to the side later.
  - Thus, PHP can feel awkward at times, especially when compared to more recently developed tools.
    - For example, large code-only files must be entirely wrapped in <?php .... ?>
Other common (and once common) templating engines
- ERB (Ruby)
- JavaServer Pages, JSP (Java)
- Razor (C# and other .NET)
- Jinja (Python – used by Flask)
PHP is not bad. It just didn’t have the benefit of hindsight that more modern examples have.

Recap

Servers can generate HTML on the fly
- e.g., generating a directory listing
Servers can launch an external program whose output is a web page.
1. Could be just any other program that generates HTML in an ad hoc way.
2. Could be a two-part system: code calculates several key variables, then they are substituted into a mostly-complete HTML document (e.g., the Ruby example I showed)
3. HTML and code are mixed

Many of the first “web apps” used technique #1 with Perl.
- “Here” documents were a transition to technique #2.
Remember: High-level ideas are what is important, not the language details.
PHP was one of the first to employ technique #3:
- Document is basically HTML, but can switch to running code.
- Look at colorDemo.php in SampleCode/Templates
- Some code sections generate HTML. Others just contain code (e.g., set variables, define functions)
- You can even define methods inside PHP
Sometimes the PHP code itself generates HTML tags because the alternative (repeatedly opening and closing PHP sections) gets ugly.
Notice the PHP section that contains just a closing curly brace.
The last section uses a more modern functional technique (array_map); it’s a bit awkwardly implemented in an older language like PHP.
Ruby cleaned things up a bit with ERB. (See colorDemo.html.erb)
- Tag is shorter <%= vs <?php
- Two types of tag:
  - <%= for code that generates output (i.e. no need for explicit echo)
  - <% for code that produces no output (e.g., variable definitions)
- The last section shows a more modern functional programming approach
  - Less switching between HTML and code; but,
  - Code not as easy to read — especially if you are new to functional techniques.
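The trade-off between the loop style and the functional style shows up in any language. Here is a small Python sketch of both, generating the same list; the colors list and function names are made up for illustration and are not taken from the colorDemo files.

```python
from html import escape

colors = ["red", "green", "blue"]  # hypothetical data


def list_with_loop(items):
    """Loop style: build the output one statement at a time."""
    out = ["<ul>"]
    for item in items:
        out.append(f"  <li>{escape(item)}</li>")
    out.append("</ul>")
    return "\n".join(out)


def list_with_map(items):
    """Functional style: map each item to its HTML, then join."""
    rows = map(lambda item: f"  <li>{escape(item)}</li>", items)
    return "\n".join(["<ul>", *rows, "</ul>"])
```

Both functions return the same string. The second is more compact, but it takes some familiarity with map and lambda to read comfortably.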
EJS is a JavaScript library that behaves similarly.
Jinja is a Python library that behaves similarly.
These template engines (PHP, ERB, EJS) look a lot like HTML.
There is a second style of template that provides “shortcuts”.
Show colorTemplate.haml
- No closing tag. Instead contents of tag are determined by indentation.
- This is also true of code. (Notice the “foreach” loops.)
- Shortcuts for adding classes and ids. (li.important or li#first)
- The = is a shortcut for grabbing variables.
Pug (formerly Jade) is a HAML-style template engine for JavaScript
ERB-style is easier to get started with; but colleagues at AO claim that HAML is better once you get used to it.
There are a few “#2”-style template engines, including Mustache and Handlebars
- Simpler, but not as powerful.
- Mustache is a syntax that has been implemented in many languages.
- Handlebars is a JavaScript-only extension to Mustache.
- Both are a play on the appearance of this syntax: {{ }}
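To show how little a “#2”-style engine needs to do, here is a minimal sketch of {{ }} substitution. It handles only simple variable tags; real Mustache also has sections, partials, and more, so treat this as an illustration of the idea rather than an implementation of the Mustache syntax. The names PLACEHOLDER and render are hypothetical.

```python
import re
from html import escape

# Matches {{name}} or {{ name }}
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, values):
    """Replace each {{key}} with its HTML-escaped value (empty if missing)."""
    def lookup(match):
        return escape(str(values.get(match.group(1), "")))
    return PLACEHOLDER.sub(lookup, template)
```

For example, render("Hello, {{name}}!", {"name": "Ada"}) produces "Hello, Ada!". There is no way to run arbitrary code from the template, which is why this style is simpler but not as powerful.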
My recommendation:
- If you plan to work with PHP, this is a good opportunity to use PHP.
- If you plan to work with Rails, this is a good opportunity to use ERB.
- Otherwise, pick one of the JavaScript engines. (Both work with Express)

Node / NPM

Node is a JavaScript engine that you can run from the command line.
Show a simple ‘hello world’ app in a clean directory.
npm is the Node Package Manager. It makes it easy to install libraries for your use.
- You must begin with npm init. (Accept the defaults for now.)
  - This creates a package.json file describing the app and its dependencies.
- At a high level, there are two types of packages:
  1. Libraries containing code to import into your own JS projects.
  2. Command-line tools written in JavaScript.
- Use npm install foo to install packages.
  - By default, packages are installed locally to your node_modules directory.
  - Adding the --save flag will update the dependency list in package.json. (npm 5 and later do this by default.)
  - Adding the -g flag will install the module globally (i.e., make it accessible to all your projects.)
  - I would not use -g for libraries: It could lead to version conflicts.
  - You may consider using -g to install command-line scripts
  - If you choose to install a command-line script locally, you can access it through node_modules/.bin

Input from web apps

For web apps to be most useful, they need input from the user
- e.g., Name of city to get the weather in.
Two main sources of input: GET and POST

GET

The last part of the URL is the query string
- A ? followed by a set of key-value pairs
- Key-value pairs are separated by &
- http://www.weather.com/daily?city=Allendale&state=Michigan
- Some characters (e.g., a space) need to be percent-encoded
  - Space becomes %20
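Building and decoding a query string by hand is error-prone, so most languages provide helpers. A short sketch using Python's standard urllib.parse follows; the city "Grand Rapids" is chosen only because it contains a space. Note that urlencode encodes a space as + by default, so quote_via=quote is passed to get the %20 form described above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs, quote

params = {"city": "Grand Rapids", "state": "Michigan"}

# Key-value pairs joined with &; the space is percent-encoded as %20.
query = urlencode(params, quote_via=quote)
url = "http://www.weather.com/daily?" + query

# Going the other way: pull the query string off the URL and decode it.
# parse_qs returns a list per key because a key may appear more than once.
decoded = parse_qs(urlsplit(url).query)
```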
Now, we need a way of passing the query string into the code.
The convention is to use environment variables
- Specifically, the query string is passed in a variable named QUERY_STRING
- Server typically passes other helpful information in other variables
  - Request headers
  - DOCUMENT_ROOT
  - REQUEST_URI
  - SERVER_PORT
  - Show environment.php
    - (Can either run from link on course web page, or launch php -S 127.0.0.1:9000 from local directory.)
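From the child program's point of view, the environment-variable convention looks roughly like this. The sketch takes the environment as a parameter so it can be tried without a server; the function name handle_request and the plain-text output format are assumptions made for illustration.

```python
import os
import sys
from urllib.parse import parse_qs


def handle_request(environ):
    """Build a response from the variables the server placed in the environment."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    city = params.get("city", ["unknown"])[0]
    port = environ.get("SERVER_PORT", "?")
    # Header, blank line, then the body.
    return (
        "Content-Type: text/plain\r\n"
        "\r\n"
        f"city={city} port={port}\n"
    )


if __name__ == "__main__":
    # When launched by a server, the real environment supplies the values.
    sys.stdout.write(handle_request(os.environ))
```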
Notice that PHP does some pre-processing of these variables to make things easier for the programmer.

Forms

Used to collect data in a web page and send it to the server.
Show formDemo.php.
- Highlight syntax and different types of input.
- Highlight how name and value are used.
- Highlight the change in the query string.
The action may be, but need not be, the same page. (In this case, it is the same page.)

POST

The second way to send data to the server is to POST it.
Data is sent in the body of the HTTP request rather than in the query string.
- Remember that the HTTP request headers end with a blank line. POST data follows that blank line.
- Show how to observe values sent in Chrome developer tools.
- Notice that PHP helpfully parses this data and places it in variables so it’s ready to go.
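To see where POST data lives, here is a sketch that splits a raw request at the blank line and decodes the body. The request text is hand-written for illustration (the path and host are assumptions), and the header parsing is deliberately simplified: it ignores the fact that header names are case-insensitive.

```python
from urllib.parse import parse_qs

BODY = b"city=Allendale&state=Michigan"
RAW_REQUEST = (
    b"POST /formDemo.php HTTP/1.1\r\n"
    b"Host: localhost:9000\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: " + str(len(BODY)).encode("ascii") + b"\r\n"
    b"\r\n"  # blank line: end of headers, start of body
    + BODY
)


def split_request(raw):
    """Separate the request line, the headers, and the body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = dict(line.split(": ", 1) for line in lines[1:])
    return lines[0], headers, body


def parse_form(raw):
    """Decode form data using the same key=value&key=value format as a query string."""
    _, headers, body = split_request(raw)
    length = int(headers.get("Content-Length", 0))
    return parse_qs(body[:length].decode("ascii"))
```

The body uses the same encoding as a query string; only its location in the request differs.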

GET vs POST

When to use GET and when to use POST?
GET should be used for idempotent actions.
What does idempotent mean?
Why should GET be idempotent?
- Philosophical: Fits original intent of GET and POST
- Practical: Query strings can be bookmarked (i.e., easy to repeat).
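One way to picture the difference is with a toy in-memory "server". The inventory, the function names, and the idea of an order list are all hypothetical; the point is only that repeating the first call changes nothing, while repeating the second keeps changing state.

```python
inventory = {"widgets": 5}
orders = []


def get_stock(item):
    """GET-style action: read-only, so repeating it has no further effect."""
    return inventory[item]


def post_order(item):
    """POST-style action: every call records another order and reduces stock."""
    inventory[item] -= 1
    orders.append(item)
    return len(orders)
```

If a bookmarked or reloaded URL triggered post_order, each visit would place another order, which is why such actions do not belong behind GET.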
How about security?
A common rumor is that GET is less secure. However,
- With HTTP, both GET and POST values can be “sniffed”.
- With HTTPS, both GET and POST values are encrypted.
However, the “bookmarkability” of GET data / query strings makes it easier to accidentally leak data.

Other verbs

Rather than thinking about objects on the web primarily as files (either .html or code), many frameworks prefer to think of objects on the web as resources (e.g., books, toys, passengers, etc.).
- There are four basic actions you can do to a resource: CRUD
- What does CRUD stand for?
  - Create
  - Read
  - Update
  - Delete
- HTTP has added verbs to support these actions
  - PUT: Supply a complete, modified version of a resource
  - PATCH: Supply instructions for updating/modifying a resource without completely replacing it.
  - DELETE: Delete a resource
- HTML forms can only use GET and POST, so web frameworks have to use workarounds for updates and deletes.
- https://en.wikipedia.org/wiki/Patch_verb
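One common workaround is for the form to POST a hidden field naming the intended verb, which the framework then uses in place of POST. The field name _method is the convention used by several frameworks, but the function below is a hypothetical sketch, not any particular framework's API.

```python
ALLOWED_OVERRIDES = {"PUT", "PATCH", "DELETE"}


def effective_method(http_method, form_fields):
    """Return the verb the application should act on.

    Only a POST may be overridden, and only to one of the update/delete verbs.
    """
    override = form_fields.get("_method", "").upper()
    if http_method == "POST" and override in ALLOWED_OVERRIDES:
        return override
    return http_method
```

For example, a form posting the hidden field _method=delete would be routed as a DELETE, while a GET carrying the same field is left alone.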