Forty Years of Computer Languages


Languages for the internet

The internet may well be the major event of the new century; it has been an epochal evolution, which has changed many aspects of our lives.

At the beginning of the internet era many, especially in the US, thought that internet technologies could instantly generate an economic boom. This proved to be false: some firms, having the right product at the right moment, grew and became big, very big vendors; but the change was slower and less profitable than expected, and most firms failed. It was the so-called dot-com bubble. ( http://en.wikipedia.org/wiki/Dot-com_bubble )

In spite of that, the internet was a slow but important revolution for everyone; today people have instant access to an enormous mass of information, spread all over the world, in real time: everyone can show the whole world a picture of his cat or his favorite food... today the problem is to get rid of all the useless information on the net, to find what you need.

People have forgotten the situation we lived in forty years ago, when we had newspapers, but specific information and specialized books were expensive and difficult to find. You had to guess which book you needed (you couldn't read any excerpts), then go to the bookstore to order it. If it was printed by a foreign publisher, there was something named the "cambio librario": a special exchange rate for books, by which you paid a dollar twice its value, and the book arrived (if you were lucky) after months.

At that time even specialized shops had a limited choice of merchandise; some goods were impossible to find in your town. Now (2019), electronic commerce, thanks to the giants of distribution (such as Amazon), allows even little shops to sell items worldwide, and people are overwhelmed with commercial offers for all kinds of goods. Sometimes there are problems dealing with distant and unknown sellers, but more often you can have goods at a decent price within a few days.

In the US, the internet began in the eighties, connecting academic institutions, funded by the National Science Foundation. Commercial entities began to connect in the nineties, changing the way the internet was conceived, with advertisements, trademark problems, online shops, and so on.

Europe followed with a delay of some years, and Italy followed Europe with an additional delay, mainly due to the existence of a national monopoly in the telecommunications market. Italian institutions were also very vendor-driven: there were big commercial agreements at a high level, and peripheral branches always had difficult interactions with vendors. In my institute we could have what they sold us, which was not always the best equipment for our job.

At that time (1980-1990) each vendor had its own computer network: we had IBM SNA, connecting the big IBM mainframes, and Decnet, connecting the Digital VMS machines. There was a period in which, on the same thin coaxial cable, we had all of them: Decnet, Appletalk, Novell and, later, TCP/IP.

But we were connected before the internet had wide acceptance in the country. In the late eighties we used mainly Decnet, connecting together the VAX computers of the Italian astronomical institutes (AstroNet) and HEPnet, the network of the high energy physics institutes, which had Digital VAX computers running the VMS operating system. We could share documents, do remote processing and send mail, all on textual terminals; there were no graphical user interfaces. I learned to use VMS and the Decnet network at Bologna University, where I was studying Astronomy.

When my group bought a Microvax-II (~1988), I began to use and manage it: the person in charge of the Microvax wasn't very interested in it, but I needed that machine. Most of the work done slowly on the big IBM of the institute, with its rules and queues, could be done more easily and in less time with that little machine, so I began to act as a system manager, and also had to manage the Decnet network.

There was a great confusion of network protocols at that time, with many interoperability problems; there were gateways connecting the different networks, each one with its own rules, and we had to write convoluted addresses even to send a simple mail. In that situation official entities began a standardization effort. The OSI network model was defined around 1984. This is the network model that is taught today at universities and cited everywhere; but I have seen only a single implementation of that model: "DECnet Phase V", realized by Digital for their Decnet network around 1990.

We migrated our Decnet network to OSI in 1993; it was a very special event, when the whole Decnet network migrated, everywhere, at the same time. But in a few years the Decnet network dissolved, together with SNA and all the other networks of that period. The success of the Unix workstations led the world to their network protocol: TCP/IP.

TCP/IP is not as carefully layered and structured as ISO/OSI, but it works very well: it was born to connect a few computers but, thanks to some dirty tricks [1], it is still alive in 2019, when the whole world is connected. The ISO/OSI model remains in the networking books.

Unix, with the TCP/IP network, began to be extensively used at my institute only around 1994, when IBM sold us a number of RISC/6000 machines running AIX, IBM's version of Unix; we also had some Sun, Silicon Graphics and Digital workstations, each one with its own version of the Unix operating system. We had little support for these systems from the IT department, accustomed to IBM mainframe management, so I began to manage the Unix systems and study TCP/IP; and, to have a decent network connection, I had to maintain the little TCP/IP LAN of the building, with about one hundred computers connected.

In the following years the increase in network speed was impressive; in 1998 I had a 33.6 kbit/sec dial-up line at home and a 2 Mbit/sec line at the office, and many connections between scientific institutes were made by expensive 512 kbit/sec lines. In Italy, due to the monopoly on telephone lines, it was impossible to connect buildings separated by a public street without leasing an expensive line from the monopolist (Telecom).

But progress in electronics led to lower prices for connection equipment and to the introduction of new digital transmission technologies such as ADSL. Around 1999 commercial ADSL offers became available to the general public at an affordable price; around 2000 we had, in Italy, the first ADSL offers with a speed of 640 kbit/sec.

Before the web was born we had "Gopher", an information system developed at the University of Minnesota as a way to distribute documents. It had information organized in a hierarchical structure, with a hierarchy of servers, each one with its documents. Newsgroups [2] were also important at that time. Some documents were also shared by mail, using the listserver system of the IBM network, or stored on public ftp sites [3].

Then the World Wide Web arrived, with its new way of distributing information, and, around 1990, everything began to change.

The web was the idea of Tim Berners-Lee, then working at CERN as a software engineer (1989); he needed a way to share documents among research groups using different computer systems. He conceived a system following a client-server pattern: there is a server program, running on a computer holding the documents (the web server), and client programs (web browsers), running on the computers of the users, which request web pages and display the information on the screen.

The transmission protocol (named HTTP) is very simple; it consists mainly of simple textual request messages, such as GET, POST, HEAD etc., transmitted over an ordinary TCP connection. The requested text is transmitted back to the browser by the server. In the original implementation, when a request had been satisfied, the TCP connection was closed. The HTTP protocol is stateless: it keeps no track of related requests.
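
To give an idea, a request-response exchange of that era looked more or less like the following sketch (the file name here is invented): the browser opens a TCP connection to the server and sends a textual request; the server answers with a status line, a few headers, a blank line and the page itself, then closes the connection:

    client request:

        GET /index.html HTTP/1.0

    server response:

        HTTP/1.0 200 OK
        Content-Type: text/html

        <HTML> ... the requested page ... </HTML>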

Web documents (web pages) are written in HTML (Hyper Text Markup Language). HTML is not a real programming language; HTML pages consist of simple text, in which some words (tags) have a special meaning and specify the structure of the document. Some tags are links: addresses of other pages, which can be retrieved by clicking on the link itself; this structure of freely interconnected documents is called hypertext. These addresses (also named URLs: Uniform Resource Locators) have a precise syntax, which specifies the transmission protocol, the server address and the path to the referenced file.
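
For example, in an address like the following (an invented one), the three parts are easy to recognize:

    http://www.example.org/docs/index.html

    http                  the transmission protocol
    www.example.org       the address of the server
    /docs/index.html      the path of the file on the server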

The original idea was that the author of the HTML page defines the logical structure of the document with these HTML tags, and that the browser is free to render that structure on the screen as it wishes, decoupling the logical structure from the presentation. For this reason the original HTML was very simple: no figures, no media. Pictures, tables and frames were added later, when graphical browsers were introduced; all this was far from the original idea of Berners-Lee.

The first graphical browser we had was X-Mosaic, developed by Marc Andreessen at NCSA (the National Center for Supercomputing Applications, at the University of Illinois). I still remember when we succeeded in installing X-Mosaic on a Decstation and saw, in spite of our slow connection, the first images coming from the net; obviously pictures of pretty girls, put on line as a joke by the Electronics Department of Pisa University.

I began to install web servers on all my computers around 1999. I found that using a web server to distribute large amounts of data is easy and secure: you put your data on your server, then send a mail to your colleagues with the server address, and they can download the data when they have time, with no hassle.

I also made some simple web sites; at that time I used version 3.2 of HTML, with hand-made pages, without CSS and all the intricate stuff that is used today on the web. At the beginning HTML was very simple; here is a very simple page:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-HTML40/loose.dtd">
<HTML>
<HEAD>
    <TITLE> A Sample page </TITLE>
</HEAD>

<BODY>
   <CENTER><H1> A header here (fonts are big), centered  </H1></CENTER>
   <P>
      A paragraph here
   </P>
   <HR> A horizontal line, followed by a list:

   <UL>
     <LI> A first item of a list
     <LI> A second item
   </UL>
   <BR> This issues a line break
   <A HREF="http://www.helldragon.eu"> A link to my web site </A>
</BODY>
</HTML>

The structure is easy to understand: there are tags, like <P>, <UL>, <LI>, defining the structure of the text. They are mixed with some tags for formatting the page, such as <BR> for a line break, or <HR> for a horizontal line; most tags have a corresponding ending tag, prefixed with a slash.

There are two sections: the "head", with some meta information for the browser, then the "body", with the page to render. Simple and easy. My wife could even teach primary school pupils to write HTML pages (see the page: Mini-corso di HTML ).

The transport protocol is also very simple; this simplicity is one of the reasons for the success of the web: everyone could write a page, and every programmer can write a simple server.
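
Just to show how little is needed, below is a sketch of a toy server, written in modern Javascript with Node.js (which of course did not exist at the time; the file name and port number are arbitrary). It reads the textual request and answers with a hand-made response over the same TCP connection:

// minimal-httpd.js : a toy HTTP server, only a sketch;
// run it with: node minimal-httpd.js
var net = require('net');

var server = net.createServer(function (socket) {
    socket.on('data', function (data) {
        // the first line of the request is like: "GET /index.html HTTP/1.0"
        var requestLine = data.toString().split('\r\n')[0];
        var path = requestLine.split(' ')[1];

        var body = '<HTML><BODY><H1>Hello</H1>' +
                   '<P>You asked for: ' + path + '</P></BODY></HTML>';

        // a minimal response: status line, one header, a blank line, the page
        socket.write('HTTP/1.0 200 OK\r\n' +
                     'Content-Type: text/html\r\n' +
                     '\r\n' + body);
        socket.end();   // as in the original HTTP: close after each request
    });
});

server.listen(8080);    // then point a browser at http://localhost:8080/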

The evolution of this simple scheme was due first to the Mosaic browser (1993), with its graphical capabilities, then to the Netscape Communications Corp., which dominated the internet market in the 90s.
In those years the web was growing exponentially, with commercial entities entering the market with their specific needs; Netscape added a lot to the simple scheme devised by Tim Berners-Lee: tables, forms, a programming language embedded in the page and executed by the browser (Javascript), encrypted connections etc.; the web changed.

In HTML the tags describing the content structure are mixed with tags for presentation. This is natural: human communication always mixes content and presentation; when speaking, more is often expressed by posture than by words. But in software a strict separation of content and presentation is useful. Another problem is that the simple tags defined by HTML don't specify the detailed appearance of the document when rendered on the screen.

CSS (Cascading Style Sheets) is an attempt to address these problems; CSS consists of detailed formatting statements associated with the document. Version CSS-1 allows control of backgrounds, fonts and list appearance, and defines a block model for elements, with margins, borders and padding. Style sheets "cascade": several specifications can be assigned to the same element: by the browser, by the page author in an external file, by the page author in a tag; and it's not always easy to guess which one wins. In spite of CSS, the situation remained confused.
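
A little sketch of the cascade at work (the class name and the colors are invented for the example): the same <P> tag can receive its color from a generic rule, from a more specific class rule, or from an inline style; the most specific specification wins:

<HTML>
<HEAD>
    <STYLE TYPE="text/css">
         P          { color: black ; }   /* generic rule for all paragraphs */
         P.warning  { color: red ;   }   /* class rule: more specific, wins */
    </STYLE>
</HEAD>
<BODY>
   <!-- rendered red: the class rule beats the generic P rule -->
   <P class="warning"> A warning paragraph </P>

   <!-- rendered green: the inline style beats both stylesheet rules -->
   <P class="warning" style="color:green;"> Another warning paragraph </P>
</BODY>
</HTML>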

There is an intrinsic ambiguity in the way web pages are shown: the display of a page depends both on the page features and on the browser; authors can't have real control over the browser, which runs on the client computer, interprets the page in its own way and can resize pages depending on screen size and user actions.

CSS-1 was still simple enough to be used, although its syntax is a bit more complex than HTML. Below is an example of a page which uses some CSS for formatting. In the <STYLE> part of the header, the font for paragraphs (tag <P>) and the color for the <H1> tag are specified. The <UL> tag has its own set of formatting specifications; such a group of reusable specifications is called a "class". In this example a class named "hlist" is created and used for the <UL> tag; the CSS attribute "display: inline" renders the list horizontally. A blue color is assigned by CSS to the last paragraph.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-HTML40/loose.dtd">
<HTML>
<HEAD>
    <TITLE> A Sample page </TITLE>

    <STYLE TYPE="text/css">
         P {  font-family: sans-serif ; font-size:large ; }
         H1 { color: yellow ; }

         ul.hlist li {  list-style-type: none;  display: inline; padding-right: 20px;}
    </STYLE>

</HEAD>

<BODY>
   <DIV align="center"><H1> An header here (fonts are big), centered  </H1></DIV>
   <P>
      A paragraph here
   </P>
   <HR> A horizontal line, followed by a list:

   <UL class="hlist">
     <LI> A first item of a list
     <LI> A second item
   </UL>
   <P style="color:blue;" > This paragraph is blue </P>
   <A HREF="http://www.helldragon.eu"> A link to my web site </A>
</BODY>
</HTML>

In spite of these evolutions, inconsistencies remain in the web languages, born in the chaotic situation of the browser wars between Netscape and Microsoft ( https://en.wikipedia.org/wiki/Browser_wars ): some formatting expressions are in CSS, but others are still attributes of the HTML tags (although mostly deprecated). The syntax of CSS and HTML attributes is not uniform: some functions are duplicated, some are similar but with different effects, and some are implemented in different ways by different browsers.

To add complexity there is Javascript, the language embedded in HTML pages and executed by the browser, used for special effects. Javascript can modify the page, its CSS specifications and its tags.
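
A minimal sketch of this kind of manipulation (the function name and the element id are invented for the example): a script in the header changes, when a button is pressed, both a CSS property and the content of a tag:

<HTML>
<HEAD>
   <TITLE> Javascript test </TITLE>
   <SCRIPT TYPE="text/javascript">
       // change the style and the text of the paragraph with id "msg"
       function repaint() {
           var p = document.getElementById("msg") ;
           p.style.color = "red" ;                       // a CSS property
           p.innerHTML = "Text changed by Javascript" ;  // the tag content
       }
   </SCRIPT>
</HEAD>
<BODY>
   <P id="msg"> Original text </P>
   <INPUT TYPE="button" VALUE="Change the paragraph" onClick="repaint()">
</BODY>
</HTML>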

Around 2000, writing web pages was a real pain: Javascript functions were needed to adapt pages to different browsers and to apply workarounds for their implementation bugs.

At the beginning, to control the situation and have elements in a more or less fixed position, HTML tables were used, which have the ability to adapt to windows and screens of different sizes; but many authors began to make web pages with fixed sizes and a standardized layout. Usually there was a top header, often with a banner image and a horizontal menu, then sidebars, a central area with the main content and a footer with general information. This is the typical newspaper-like layout used in the years 2000-2015, sketched below.
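
The skeleton of such a table-based layout could be something like the following sketch (the sizes and texts are invented):

<HTML>
<BODY>
<TABLE WIDTH="780" BORDER="0" CELLSPACING="0">
   <TR>
      <TD COLSPAN="2" ALIGN="center"> Top header: banner image and horizontal menu </TD>
   </TR>
   <TR>
      <TD WIDTH="160" VALIGN="top"> Sidebar: links and menus </TD>
      <TD VALIGN="top"> Central area with the main content </TD>
   </TR>
   <TR>
      <TD COLSPAN="2" ALIGN="center"> Footer with general information </TD>
   </TR>
</TABLE>
</BODY>
</HTML>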

As network speed increased, more and more images, commercials and videos began to be transmitted through the web; compressed formats were used, such as QuickTime (Apple) or Flash (Macromedia, later Adobe). Around 2005 real video distribution services began to be used, hosting millions of short amateur videos, such as YouTube (2005).

For some years, for my simple web pages, I continued to use HTML 3.2, with a very limited use of CSS and Javascript. It was only around 2001 that I really studied Javascript; it seemed to me that Javascript could be the path to a new way of writing internet applications, and I tested it by writing a multi-page, multi-language site, in which a flag could be used to change the language of all the pages, internally using an array of pointers to the opened pages. I failed: this worked only on some browsers, and all the page pointers were lost when pages were resized.

Javascript remained of limited use for many years, mainly used for validating forms on the client side and for some graphical effects (and to patch browser problems).
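
A typical form test of that era could be sketched as follows (the form action and field names are invented): a Javascript function refuses to submit the form when a field is empty, saving a useless round trip to the server:

<HTML>
<HEAD>
   <TITLE> form test </TITLE>
   <SCRIPT TYPE="text/javascript">
       // refuse to submit the form if the name field is empty
       function checkForm(form) {
           if (form.username.value == "") {
               alert("Please fill in your name") ;
               return false ;      // returning false cancels the submission
           }
           return true ;
       }
   </SCRIPT>
</HEAD>
<BODY>
   <FORM ACTION="/cgi-bin/register" onSubmit="return checkForm(this);">
      Name: <INPUT TYPE="text" NAME="username">
      <INPUT TYPE="submit" VALUE="Send">
   </FORM>
</BODY>
</HTML>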

When web server resources were limited, the trend was to move the computational load onto the clients, executing Javascript procedures to validate forms and adapt the page. When servers became bigger and faster, the trend changed and more work was done on the server.

Technologies allowing the server to create pages on the fly, when requested, were present even in the first web servers (SSI includes, CGI programs etc.). Netscape had, since the beginning, its server-side Javascript (sold as a commercial product); but when computers became faster, server-side languages boomed and came into common use. The most important languages for the web were Java and PHP.

The Java language was developed by Sun between 1991 and 1995, and was pushed, with a great advertising effort, as a way to solve the interoperability problems of internet applications.

It is a C++-like language (but without pointers), compiled into an intermediate form (bytecode) on the server and distributed, via the network, to the clients, each of which implements a Java virtual machine that executes the bytecode. Little Java applications (applets) can run in the browser window.

In this scheme the bytecode acts as a "universal binary", which each platform can run with its own virtual machine. But in practice each virtual machine was a little different from the others, and the Sun slogan for Java programming, WORA ("Write Once, Run Anywhere"), became "Write Once, Debug Everywhere", to the joy of Java programmers.

Java is complemented by a very big library of classes, which can do almost everything. Java became widely used for the production of big commercial internet products. It had success even in Italian schools when, finally, the old professors, following the commercial trend, discovered something newer than Pascal.

A web page using an applet could be the following:

<HTML>
<HEAD>
<TITLE> applet test </TITLE>
</HEAD>
<BODY>
<H1> A very simple applet </H1>
<APPLET CODE="prova1" WIDTH=200 HEIGHT=200   >
    <PARAM NAME=x VALUE=100 >
    <PARAM NAME=y VALUE=100 >
</APPLET>
</BODY>
</HTML>

The page loads the applet and passes some parameters to it. The bytecode is in a file named "prova1.class", produced by compiling the file "prova1.java":

import java.awt.*     ;
import java.applet.*  ;

public class prova1 extends Applet {
    int x ;
    int y ;

    public void init() {
             // read the parameters passed by the <PARAM> tags of the page
             x = Integer.parseInt(getParameter("x")) ;
             y = Integer.parseInt(getParameter("y")) ;
             setBackground(Color.yellow) ;
             setForeground(Color.blue)  ;
    }
    public void start()   { }
    public void stop()    { }
    public void destroy() { }

    public void paint(Graphics g) {
        // draw the string at the position given by the parameters
        g.drawString(" string to be written", x, y) ;
    }
}

Here the programmer has to use a predefined class, "Applet", which has to be extended by implementing some predefined routines, such as init, start, stop and destroy.

I find Java very over-structured, with a very verbose syntax and a mess of classes to be extended, each with its own peculiar interface which has to be known; most of the job is done by the supplied libraries, but the programmer is bound to a predefined (and complex) structure.

Easier technologies arose in the following years, but Java is still widely used to build internet applications, even for simple applications for which the complexity of Java is a useless burden. But people know Java, having learned it at university, and they use Java everywhere.

An easier server-side language was PHP. This language was born as a set of tricks to assemble web pages on the server, but it had great success and evolved into a complete programming language. It was often criticized for some inconsistencies and bugs, due to the way it was developed at the beginning, without a vision of the whole project.

PHP code, identified by specific marks, is embedded in web pages; the PHP interpreter, running on the server when the page is requested, acts as a filter: it executes the code fragments embedded in the page, which produce HTML code. The PHP syntax is simple; PHP is an easy and immediate way to manage pages on the server.

A simple example of PHP usage is the following:

<?php
   // ....... database connection
   $mysqli = new mysqli("hostname","username","password","database") ;

   // ........ the database query (the table name "users" is just an example)
   $querystring='SELECT name,surname,email FROM users' ;
   $resp=$mysqli->query($querystring);

   // ... testing  if the query succeeded
   if ( ! $resp )
   {
    echo "<!DOCTYPE HTML>" ;
    echo "<html><head></head><body>" ;
    echo "Query error: (" . $mysqli->errno . ") " . $mysqli->error ;
    echo "</body></html>" ;
    exit() ;
   }
?>

<!doctype html>
<html>
<head>
   <title> database query test  </title>
</head>
<body>
<h3> User table </h3>
<table>
<tr>
  <td> Name    </td>
  <td> Surname </td>
  <td> Email  </td>
</tr>

<?php    // again a fragment of PHP code

     // ............ looping on the found entries and filling the table
     while ( $row = $resp->fetch_object() )
     {

       echo "<tr> \n";
       echo "  <td> $row->name     </td>";
       echo "  <td> $row->surname  </td>";
       echo "  <td> $row->email     </td>";
       echo "</tr> \n";
     };

     // ........... closing the database connection
     $mysqli->close() ;
?>

</table>
</body>
</html>

We see that all the PHP code is placed between the marks "<?php" and "?>". At the beginning we have PHP statements: a connection to a database and a query returning some items. If the query fails, a simple web page with an error message is produced. Otherwise we go on with HTML statements defining a web page with a table, and producing the table header. Finally, a fragment of PHP code, with a loop, produces the rows of the table, with the data from the database.

The first version of PHP was released in 1995 and had immediate success: around 2010 about 70-80% of the web sites with a server-side language were using PHP, in spite of all the criticism from purists who judged PHP a badly-conceived language.

I made some tests with PHP and Java, but I had no web applications to write, so I gained only limited experience in Java and PHP programming.



Notes

1
Each computer on the network has a unique identifier: the IP number. But the original IP protocol (IPv4) allows for only about 4 billion IP numbers, and they have all been allocated since 2015. A newer protocol, IPv6, allows for many more numbers, but providers didn't adopt the new protocol; they used tricks to go on with the IP numbers they had.

The most used trick is massive NAT (also known as carrier-grade NAT): many providers give a temporary, local-only IP number to connected customers, and when a customer needs to access the global network (for downloading files or browsing the net) they lend him a global IP number. The same global IP is used for many, many customers. This is a one-way route to the network: customers can access the net, but can't have visible services on their computers, because their IP number is "local" to the provider network. Most customers don't notice the difference, and don't complain.
2
Usenet newsgroups were a way to distribute messages through a network of dedicated servers; they were the main way to distribute information and to chat on the net in the old times. They are still alive in 2019.

3
ftp is a protocol for file transfer; introduced in the seventies, it is still used today. For details see the ftp page on Wikipedia.
This text is released under the "Creative Commons" license.