A UniversalProgrammingLanguage (my definition) is one which provides excellent support (and is an appropriate choice) for all types of computer programs: systems programming (device drivers, OS kernels, VirtualMachines), numeric computation, data processing, end-user application code, business logic, RealTime and secure programming, and so forth. The KeyLanguageFeatures page goes into a great deal of depth here.
"Excellent", for purposes of this discussion, means that "the language is a highly suitable choice for the application area, one which would merit consideration for production use in a project with limited schedule and budget". TuringCompleteness doesn't imply excellence; nor does the existence of a single prototype somewhere. Actual demonstrated experience in an application area does qualify, as does demonstrable similarity to another language with such experience. (I say this to make sure I don't exclude all but the MainstreamLanguages - however, I do intend to exclude claims like "some guy doing his thesis at MIT implemented an OS kernel in Java; hence Java is a systems programming language" from the discussion. The preceding example is fictitious.)
Unfortunately, many of the KeyLanguageFeatures needed for some applications are inappropriate and counterproductive for others. For example, the low-level bit-frobbing features needed for systems programming can easily be used to undermine the high-level abstractions that languages at the other end of the spectrum provide. High-level application programmers generally prefer languages without these features, but those languages then become unsuitable for the low-level stuff.
At this point, it is claimed that there is no UniversalProgrammingLanguage. Many claim that there can never be one.
The current approach, widely used, is to AlternateHardAndSoftLayers - have a high-level language take care of the high-level stuff, and have a systems language like CeeLanguage do the bit-twiddling. The calling interface between the two languages generally exposes a subset of the called language (typically the low-level one; calls in the other direction - having C code call Java code, for example - are often a questionable practice). In other words, the boundary between the two domains is a firewall between the high-level and low-level stuff; just because a Java app can call C functions doesn't give the Java code access to PointerArithmetic. (Of course, one could write a C library designed to export C features to Java and then call that, but that's cheating.)
Some languages do strive for universality. CeePlusPlus is a higher-level extension of C, and it has quite a few high-level features (both OO and functional). However, it still has the C programming model, with pointers and such - and while the StandardTemplateLibrary allows most instances of PointerArithmetic to be excised from application code, the capability is still there. You still have to use pointers (or references) in any non-trivial C++ program. Plus, one major area of concern not dealt with by C++ is memory management - the lack of GarbageCollection makes C++ a difficult choice for many applications.
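As a brief sketch of both points (the values are arbitrary): the first half stays entirely within STL idioms, with no raw pointer arithmetic in sight; the second shows the deterministic, GC-free ownership C++ offers instead, which the programmer must still state explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <numeric>
#include <vector>

int main() {
    // The STL lets application code avoid raw PointerArithmetic...
    std::vector<int> v{3, 1, 4, 1, 5};
    std::sort(v.begin(), v.end());
    int sum = std::accumulate(v.begin(), v.end(), 0);
    assert(sum == 14);

    // ...and smart pointers cover much of what GarbageCollection provides,
    // but ownership is a decision the programmer makes, not the runtime.
    auto p = std::make_unique<int>(42);
    assert(*p == 42);
}   // p freed here, deterministically: no GC, but no leak either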
Another approach is to have safe/unsafe subsets. ModulaThree, I believe, does this - certain features are declared unsafe, and can only be used from modules/functions also declared unsafe. This allows an enterprise to restrict use of the unsafe features - a rule such as "application programmers shall use the safe subset only" can be more readily enforced, and accidental use of unsafe features is caught by the compiler. ModulaThree is a rather old language, however, and is missing quite a few useful things. (Maybe it's time for ModulaFour?) CsharpLanguage likewise provides a mechanism for performing unsafe low-level pointer manipulation, confined to code explicitly marked unsafe.
A third idea, often discussed in the literature, is to have a family of languages, perhaps with a common subset, or with one language a superset of the other(s), each designed for specific application areas. The languages in the family would share things like syntax and style, and a great many features would have the same semantics. The interface(s) between different subsystems would be implemented with the "safe" subset. The toolsets for the languages in the family could be integrated, providing a seamless development environment while allowing different areas of concern to have only what they need. Probably the best example of this in widespread use is C (as the low-level language) paired with your choice of CeePlusPlus or ObjectiveCee as the high-level language (C++ and ObjC cannot be intermixed, but both are mostly supersets of C). CeePlusPlus does leave much to be desired; ObjectiveCee has other issues. Other languages with similar syntax (JavaLanguage, for instance) provide ways to call C, but their semantics are sufficiently different that it would be a great stretch to call them members of the C "family".
Hypothesis: At some point, a 'UniversalProgrammingLanguage' will likely be a high-level language that gets so high-level it overflows and becomes low-level again. Well... that's an interesting way to say it, but what I really mean is a basis in DeclarativeMetaprogramming - programs will be written in terms of the goals they are to achieve (e.g. in terms of preconditions, postconditions, invariants, and various heuristics for what counts as better vs. worse), in terms of potential strategies for achieving specific goals, and possibly in meta-strategies for combining strategies. Low-level invariants could include such features as finite bounds on time and space.
- Strategies themselves could be simple sketches (e.g. something similar to pseudo-code), organized in terms of which goals to achieve and under which conditions to achieve them. Those goals may themselves be associated with varying strategies. Via this mechanism, the language composes.
- the compiler will gather up all the requirements, priorities, and information about the platform. It will examine, select, compose, tweak, develop, and optimize strategies while attempting to glue them together. That is, the compiler is not restricted to the initial strategies provided by the programmer, and may have a huge database of strategies. The compiler is free to dump all the candidate strategies into a genetic fitness system, query various expert systems - including the programmer - for additional help, keep a database of previously failed and successful approaches to compiling the same or similar code, and even 'learn' over time (see ProofOptimizer).
- if the compiler can't figure out how to compile the program, it explains the problems (e.g. contradictory requirements, insufficient information, need for a more thoroughly sketched out strategy, etc.)
- once the program reaches a compiling state, the compiler will have produced a program that provably supports all specified requirements, shaped by developer policies. This includes the low-level bit-banging necessary for integrating with the platform and its hardware.
This is essentially late binding of implementation, or as late as possible while still keeping the strategy selection static. Even later binding of implementation strategy is possible if you delay until runtime. AbstractFactory is one such technique I have applied to good effect.
This was essentially the goal of the fifth-generation language projects of the 80s and early 90s, particularly in Japan. Though general-purpose intent-oriented or requirements-oriented programming is probably inevitable, the processes for translating general, arbitrary sets of goals and constraints into efficient and usable executable code are still largely unknown.
There is considerably more short-term promise in small, well-defined specialist domains. It is from experience with these that generality will grow.
In some respects, the closest semi-general examples of this concept in popular use are relational DBMSes, the development of which has taught us much about declarative language design (both good and bad), composition and optimisation based on sound theoretical (in the mathematical sense) frameworks, dynamic code generation, achieving durability and consistency, implementation of (relatively) seamless distributed systems, and so on.
I suspect it will eventually be achieved on a broad scale. But the 90s were before its time, a bit like the 70s were before the ActorsModel's time. I suspect a learning ProofOptimizer that can leverage millions of compiles - and bits and pieces of programmer advice - from distributed user systems, along with massive databases about the lowest-level details of CPUs and hardware, will be necessary to achieve DeclarativeMetaprogramming in a sense general enough to support time and space bounds and such. Essentially, one is building an ExpertSystems programmer. Look forward to this becoming mainstream circa 2035. Maybe.
But weaker or more specialized forms will exist before then. A number of projects, for example, have already built optimizing cross-compilers using this approach. It happens to be one of the simpler, less painful, and more extensible (easy to add new optimizations and architectures) ways to implement an optimizing cross-compiler. I plan on adapting it in the future for compiling my own language, though it won't help much with the bit-banging and time/space requirements.
I find the above claim that there can be no UniversalProgrammingLanguage amusing, because I can think of two.
First is CommonLisp, a language that starts high-level but can reach down as low as you would like to go, primarily with the help of macros. While I am not aware of any operating system that uses CommonLisp as its implementation language, there have certainly been operating systems written entirely in some form of LispLanguage, and it's hard to imagine it being unnatural to write an OS in CommonLisp - except that operating systems these days are so complicated that we might as well stick with what we've got.
Second is ForthLanguage, which starts out low-level but can quickly grow as high-level as you would like to take it. While you don't get garbage collection with this one, it can give you a REPL on devices with surprisingly small memory footprints - say, two kilobytes of RAM!
Because these languages transcend what we've come to expect from a computer language, I've come to call them "transcendental"; there may be others that fit the description, too. HaskellLanguage, in particular, uses monads to provide surprisingly flexible syntax, and can compile to fast code. SmallTalk *might* not qualify, because it's difficult to see how it can compile to fast code, but it is nonetheless a likely candidate. Unfortunately, I'm not familiar enough with PrologLanguage to know how it would compare on this scale. (Although HaskellLanguage is the language that gave me the idea of "transcendental" languages, I'm not yet familiar enough with it, or even these other languages, to know how transcendental they can be in practice.)
See also: GeneralPurposeProgrammingLanguage