You are currently browsing the category archive for the ‘GrammarWare’ category.

Round-trip engineering is a functionality of software development tools that provides generation of models from source code and generation of source code from models; this way, existing source code can be converted into a model, be subjected to software engineering methods and then be converted back.

Round trip is a particular model perspective in the UML standard, and it covers conceptual elements of the full execution use case that happens in an implemented system. This includes the network communication, in typical web-based applications.

Round-trip engineering vs. Symbolic Analysis

When rount-trip engineering creates typical UML models from code, Symbolic analysis instead creates symbolic atomistic models from code: from GrammarWare to ModelWare.

Some links:

Advertisements

In this story we start from the known definition for Expressive power to extend its scope to semantics programming languages, which have numerous ways to express the specific meanings (below 8 formalismis are listed) of tems of languages. We suggest that (near) all of these semantic principles can be combined in the atomistic semantics, which is a novel hybrid construction, a high-abstraction way to simulate source code.

What is Expressive power

Expressive power is in computer science a feature of a language, which refers to: 

  • what can be said in the language (at all)
  • how concisely it can be said.

Formal discussions mostly use the term in the first sense, leaving conciseness as a less important factor. This is the case in areas of mathematics that deal with the exact description of languages and their meaning, such as formal language theory, mathematical logic and process algebra.

In informal discussions, the term often refers to the second sense, or to both. This is often the case when discussing programming languages. Efforts have been made to formalize these informal uses of the term [2].

The notion of expressive power is always relative to a particular kind of thing that the language in question can describe, and the term is normally used when comparing languages that describe the same kind of things, or at least comparable kinds of things.

The design of languages and formalisms involves a trade-off between expressive power and analyzability. The more a formalism can express, the harder it becomes to understand what instances of the formalism say. Decision problems become harder to answer or completely undecidable.

Formal language theory mostly studies formalisms to describe sets of strings, such as context-free grammars and regular expressions. Each instance of a formalism, e.g. each grammar and each regular expression, describes a particular set of strings. In this context, the expressive power of a formalism is the set of sets of strings its instances describe, and comparing expressive power is a matter of comparing these sets.

An important yardstick for describing the relative expressive power of formalisms in this area is the Chomsky hierarchy.

Expressive power means one quality attribute of programming language semantics

There are some ways to describe expressive power of each language term (e.g. clause).

1) Axiomatic semantics

  • Axiomatic semantics define the meaning of a command in a program by describing its effect on assertions about the program state. The assertions are logical statements – predicates with variables, where the variables define the state of the program.

2) Algebraic semantics

  • In logic, algebraic semantics is a formal semantics based on algebras.

3) Operational semantics

4) Denotational semantics

  • Denotational semantics is concerned with finding mathematical objects called domains that represent what programs do. For example, programs (or program phrases) might be represented by partial functions, or by Actor event diagram scenarios, or by games between the environment and the system

5) Action semantics

  • Action semantics is an approach that tries to modularize denotational semantics, splitting the formalization process in two layers (macro and microsemantics) and predefining three semantic entities (actions, data and yielders) to simplify the specification

6) WP-semantics / Predicate transformer semantics

  • Predicate transformer semantics is an extension of Floyd-Hoare Logic invented by Dijkstra and extended and refined by other researchers. It was first introduced in Dijkstra’s paper “Guarded commands, nondeterminacy and formal derivation of programs”. It is a method for defining the semantics of an imperative programming language by assigning to each command in the language a corresponding predicate transformer. A predicate transformer is a total function mapping between two predicates on the state space of a program.
  • The canonical predicate transformer of sequential imperative programming is the so-called “weakest preconditionwp(S,R). Here S denotes a list of commands and R denotes a predicate on the space, called the “postcondition“. The result of applying this function gives the “weakest pre-condition” for S terminating with R true. An example is the following definition of the assignment statement:
 wp(x := E, R)\ =\ R_E^x

This gives a predicate that is a copy of R with the value of x replaced by E.

  • Refinement calculus is a formalized approach to stepwise refinement for program construction. The required behaviour of the final executable program is specified as an abstract and perhaps non-executable “program”, which is then refined by a series of correctness-preserving transformations into an efficiently executable program.

7) Hoare’s axiomatic semantics e.g. Hoare logic

  • Hoare logic (also known as Floyd–Hoare logic or Hoare rules) is a formal system with a set of logical rules for reasoning rigorously about the correctness of computer programs. It was proposed in 1969 by the British computer scientist and logician C. A. R. Hoare, and subsequently refined by Hoare and other researchers. The original ideas were seeded by the work of Robert Floyd, who had published a similar system for flowcharts.
  • The central feature of Hoare logic is the Hoare triple. A triple describes how the execution of a piece of code changes the state of the computation. A Hoare triple is of the form
\{P\}\;C\;\{Q\}
where P and Q are assertions and C is a command. P is called the precondition and Q the postcondition: if the precondition is met, the command establishes the postcondition. Assertions are formulas in predicate logic.

8) Atomistic semantics

Atomistic semantics is a hybrid construction, whose expressive power comes from Prolog, a first order predicate logic extended for logic programming. Its type system is rich in expressing any axiomatic feature. The other side of that semantics is object-oriented. That makes it possible to define algebraic equations, formulas or any dependendies using object handles and their static counterparts, tree-based Prolog references, as the base.

There is no research to compare atomistic semantics with denotational semantics, but there are many comparisons between Lisp and Prolog. They are rather compatible to express complicated models and functions, even Lisp is more generic in expressing meta-features. The latter one is not important in expressing features from source code (when ignoring macros and precompilers).

Operational semantics is the hardest one, having two levels.

Below our summary as a picture.

Expressive power of a formalism

If two formalisms have equal power, then it is possible to express same kind of methods (working practices from the user point-of-view) using tools of the corresponding formalisms. In program comprehension (PC) there are multiple needs for the user to do: to define hypotheses, to learn structures, to navigate, to understand behavior, to prove correctness and to simulate and evaluate specific expressions. Therefore, for so many different purposes of program comprehension, atomistic semantics is a remarkable suggestion.

Some links:

Lets P be a program written in Java, C++ etc. Symbolic conversion for P is a translation, presented in Symbolic-language. See more Java-to-Symbolic.

Static version for a symbolic graph for the program P can be shown as a chain of with the following rules:

  • Nodes are command-names for each symbol (see AHO-architecture).
  • Edges are links between symbols (which are link-facts in the AHO-structures).
  • There is a call edge A..B for each semantic connection, wheres symbol A calls symbol B into execution.
  • There is a returning edge from B to A, which holds information, when that information is defined in the grammar of the original language (Java etc). It returns status-information when there is control command active in the existing sequence. Otherwise, the edge only returns to the caller, without any information.

Visualizing_Programs_as_Abstracted_Predicates

The graph above is an exact code model for Java, but it can be fitted for any language, even for Prolog and Lisp. However, non-imperative language keeps their execution mechanism, which cannot be drawn as simple as in figure above.

Symbol is a carrier for information, the base for all languages.

For formal languages there are two purposes for symbols:

  1. Syntax: They are thought of as purely syntactic structures, composed into larger structures using a formal grammar,
  2. Semantics: There are associated with an interpretation or model (a formal semantics), that define it in terms of other formal symbols.

In the symbolic atomistic model there is an artefact AHO (Atomistic Hybrid Object), that implements all features of each symbol, captured from the language.

Hence, AHO contains the syntactical reference, it holds the semantics and contains the execution part, the simulation mechanism.

By means of extending the internal symbolic language, which defines the semantics of each symbol, it is possible to build higher order logic to create verification and testing models based on lower level symbols. This approach creates a concept of a megamodel, which has some essential features of abstract interpretation and symbolic model checking.

Some links:

From Wiki: A programming language is an artificial language designed to express computations that can be performed by a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine, to express algorithms precisely, or as a mode of human communication.

A programming language is a notation for writing programs, which are specifications of a computation or algorithm.[1] Some, but not all, authors restrict the term “programming language” to those languages that can express all possible algorithms.

All Turing complete languages can implement the same set of algorithms.

Some links:

  1. Language
  2. Programming paradigms (Wiki)

Definition:

A formal language L over an alphabet Σ is just a subset of Σ*, that is, a set of words over that alphabet.

Discussion

Formal languages are entirely syntactic in nature but may be given semantics that give meaning to the elements of the language. Formal languages are useful in logic because they have formulas that can be interpreted as expressing logical truths. An interpretation of a formal language is the assignment of meanings to its symbols and formulas.

Model theory is the theory of interpretations of formal languages. The study of interpretations is called formal semantics. Giving an interpretation is synonymous with constructing a model. A model of a formula of a formal language is an interpretation of the language for which the formula comes out to be true.

Some links:

  1. Formal semantics
  2. Interpretation (logic)
  3. Model theory

From Wiki:

A language is a system for encoding and decoding information. In its most common use, the term refers to so-called “natural languages” — the forms of communication considered peculiar to humankind.

Mathematics and computer science use artificial entities called formal languages (including programming languages and markup languages, and some that are more theoretical in nature). These often take the form of character strings, produced by a combination of formal grammar and semantics of arbitrary complexity.

A programming language is a formal language endowed with semantics that can be used to control the behavior of a machine, particularly a computer, to perform specific tasks. Programming languages are defined using syntactic and semantic rules, to determine structure and meaning respectively.

Programming languages are used to facilitate communication about the task of organizing and manipulating information, and to express algorithms precisely. Some authors restrict the term “programming language” to those languages that can express all possible algorithms; sometimes the term “computer language” is used for artificial languages that are more limited.

Links:

  1. Formal languages.
  2. Programming languages.

Translation is a process to transform one formal input language into an output language. Translation engine is a tool or software component to implement the translation. One of the best known translation engines is TXL (Dean, 2002), which is based on rule based translation. It needs three different kinds of information:

  1. The type system of the input language (XIn)
  2. The type system of the output language (Xout)
  3. The transformation rule base (Xin2Xout)

The drawback of the rule based translation is that it contains three different grammars that should be mastered, because the rule base contains a language of its own. In practice making translations is demanding by using TXL for each input-output language pair.

Hence, direct translation is a process to carry information from an input language to an output language term by term using semantic predicate conversions.

In it we use the symbolic programming language of the tool, here Visual Prolog, as third language to carry information.  The principle is illustrated in FIGURE.

The input term has a format in(A1, Ak) and the output out(B1, Bk). If terms match then the conversion is trivially done by splitting subterms into sub-subterms. If they are not directly compatible the Prolog’s notation is used to make the necessary modifications.

Direct_translation.

Translation is a process to transform one formal input language into an output language. Translation engine is a tool or software component to implement the translation. One of the best known translation engines is TXL (Dean, 2002), which is based on rule based translation. It needs three different kinds of information:

The type system of the input language (XIn)

The type system of the output language (Xout)

The transformation rule base (Xin2Xout)

The drawback of the rule based translation is that it contains three different grammars that should be mastered, because the rule base contains a language of its own. In practice making translations is demanding by using TXL for each input-output language pair. A principle different to that of rule based translation is direct translation, which is presented in Definition 14. In this principle the symbolic programming language of the tool, here Visual Prolog, is used as third language. The principle is illustrated in FIGURE 9.

Some conversions are shown below.

TABLE             Conversion table between Java and Symbolic.

Some sample conversions Java Symbolic
Class class def(classDef(…))
Attribute/Field field def(attrDef(…))
Method method def(methodDef(…))
Statement statement depends on the Statement
Expression expression opClause
Reference reference refClause
Assignment statement setClause
Method invocation expression getClause
Class creator creator expression creatorClause

Symbolic is a symbolic code description language for abstracting Java and other programming languages. The definition for Symbolic is a tuple (Clause, Name, Type).

1 interface symbolic
2 domains
3      Clause definition:        clause = …     ..
4      Symbolic name:            symbolicName = ..
5      Symbolic type:            symbolicType = ..
6 end interface symbolic

The whole semantics of Symbolic structures is described by using the term clause.

Clause is the foundation of symbolic processing, because all tokens can be parsed and written as clauses.

In the following list the subtypes of clause and their meanings are listed:

clause =
1      Definitions:             def(defClause);
2      Creating commands:       creator(createClause);
3      References:              ref(refClause);
4      Method calls:            get(getClause);
5      Change clauses:          set(setClause);
6      Conditional clauses:     path(pathClause);
7      Loops:                   loop(loopClause);
8      Operations:              op(opClause);
9      Constants:               val(valClause);
10      Other clauses:           other(otherClause);
11      Internal links:          at(SymbolicElement, clause*);
12      Side effects:            seffect(sideEffectClause);
13      Meta information:        meta(metaClause);
14      Comments:                info(string)

Erkki Laitila, PhD (2008) computer engineer (1977)