Sample Code for PL/SQL

PL/SQL (Procedural Language/Structured Query Language) is Oracle Corporation's proprietary procedural extension to the SQL database language, used in the Oracle database. Some other SQL database management systems offer similar extensions to the SQL language. PL/SQL's syntax strongly resembles that of Ada, and just like Ada compilers of the 1980s the PL/SQL runtime system uses Diana as intermediate representation.
The key strength of PL/SQL is its tight integration with the Oracle database.

History

PL/SQL made its first appearance in Oracle Forms v3. A few years later, it was included in the Oracle Database server v7 (as database procedures, functions, packages, triggers and anonymous blocks) followed by Oracle Reports v2.
Functionality

PL/SQL supports the following: variables, conditions, arrays, and exceptions. Implementations from version 8 of Oracle Database onwards have included features associated with object-orientation.
The underlying SQL functions as a declarative language. Standard SQL—unlike some functional programming languages—does not require implementations to convert tail calls to jumps. The open standard SQL does not readily provide "first row" and "rest of table" accessors, and it cannot easily perform some constructs such as loops. PL/SQL, however, as a Turing-complete procedural language which fills in these gaps, allows Oracle database developers to interface with the underlying relational database in an imperative manner. SQL statements can make explicit in-line calls to PL/SQL functions, or can cause PL/SQL triggers to fire upon pre-defined Data Manipulation Language (DML) events.
PL/SQL stored program units (functions, procedures, packages, and triggers) which perform DML get compiled into an Oracle database: to this extent their SQL code can undergo syntax-checking. Programmers working in an Oracle database environment can construct PL/SQL blocks of such functionality to serve as procedures or functions, or they can write in-line segments of PL/SQL within SQL*Plus scripts.
While programmers can readily incorporate SQL DML statements into PL/SQL (as cursor definitions, for example, or using the SELECT ... INTO syntax), Data Definition Language (DDL) statements such as CREATE TABLE or DROP INDEX require the use of "Dynamic SQL". Earlier versions of Oracle Database required the use of a complex built-in DBMS_SQL package for Dynamic SQL, where the system needed to explicitly parse and execute an SQL statement. Later versions have included an EXECUTE IMMEDIATE syntax called "Native Dynamic SQL" which considerably simplifies matters. Any use of DDL in an Oracle database will result in an implicit commit. Programmers can also use Dynamic SQL to execute DML where they do not know the exact content of the statement in advance.
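As a rough illustration of Native Dynamic SQL, the sketch below creates, fills, queries and drops a scratch table from an anonymous block. The table name demo_notes and its columns are invented for this example, and the block assumes that the session is allowed to create tables in its own schema.

```plsql
DECLARE
    v_count NUMBER;
BEGIN
    -- DDL cannot appear directly in PL/SQL, so it is passed as a string.
    -- Each DDL statement also commits implicitly.
    EXECUTE IMMEDIATE 'CREATE TABLE demo_notes (id NUMBER, note VARCHAR2(50))';

    -- The table did not exist when this block was compiled, so DML against
    -- it has to be dynamic as well. The placeholders :id and :note are
    -- filled by position from the USING clause.
    EXECUTE IMMEDIATE 'INSERT INTO demo_notes (id, note) VALUES (:id, :note)'
        USING 1, 'first note';

    -- A single-row query returns its result through the INTO clause.
    EXECUTE IMMEDIATE 'SELECT COUNT(*) FROM demo_notes' INTO v_count;
    DBMS_OUTPUT.PUT_LINE('Rows: ' || v_count);

    EXECUTE IMMEDIATE 'DROP TABLE demo_notes';
END;
/
```

Static SQL referring to demo_notes would not compile in this block, because the table does not exist when the block is parsed; that is why every statement here goes through EXECUTE IMMEDIATE.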
PL/SQL offers several pre-defined packages for specific purposes. Such PL/SQL packages include:
DBMS_OUTPUT - for output operations to non-database destinations
DBMS_JOB - for running specific procedures/functions at a particular time (i.e. scheduling)
DBMS_XPLAN - for formatting "Explain Plan" output
DBMS_SESSION - provides access to SQL ALTER SESSION and SET ROLE statements, and other session information
DBMS_METADATA - for extracting metadata from the data dictionary (such as DDL statements)
UTL_FILE - for reading and writing files on disk
UTL_HTTP - for making requests to web servers from the database
UTL_SMTP - for sending mail from the database (via an SMTP server)
Oracle Corporation customarily adds more packages and/or extends package functionality with each successive release of Oracle Database.
Basic code structure

PL/SQL programs consist of procedures, functions, and anonymous blocks. Each of these is made up of the basic PL/SQL unit which is the block. Blocks take the general form:
DECLARE
    -- Declaration block (optional)
BEGIN
    -- Program proper
EXCEPTION
    -- Exception-handling (optional)
END;
/* Sample comment spanning
   multiple lines... */
Note that blocks can be nested within blocks.
The DECLARE section specifies the datatypes of variables, constants, collections, and user-defined types.
The block between BEGIN and END specifies executable procedural code.
Exceptions, errors which arise during the execution of the code, have one of two types:
pre-defined exceptions
user-defined exceptions.
Programmers have to raise user-defined exceptions explicitly. They can do this by using the RAISE command, with the syntax:
RAISE exception_name;
Oracle Corporation has pre-defined several exceptions like NO_DATA_FOUND, TOO_MANY_ROWS, etc. Each exception has a SQL Error Number and SQL Error Message associated with it. Programmers can access these by using the SQLCODE and SQLERRM functions.
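The following sketch shows one way a user-defined exception might be declared, raised and handled. The function name describe_withdrawal, its parameters and the exception name insufficient_funds are hypothetical and chosen only for illustration.

```plsql
CREATE OR REPLACE FUNCTION describe_withdrawal (
    p_balance IN NUMBER,
    p_amount  IN NUMBER
) RETURN VARCHAR2
IS
    insufficient_funds EXCEPTION;  -- user-defined exception
BEGIN
    IF p_amount > p_balance THEN
        RAISE insufficient_funds;  -- must be raised explicitly
    END IF;
    RETURN 'OK: new balance ' || TO_CHAR(p_balance - p_amount);
EXCEPTION
    WHEN insufficient_funds THEN
        -- For a user-defined exception with no error number assigned,
        -- SQLCODE returns 1.
        RETURN 'Refused (SQLCODE ' || SQLCODE || ')';
    WHEN OTHERS THEN
        -- Pre-defined and other errors carry their own number and message.
        RETURN 'Error ' || SQLCODE || ': ' || SQLERRM;
END describe_withdrawal;
/
```

Calling describe_withdrawal(100, 30) takes the normal path, while describe_withdrawal(10, 30) ends in the insufficient_funds handler.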
The DECLARE section defines and (optionally) initialises variables. If not initialised specifically they default to NULL.
For example:
DECLARE
number1 NUMBER(2);
number2 NUMBER(2) := 17;
text1 VARCHAR2(12) := 'Hello world';
text2 DATE := SYSDATE; -- current date and time
BEGIN
SELECT street_number
INTO number1
FROM address
WHERE name = 'Smith';
END;
The symbol := functions as an assignment operator to store a value in a variable.
The major datatypes in PL/SQL include NUMBER, INTEGER, CHAR, VARCHAR2, DATE, TIMESTAMP, etc.
Functions
Functions in PL/SQL are a collection of SQL and PL/SQL statements that perform a task and must return a value to the calling environment.
CREATE OR REPLACE FUNCTION function_name [(parameter declarations)]
RETURN return_type
IS | AS
    -- variable declarations
    -- constant declarations
BEGIN
    -- PL/SQL block; must execute a RETURN statement
EXCEPTION
    -- exception-handling block
END;
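A concrete instance of this template might look like the sketch below. The function net_price and its parameters are made up for this example; it simply applies a percentage discount and returns NULL when the discount is out of range.

```plsql
CREATE OR REPLACE FUNCTION net_price (
    p_gross    IN NUMBER,
    p_discount IN NUMBER DEFAULT 0   -- percentage between 0 and 100
) RETURN NUMBER
IS
    c_max_discount CONSTANT NUMBER := 100;  -- constant declaration
    v_result       NUMBER;                  -- variable declaration
BEGIN
    IF p_discount < 0 OR p_discount > c_max_discount THEN
        RAISE VALUE_ERROR;  -- pre-defined exception, raised explicitly here
    END IF;
    v_result := ROUND(p_gross * (1 - p_discount / 100), 2);
    RETURN v_result;
EXCEPTION
    WHEN VALUE_ERROR THEN
        RETURN NULL;
END net_price;
/
```

Once compiled, the function can be called from PL/SQL or in-line from SQL, for example SELECT net_price(200, 25) FROM dual;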
Procedures
To Be Decided
Anonymous Blocks
Anonymous PL/SQL blocks can be embedded in an Oracle Precompiler or OCI program. At run time, the program, lacking a local PL/SQL engine, sends these blocks to the Oracle server, where they are compiled and executed. Likewise, interactive tools such as SQL*Plus and Enterprise Manager, lacking a local PL/SQL engine, must send anonymous blocks to Oracle.
Packages
To Be Decided
Numeric variables
variable_name number(P[,S]) := value;
To define a numeric variable, the programmer appends the variable type NUMBER to the name definition. To specify the (optional) precision(P) and the (optional) scale (S), one can further append these in round brackets, separated by a comma. ("Precision" in this context refers to the number of digits which the variable can hold, "scale" refers to the number of digits which can follow the decimal point.)
A selection of other datatypes for numeric variables would include:
binary_float, binary_double, dec, decimal, double precision, float, integer, int, numeric, real, smallint, binary_integer
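To make precision and scale more tangible, the sketch below assigns values to a NUMBER(5,2) variable, which can hold at most three digits before and two digits after the decimal point. The function name to_price is hypothetical.

```plsql
CREATE OR REPLACE FUNCTION to_price (p_value IN NUMBER) RETURN NUMBER
IS
    v_price NUMBER(5,2);  -- precision 5, scale 2: up to 999.99
BEGIN
    -- Extra decimal places are rounded to the declared scale;
    -- too many integer digits raise VALUE_ERROR.
    v_price := p_value;
    RETURN v_price;
EXCEPTION
    WHEN VALUE_ERROR THEN
        RETURN NULL;
END to_price;
/
```

With this definition, to_price(123.456) returns 123.46, whereas to_price(1234.5) returns NULL because the value does not fit the declared precision.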
Character variables
variable_name varchar2(L) := 'Text';
To define a character variable, the programmer normally appends the variable type VARCHAR2 to the name definition. There follows in brackets the maximum number of characters which the variable can store.
Other datatypes for character and binary data include:
varchar, char, long, raw, long raw, nchar, nvarchar2, clob, blob, bfile
Date variables
variable_name date := to_date('01-01-2005', 'dd-mm-yyyy');
Oracle provides a number of data types that can store dates and times (DATE, TIMESTAMP, TIMESTAMP WITH TIME ZONE, etc.); DATE is the most commonly used.
Programmers define date variables by appending the datatype code "DATE" to a variable name. The TO_DATE function can be used to convert strings to date values. The function converts the first quoted string into a date, using as a definition the second quoted string, for example:
TO_DATE('31-12-2004','dd-mm-yyyy')
or
TO_DATE ('31-Dec-2004','dd-mon-yyyy', 'NLS_DATE_LANGUAGE = American')
To convert dates to strings, one uses the function TO_CHAR(date_value, format_string).
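The two conversions can be combined, as in the following sketch, which parses a string in one format and prints it in another. The function name reformat_date is an assumption made for this example.

```plsql
CREATE OR REPLACE FUNCTION reformat_date (p_text IN VARCHAR2) RETURN VARCHAR2
IS
    v_date DATE;
BEGIN
    -- String to DATE, using an explicit format model
    v_date := TO_DATE(p_text, 'DD-MM-YYYY');
    -- DATE back to a string, in a different format
    RETURN TO_CHAR(v_date, 'YYYY-MM-DD');
END reformat_date;
/
```

For instance, reformat_date('31-12-2004') returns '2004-12-31'. Giving the format model explicitly avoids depending on the session's default date format.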
Datatypes for specific columns
Variable_name Table_name.Column_name%type;
This syntax defines a variable of the type of the referenced column on the referenced table.
Programmers specify user-defined datatypes with the syntax:
type data_type is record (field_1 type_1 :=xyz, field_2 type_2 :=xyz, ..., field_n type_n :=xyz);
For example:
DECLARE
TYPE t_address IS RECORD (
name address.name%TYPE,
street address.street%TYPE,
street_number address.street_number%TYPE,
postcode address.postcode%TYPE);
v_address t_address;
BEGIN
SELECT name, street, street_number, postcode INTO v_address FROM address WHERE ROWNUM = 1;
END;
This sample program defines its own datatype, called t_address, which contains the fields name, street, street_number and postcode.
Using this datatype the programmer has defined a variable called v_address and loaded it with data from the ADDRESS table.
Programmers can address individual attributes in such a structure by means of the dot-notation, thus: "v_address.street := 'High Street';"
Conditional Statements

The following code segment shows the IF-THEN-ELSIF construct. The ELSIF and ELSE parts are optional, so it is possible to create simpler IF-THEN or IF-THEN-ELSE constructs.
IF x = 1 THEN
sequence_of_statements_1;
ELSIF x = 2 THEN
sequence_of_statements_2;
ELSIF x = 3 THEN
sequence_of_statements_3;
ELSIF x = 4 THEN
sequence_of_statements_4;
ELSIF x = 5 THEN
sequence_of_statements_5;
ELSE
sequence_of_statements_N;
END IF;
The CASE statement simplifies some large IF-THEN-ELSE structures.
CASE
WHEN x = 1 THEN sequence_of_statements_1;
WHEN x = 2 THEN sequence_of_statements_2;
WHEN x = 3 THEN sequence_of_statements_3;
WHEN x = 4 THEN sequence_of_statements_4;
WHEN x = 5 THEN sequence_of_statements_5;
ELSE sequence_of_statements_N;
END CASE;
The CASE statement can also be used with a predefined selector:
CASE x
WHEN 1 THEN sequence_of_statements_1;
WHEN 2 THEN sequence_of_statements_2;
WHEN 3 THEN sequence_of_statements_3;
WHEN 4 THEN sequence_of_statements_4;
WHEN 5 THEN sequence_of_statements_5;
ELSE sequence_of_statements_N;
END CASE;
Array handling

PL/SQL refers to arrays as "collections". The language offers three types of collections:
Index-by tables (associative arrays)
Nested tables
Varrays (variable-size arrays)
Programmers must specify an upper limit for varrays, but need not for index-by tables or for nested tables. The language includes several collection methods used to manipulate collection elements: for example FIRST, LAST, NEXT, PRIOR, EXTEND, TRIM, DELETE, etc. Index-by tables can be used as associative arrays, for example to hold the memo (cache) table of a memoized recursive function such as Ackermann's function.
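As a small sketch of an index-by table used as a memo table, the function below caches intermediate results of a recursive computation. It uses the Fibonacci sequence rather than Ackermann's function to keep the example short; the names fib_memo, t_memo and v_memo are invented here.

```plsql
CREATE OR REPLACE FUNCTION fib_memo (p_n IN PLS_INTEGER) RETURN NUMBER
IS
    -- Index-by table (associative array) keyed by integer; no upper limit
    TYPE t_memo IS TABLE OF NUMBER INDEX BY PLS_INTEGER;
    v_memo t_memo;

    -- Nested function that can see v_memo from the enclosing scope
    FUNCTION fib (n IN PLS_INTEGER) RETURN NUMBER IS
    BEGIN
        IF n < 2 THEN
            RETURN n;
        END IF;
        -- EXISTS is a collection method: compute only on a cache miss
        IF NOT v_memo.EXISTS(n) THEN
            v_memo(n) := fib(n - 1) + fib(n - 2);
        END IF;
        RETURN v_memo(n);
    END fib;
BEGIN
    RETURN fib(p_n);
END fib_memo;
/
```

For example, fib_memo(10) returns 55. Elements are created simply by assigning to a new key, which is what distinguishes index-by tables from varrays and nested tables, where EXTEND has to be called first.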
Looping

As a procedural language by definition, PL/SQL provides several iteration constructs, including basic LOOP statements, WHILE loops, FOR loops, and Cursor FOR loops.
LOOP statements
Syntax:
LOOP
statement1;
statement2;
END LOOP;
Loops can be terminated by using the EXIT keyword, or by raising an exception.
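A basic loop with an explicit exit condition might look like the following sketch; the function name sum_to is hypothetical.

```plsql
CREATE OR REPLACE FUNCTION sum_to (p_limit IN PLS_INTEGER) RETURN NUMBER
IS
    v_i     PLS_INTEGER := 0;
    v_total NUMBER := 0;
BEGIN
    LOOP
        -- EXIT WHEN leaves the loop as soon as the condition is true
        EXIT WHEN v_i >= p_limit;
        v_i := v_i + 1;
        v_total := v_total + v_i;
    END LOOP;
    RETURN v_total;
END sum_to;
/
```

Here sum_to(5) returns 15 (1 + 2 + 3 + 4 + 5). Without the EXIT WHEN line the loop would never terminate.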
WHILE loops
Syntax:
WHILE condition LOOP
...do something...
END LOOP;
FOR loops
FOR loops, also called "numerical loops", operate a certain (counted) number of times.
FOR counter IN [REVERSE] lower_bound .. upper_bound LOOP
    -- statements
END LOOP;
The REVERSE keyword implements looping in reverse order.
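One detail that is easy to get wrong is that the bounds are still written from lower to upper when REVERSE is used. The sketch below illustrates this; the function name countdown is invented for the example.

```plsql
CREATE OR REPLACE FUNCTION countdown (p_from IN PLS_INTEGER) RETURN VARCHAR2
IS
    v_text VARCHAR2(4000);
BEGIN
    -- Bounds stay in ascending order; REVERSE only changes the direction.
    -- The loop counter i is declared implicitly and exists only in the loop.
    FOR i IN REVERSE 1 .. p_from LOOP
        v_text := v_text || i || ' ';
    END LOOP;
    RETURN RTRIM(v_text);
END countdown;
/
```

With this definition, countdown(3) returns '3 2 1'. Writing the range as 3 .. 1 would produce an empty range, so the loop body would never run.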
Cursor FOR loops
FOR RecordIndex IN (SELECT person_code FROM people_table)
LOOP
DBMS_OUTPUT.PUT_LINE(RecordIndex.person_code);
END LOOP;
Cursor FOR loops automatically open a cursor, read in their data and close the cursor again.
As an alternative, the PL/SQL programmer can define the cursor's SELECT statement in advance, for example to allow re-use or to make the code more understandable (especially useful in the case of long or complex queries).
DECLARE
CURSOR cursor_person IS
SELECT person_code FROM people_table;
BEGIN
FOR RecordIndex IN cursor_person
LOOP
DBMS_OUTPUT.PUT_LINE(RecordIndex.person_code);
END LOOP;
END;
Within the FOR loop, the person_code column of the current row is referenced using dot notation ("."):
RecordIndex.person_code
Example
DECLARE
    var NUMBER; /* this "var" is not in the same scope as the for loop "var";
                   a reference to "var" after the "end loop;" would find its
                   value to be null */
BEGIN
    /* N.B. for loop variables in PL/SQL are new declarations, with scope only inside the loop */
    FOR var IN 0 .. 10 LOOP
        DBMS_OUTPUT.put_line(var);
    END LOOP;
END;
Output:
0
1
2
3
4
5
6
7
8
9
10

Similar languages

PL/SQL functions analogously to the embedded procedural languages associated with other relational databases. Sybase ASE and Microsoft SQL Server have Transact-SQL, PostgreSQL has PL/pgSQL (which tries to emulate PL/SQL to an extent), and IBM DB2 includes SQL Procedural Language,[1] which conforms to the ISO SQL/PSM standard.
The designers of PL/SQL modelled its syntax on that of Ada. Both Ada and PL/SQL have Pascal as a common ancestor, and so PL/SQL also resembles Pascal in numerous aspects. The structure of a PL/SQL package closely resembles the basic Pascal program structure, or a Borland Delphi unit. Programmers can define global data-types, constants and static variables, public and private, in a PL/SQL package.
PL/SQL also allows for the definition of classes and for instantiating these as objects in PL/SQL code. This resembles usage in object-oriented programming languages like Object Pascal, C++ and Java. PL/SQL refers to a class as an "Abstract Data Type" (ADT), and defines it as an Oracle SQL data-type as opposed to a PL/SQL user-defined type, allowing its use in both the Oracle SQL engine and the Oracle PL/SQL engine. The constructor and methods of an Abstract Data Type are written in PL/SQL. The resulting Abstract Data Type can operate as an object class in PL/SQL. Such objects can also persist as column values in Oracle database tables.
PL/SQL does not resemble Transact-SQL, despite superficial similarities due to the use of both as embedded database languages. Porting code from one to the other usually involves non-trivial work, not only due to the differences in the feature sets of the two languages, but also due to the very significant differences in the way Oracle and SQL Server deal with concurrency and locking.

SQL

SQL is short for Structured Query Language and is a widely used database language, providing means of data manipulation (store, retrieve, update, delete) and database creation.

Almost all modern Relational Database Management Systems, such as MS SQL Server, Microsoft Access, MSDE, Oracle, DB2, Sybase, MySQL, PostgreSQL and Informix, use SQL as their standard database language. A word of warning: although all of these RDBMSs use SQL, they use different SQL dialects. For example, the MS SQL Server version of SQL is called T-SQL, Oracle's procedural extension of SQL is called PL/SQL, and the MS Access version of SQL is called Jet SQL.

Our SQL tutorial will teach you how to use commonly used SQL commands and you will be able to apply most of the knowledge gathered from this SQL tutorial to any of the databases above.

SQL Tutorial Table of Contents

SQL Tutorial
Learn what SQL (Structured Query Language) is, and where and how it is used.

SQL Table
SQL Database Tables are the foundation of every RDBMS (Relational Database Management System). Learn more about SQL tables here.

SQL SELECT
Learn how to use the SELECT SQL statement to retrieve data from a SQL database table.

SQL SELECT INTO
Learn how to use the SQL SELECT INTO statement to copy data between database tables.

SQL DISTINCT
Learn how to use the SQL DISTINCT clause together with the SQL SELECT keyword, to return a dataset with unique entries for a certain database table column.

SQL WHERE
The SQL WHERE clause is used to specify selection criteria, thus restricting the result of a SQL query.

SQL LIKE
The SQL LIKE clause is used along with the SQL WHERE clause and specifies criteria based on a string pattern.

SQL INSERT INTO
Learn how to use the SQL INSERT INTO clause to insert data into a SQL database table.

SQL UPDATE
Learn how to use the SQL UPDATE statement to update data in a SQL database table.

SQL DELETE
Learn how to use the SQL DELETE statement to delete data from a SQL database table.

SQL ORDER BY
Learn how to use the SQL ORDER BY statement to sort the data retrieved in your SQL query.

SQL OR & AND
Learn how to use the SQL OR & AND keywords together with the SQL WHERE clause to add several conditions to your SQL statement.

SQL IN
The SQL IN clause allows you to specify discrete values in your SQL WHERE search criteria.

SQL BETWEEN
The SQL BETWEEN & AND keywords define a range of data between 2 values.

SQL Aliases
SQL aliases can be used with database tables and/or with database table columns, depending on the task you are performing.

SQL COUNT
The SQL COUNT aggregate function is used to count the number of rows in a database table.

SQL MAX
The SQL MAX aggregate function allows us to select the highest (maximum) value for a certain column.

SQL MIN
The SQL MIN aggregate function allows us to select the lowest (minimum) value for a certain column.

SQL AVG
The SQL AVG aggregate function selects the average value for a certain table column.

SQL SUM
The SQL SUM aggregate function allows selecting the total for a numeric column.

SQL GROUP BY
The SQL GROUP BY statement is used along with the SQL aggregate functions like SUM to provide means of grouping the result dataset by certain database table column(s).

SQL HAVING
The SQL HAVING clause is used to restrict conditionally the output of a SQL statement, by a SQL aggregate function used in your SELECT list of columns.

SQL JOIN
The SQL JOIN clause is used whenever we have to select data from 2 or more tables.

C#

Paradigm: structured, imperative, object-oriented
Appeared in: 2001
Designed by: Microsoft Corporation
Latest release: 3.0 / 19 November 2007
Typing discipline: static, strong, both safe and unsafe, nominative
Major implementations: .NET Framework, Mono, DotGNU
Influenced by: Object Pascal, C++, Modula-3, Java, Eiffel
Influenced: F#, Nemerle, D, Java[1], Vala, Windows PowerShell
C# (pronounced C Sharp) is a multi-paradigm programming language that encompasses functional, imperative, generic and object-oriented (class-based) programming disciplines. It was developed by Microsoft as part of the .NET initiative and later approved as a standard by ECMA (ECMA-334) and ISO (ISO/IEC 23270). Anders Hejlsberg, the designer of Delphi, leads development of the C# language, which has an object-oriented syntax based on C++ and includes influences from several other programming languages (most notably Delphi and Java), with a particular emphasis on simplification.

History

In 1996, Sun Microsystems released the Java programming language, for which Microsoft purchased a license to implement Java in their operating system. Java was originally meant to be a platform independent language, but Microsoft, in their implementation, broke their license agreement and made a few changes that would essentially inhibit Java's platform-independent capabilities. Sun filed a lawsuit and Microsoft settled, deciding to create their own version of a partially compiled, partially interpreted object-oriented programming language with syntax closely related to that of C++.
During the development of .NET, the class libraries were originally written in a language/compiler called Simple Managed C (SMC).[2][3][4] In January 1999, Anders Hejlsberg formed a team to build a new language at the time called Cool, which stood for "C like Object Oriented Language".[5] Microsoft had considered keeping the name "Cool" as the final name of the language, but chose not to do so for trademark reasons. By the time the .NET project was publicly announced at the July 2000 Professional Developers Conference, the language had been renamed C#, and the class libraries and ASP.NET runtime had been ported to C#.
C#'s principal designer and lead architect at Microsoft is Anders Hejlsberg, who was previously involved with the design of Visual J++, Borland Delphi, and Turbo Pascal. In interviews and technical papers he has stated that flaws in most major programming languages (e.g. C++, Java, Delphi, and Smalltalk) drove the fundamentals of the Common Language Runtime (CLR), which, in turn, drove the design of the C# programming language itself. Some argue that C# shares roots in other languages.[6]
Features

Note: The following description is based on the language standard and other documents listed in the external links section.
By design, C# is the programming language that most directly reflects the underlying Common Language Infrastructure (CLI). Most of C#'s intrinsic types correspond to value-types implemented by the CLI framework. However, the C# language specification does not state the code generation requirements of the compiler: that is, it does not state that a C# compiler must target a Common Language Runtime (CLR), or generate Common Intermediate Language (CIL), or generate any other specific format. Theoretically, a C# compiler could generate machine code like traditional compilers of C++ or FORTRAN; in practice, all existing C# implementations target CIL.
C# differs from C and C++ in a number of ways, many of which it shares with Java, including:
There are no global variables or functions. All methods and members must be declared within classes. It is possible, however, to use static methods/variables within public classes instead of global variables/functions.
Local variables cannot shadow variables of the enclosing block, unlike C and C++. Variable shadowing is often considered confusing by C++ texts.
C# supports a strict boolean type, bool. Statements that take conditions, such as while and if, require an expression of a boolean type. While C++ also has a boolean type, it can be freely converted to and from integers, and expressions such as if(a) require only that a is convertible to bool, allowing a to be an int, or a pointer. C# disallows this "integer meaning true or false" approach on the grounds that forcing programmers to use expressions that return exactly bool can prevent certain types of programming mistakes such as if (a = b) (use of = instead of ==).
In C#, memory address pointers can only be used within blocks specifically marked as unsafe, and programs with unsafe code need appropriate permissions to run. Most object access is done through safe references, which cannot be made invalid. An unsafe pointer can point to an instance of a value-type, array, string, or a block of memory allocated on a stack. Code that is not marked as unsafe can still store and manipulate pointers through the System.IntPtr type, but cannot dereference them.
Managed memory cannot be explicitly freed, but is automatically garbage collected. Garbage collection addresses memory leaks. C# also provides direct support for deterministic finalization with the using statement (supporting the Resource Acquisition Is Initialization idiom).
Multiple inheritance is not supported, although a class can implement any number of interfaces. This was a design decision by the language's lead architect to avoid complication, avoid dependency hell and simplify architectural requirements throughout CLI.
C# is more typesafe than C++. The only implicit conversions by default are those which are considered safe, such as widening of integers and conversion from a derived type to a base type. This is enforced at compile-time, during JIT, and, in some cases, at runtime. There are no implicit conversions between booleans and integers and between enumeration members and integers (except 0, which can be implicitly converted to an enumerated type), and any user-defined conversion must be explicitly marked as explicit or implicit, unlike C++ copy constructors (which are implicit by default) and conversion operators (which are always implicit).
Enumeration members are placed in their own namespace.
Accessors called properties can be used to modify an object with syntax that resembles C++ member field access. In C++, declaring a member public enables both reading and writing to that member, and accessor methods must be used if more fine-grained control is needed. In C#, properties allow control over member access and data validation.
Full type reflection and discovery is available.
C# currently (as of 3 June 2008) has 77 reserved words.
Common Type System (CTS)

C# has a unified type system. This unified type system is called Common Type System (CTS).
A unified type system implies that all types, including primitives such as integers, are subclasses of the System.Object class. For example, every type inherits a ToString() method. For performance reasons, primitive types (and value types in general) are internally allocated on the stack.
Categories of datatypes
CTS separates datatypes into two categories:
Value Type
Reference Type
Value types are those in which the value itself is stored, by allocating memory on the stack, whereas reference types are those in which only the address of the location holding the value is stored. Value types include integers (short, long), floating-point numbers (float, double), decimal (a base 10 number), structures, enumerations, booleans and characters, while reference types include objects, strings, classes, interfaces and delegates.
User-defined datatypes
C# also allows the programmer to create user-defined value types, using the struct keyword. From the programmer's perspective, they can be seen as lightweight classes. Unlike regular classes, and like the standard primitives, such value types are allocated on the stack rather than on the heap. They can also be part of an object (either as a field or boxed), or stored in an array, without the memory indirection that normally exists for class types. Structs also come with a number of limitations. Because structs have no notion of a null value and can be used in arrays without initialization, they are implicitly initialized to default values (normally by filling the struct memory space with zeroes, but the programmer can specify explicit default values to override this). The programmer can define additional constructors with one or more arguments. This also means that structs lack a virtual method table, and because of that (and the fixed memory footprint), they cannot allow inheritance (but can implement interfaces).
Type casting in C#
Type casting is the process of converting a value belonging to a particular data type (or instance) to another.
Example:
using System;

class Employee { }

class ContractEmployee : Employee
{
}

class CastExample5
{
    public static void Main ()
    {
        Employee e = new Employee();
        Console.WriteLine("e = {0}",
            e == null ? "null" : e.ToString());

        ContractEmployee c = e as ContractEmployee;
        Console.WriteLine("c = {0}",
            c == null ? "null" : c.ToString());
    }
}
Here, the as operator attempts to cast the element e, which is an instance of the class Employee, to the class ContractEmployee and stores the result in element c. Because e is not actually a ContractEmployee, the cast fails and c is set to null; the as operator does not throw an exception.
Certain datatypes are incompatible for type casting. For example, an integer value cannot be cast to a string, nor a string to an integer; such conversions go through methods such as ToString() or int.Parse(). For class types related by inheritance, the compiler accepts an explicit cast, but an InvalidCastException is thrown at runtime if the object is not actually of the target type.
Boxing and unboxing
Boxing and unboxing are two new concepts introduced in C#.
Boxing is the method used to convert a value type into a reference type.
Example:
int foo = 42; // Value type...
object bar = foo; // foo is boxed to bar.
Unboxing is the method used to convert a reference type into a value type.
Example:
int foo = 42; // Value type.
object bar = foo; // foo is boxed to bar.
int foo2 = (int)bar; // Unboxed back to value type.
Boxing and unboxing become important when value types are put into a collection class or taken out of a collection class.
Features of C# 2.0

New features in C# for the .NET SDK 2.0 (corresponding to the 3rd edition of the ECMA-334 standard) are:
Partial class
Partial classes allow class implementation across more than one source file. This permits splitting up very large classes, and is also useful if some parts of a class are automatically generated.
file1.cs:
public partial class MyClass
{
public void MyMethod1()
{
// implementation
}
}
file2.cs:
public partial class MyClass
{
public void MyMethod2()
{
// implementation
}
}
Generics
Generics, or parameterized types, is a .NET 2.0 feature supported by C#. Unlike C++ templates, .NET parameterized types are instantiated at runtime rather than by the compiler; hence they can be cross-language whereas C++ templates cannot. They support some features not supported directly by C++ templates such as type constraints on generic parameters by use of interfaces. On the other hand, C# does not support non-type generic parameters. Unlike generics in Java, .NET generics use reification to make parameterized types first-class objects in the CLI Virtual Machine, which allows for optimizations and preservation of the type information.[7]
Static classes that cannot be instantiated
Static classes cannot be instantiated and allow only static members. This is similar to the concept of a module in many procedural languages.
A new form of iterator providing generator functionality
C# 2.0 adds a new form of iterator that provides generator functionality, using a yield return construct similar to yield in Python.
// Method that takes an iterable input (possibly an array)
// and returns all even numbers.
public static IEnumerable<int> GetEven(IEnumerable<int> numbers)
{
    foreach (int i in numbers)
    {
        if (i % 2 == 0) yield return i;
    }
}
Anonymous delegates
Anonymous delegates provide closure-like functionality.[8]
public void Foo(object parameter) {
// ...

ThreadPool.QueueUserWorkItem(delegate
{
// anonymous delegates have full access to local variables of the enclosing method
if (parameter == ...)
{
// ...
}

// ...
});
}
Covariance and contravariance for signatures of delegates
C# 2.0 supports covariance and contravariance for the signatures of delegates.[9]
The accessibility of property accessors can be set independently
Example:
string status = string.Empty;

public string Status
{
get { return status; } // anyone can get value of this property,
protected set { status = value; } // but only derived classes can change it
}
Nullable types
Nullable value types (denoted by a question mark, e.g. int? i = null;) add null to the set of allowed values for any value type. This provides improved interaction with SQL databases, which can have nullable columns of types corresponding to C# primitive types: an SQL INTEGER NULL column type directly translates to the C# int?.
Nullable types received an eleventh-hour improvement at the end of August 2005, mere weeks before the official launch, to improve their boxing characteristics: a nullable variable which is assigned null is not actually a null reference, but rather an instance of struct Nullable<T> with property HasValue equal to false. In the pre-release runtime, boxing such a variable boxed the Nullable<T> instance itself, and not the value stored in it, so the resulting reference was always non-null, even for null values. The following code illustrates the corrected flaw:
int? i = null;
object o = i;
if (o == null)
Console.WriteLine("Correct behaviour - runtime version from September 2005 or later");
else
Console.WriteLine("Incorrect behaviour - pre-release runtime (from before September 2005)");
When copied into objects, the official release boxes values from Nullable instances, so null values and null references are considered equal. The late nature of this fix caused some controversy[citation needed], since it required core-CLR changes affecting not only .NET2, but all dependent technologies (including C#, VB, SQL Server 2005 and Visual Studio 2005).
Coalesce operator
The coalesce operator (??) returns the first of its operands which is not null (or null, if no such operand exists):
object nullObj = null;
object obj = new Object();
return nullObj ?? obj; // returns obj
The primary use of this operator is to assign a nullable type to a non-nullable type with an easy syntax:
int? i = null;
int j = i ?? 0; // Unless i is null, initialize j to i. Else (if i is null), initialize j to 0.
Features of C# 3.0

C# 3.0 is the current version, and was released on 19 November 2007 as part of .NET Framework 3.5. It includes new features inspired by functional programming languages such as Haskell and ML, and is driven largely by the introduction of the Language Integrated Query (LINQ) pattern to the Common Language Runtime.[10]
LINQ
Language Integrated Query:[11] "from, where, select" context-sensitive keywords allowing queries across SQL, XML, collections, and more. These are treated as keywords in the LINQ context, but their addition won't break existing variables named from, where, or select.
Object initializers
Customer c = new Customer(); c.Name = "James";
can be written
Customer c = new Customer { Name="James" };
Collection initializers
MyList list = new MyList(); list.Add(1); list.Add(2);
can be written as
MyList list = new MyList { 1, 2 };
(assuming that MyList implements System.Collections.IEnumerable and has a public Add method[12])
Anonymous types
var x = new { Name = "James" };
Local variable type inference
var x = new Dictionary<string, List<float>>();
is interchangeable with
Dictionary<string, List<float>> x = new Dictionary<string, List<float>>();
More than just syntactic sugar, this feature is required for the declaration of anonymous typed variables. Also it simplifies refactoring.
Lambda expressions
listOfFoo.Where(delegate(Foo x) { return x.Size > 10; })
can be written
listOfFoo.Where(x => x.Size > 10);
The compiler infers whether to translate a lambda expression to a strongly-typed function delegate or to a strongly-typed expression tree.
Automatic properties
The compiler will automatically generate a private instance variable and the appropriate getter and setter given code such as: public string Name { get; private set; }
Extension methods
Extension methods (adding methods to classes by including the this keyword in the first parameter of a method on another static class):
public static class IntExtensions
{
public static void PrintPlusOne(this int x) { Console.WriteLine(x + 1); }
}

int foo = 0;
foo.PrintPlusOne();

ASP.NET

Microsoft ASP.NET is a free technology that allows programmers to create dynamic web applications. ASP.NET can be used to create anything from small, personal websites through to large, enterprise-class web applications. All you need to get started with ASP.NET is the free .NET Framework and the free Visual Web Developer.




Performance

ASP.NET aims for performance benefits over other script-based technologies (including Classic ASP) by compiling the server-side code to one or more DLL files on the web server.[1] This compilation happens automatically the first time a page is requested (which means the developer need not perform a separate compilation step for pages). This feature provides the ease of development offered by scripting languages with the performance benefits of a compiled binary. However, the compilation might cause a noticeable delay to the web user when the newly-edited page is first requested from the web server.
The ASPX and other resource files are placed in a virtual host on an Internet Information Services server (or other compatible ASP.NET servers; see Other Implementations, below). The first time a client requests a page, the .NET framework parses and compiles the file(s) into a .NET assembly and sends the response; subsequent requests are served from the DLL files. By default ASP.NET will compile the entire site in batches of 1000 files upon first request. If the compilation delay is causing problems, the batch size or the compilation strategy may be tweaked.
Developers can also choose to pre-compile their code before deployment, eliminating the need for just-in-time compilation in a production environment.
Extension

Microsoft has released some extension frameworks that plug into ASP.NET and extend its functionality. Some of them are:
ASP.NET AJAX
An extension with both client-side as well as server-side components for writing ASP.NET pages that incorporate AJAX functionality.
ASP.NET MVC Framework
An extension to author ASP.NET pages using the MVC architecture.
ASP.NET compared to ASP Classic

ASP.NET attempts to simplify developers' transition from Windows application development to web development by offering the ability to build pages composed of controls similar to a Windows user interface. A web control, such as a button or label, functions in very much the same way as its Windows counterpart: code can assign its properties and respond to its events. Controls know how to render themselves: whereas Windows controls draw themselves to the screen, web controls produce segments of HTML and JavaScript which form part of the resulting page sent to the end-user's browser.
ASP.NET encourages the programmer to develop applications using an event-driven GUI model, rather than in the style of conventional web-scripting environments like ASP and PHP. The framework attempts to combine existing technologies such as JavaScript with internal components like "ViewState" to bring persistent (inter-request) state to the inherently stateless web environment.
Other differences compared to ASP classic are:
Compiled code means applications run faster with more design-time errors trapped at the development stage.
Significantly improved run-time error handling, making use of exception handling using try-catch blocks.
Similar metaphors to Microsoft Windows applications such as controls and events.
An extensive set of controls and class libraries allows the rapid building of applications, and user-defined controls allow commonly used web templates, such as menus, to be built once and reused. Layout of these controls on a page is easier because most of it can be done visually in most editors.
ASP.NET leverages the multi-language capabilities of the .NET Common Language Runtime, allowing web pages to be coded in VB.NET, C#, J#, Delphi.NET, Chrome etc.
Ability to cache the whole page or just parts of it to improve performance.
Ability to use the code-behind development model to separate business logic from presentation.
If an ASP.NET application leaks memory, the ASP.NET runtime unloads the AppDomain hosting the erring application and reloads the application in a new AppDomain.
Session state in ASP.NET can be saved in a Microsoft SQL Server database or in a separate process running on the same machine as the web server or on a different machine. That way session values are not lost when the web server is reset or the ASP.NET worker process is recycled.
Versions of ASP.NET prior to 2.0 were criticized for their lack of standards compliance. The generated HTML and JavaScript sent to the client browser would not always validate against W3C/ECMA standards. In addition, the framework's browser detection feature sometimes incorrectly identified web browsers other than Microsoft's own Internet Explorer as "downlevel" and returned HTML/JavaScript to these clients with some of the features removed, or sometimes crippled or broken. However, in version 2.0, all controls generate valid HTML 4.0, XHTML 1.0 (the default) or XHTML 1.1 output, depending on the site configuration. Detection of standards-compliant web browsers is more robust and support for Cascading Style Sheets is more extensive.
Web Server Controls: these are controls introduced by ASP.NET for providing the UI for the web form. These controls are state managed controls and are WYSIWYG controls.
Criticism

On IIS 6.0 and lower, pages written using different versions of the ASP framework can't share Session State without the use of third-party libraries. This criticism does not apply to ASP.NET and ASP applications running side by side on IIS 7. With IIS 7, modules may be run in an integrated pipeline that allows modules written in any language to be executed for any request.[2][citation needed]
ASP.NET 2.0 Web Forms produces markup that passes W3C validation, but it is debatable as to whether this increases accessibility, one of the benefits of a semantic XHTML page + CSS representation. Several controls, such as the Login controls and the Wizard control, use HTML tables for layout by default. Microsoft has solved this problem by releasing the ASP.NET 2.0 CSS Control Adapters, a free add-on that produces compliant accessible XHTML+CSS markup.

Ajax

Enter JavaScript

Through the use of JavaScript, a reasonable amount of logic can be added to an HTML page in order to give timely feedback to user interactions. This has some major drawbacks, however. The first problem is that, as the JavaScript has been delivered to the browser along with the page, that logic has been opened up to interrogation. This might be fine for checking the format of an email address but would be no good for something like our serial number example, as the exposure of the method of verifying that input would compromise the integrity of the serial number mechanism.

The second problem with including any serious logic within the page is that the interface layer is simply not the place for serious logic. This belongs in the application layer, which is way back at the server. The problem is compounded by the fact that JavaScript cannot usually be relied upon to be available at the client. Whilst the majority of users are able and willing to run JavaScript in their browser, a considerable number prefer not to, or browse with a device where JavaScript is either unavailable or makes no sense. Therefore, any logic operations performed with JavaScript at the client must be verified at the server in case the operation never occurred.

The XMLHttpRequest Object

A solution to these problems presents itself in the form of the XMLHttpRequest object. This object, first implemented by Microsoft as an ActiveX object but now also available as a native object within both Mozilla and Apple's Safari browser, enables JavaScript to make HTTP requests to a remote server without the need to reload the page. In essence, HTTP requests can be made and responses received, completely in the background and without the user experiencing any visual interruptions.

This is a tremendous boon, as it takes the developer a long way towards achieving the goals of both a responsive user interface and keeping all the important logic in the application layer. By using JavaScript to ferry input back to the server in real time, the logic can be performed on the server and the response returned for near-instant feedback.

The Basics

Due to its history, and not yet being embodied in any public standard (although something similar is in the works for the proposed W3C DOM Level 3 Load and Save spec), there are two distinct methods for instantiating an XMLHttpRequest object. For Internet Explorer, an ActiveX object is used:

var req = new ActiveXObject("Microsoft.XMLHTTP");
For Mozilla and Safari, it's just a native object:

var req = new XMLHttpRequest();
Clearly, as a result of this inconsistency, it's necessary to fork your code based on support for the appropriate object. Whilst there are a number of methods for doing this (including inelegant browser hacks and conditional comment mechanisms), I believe it's best to simply test for support of either object. A good example of this can be found in Apple's developer documentation on the subject. Let's take their example:

var req;

function loadXMLDoc(url)
{
    // branch for native XMLHttpRequest object
    if (window.XMLHttpRequest) {
        req = new XMLHttpRequest();
        req.onreadystatechange = processReqChange;
        req.open("GET", url, true);
        req.send(null);
    // branch for IE/Windows ActiveX version
    } else if (window.ActiveXObject) {
        req = new ActiveXObject("Microsoft.XMLHTTP");
        if (req) {
            req.onreadystatechange = processReqChange;
            req.open("GET", url, true);
            req.send();
        }
    }
}
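The feature test above can also be isolated in a small factory. The following is only a sketch: the name createRequest and its env parameter are illustrative assumptions, not part of Apple's example. Passing the global object in as a parameter keeps the branching in one place and lets it be exercised outside a browser.

```javascript
// Hypothetical helper (not from the original example): returns a request
// object using whichever constructor the supplied environment supports,
// or null if it supports neither.
function createRequest(env) {
  if (env.XMLHttpRequest) {
    // native object (Mozilla, Safari)
    return new env.XMLHttpRequest();
  }
  if (env.ActiveXObject) {
    // IE/Windows ActiveX version
    return new env.ActiveXObject("Microsoft.XMLHTTP");
  }
  return null;
}
```

In a browser this would be called as createRequest(window), and a null result would signal that background requests are unavailable.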
A particularly important property to note is the onreadystatechange property. Note how it is assigned to a function processReqChange. This property is an event handler which is triggered whenever the state of the request changes. The states run from zero (uninitialized) through to four (complete). This is important because our script isn't going to wait for the response before continuing. The HTTP shenanigans are initiated, but then they carry on out of process whilst the rest of the script runs. Due to this, it's not as simple as having loadXMLDoc return the result of the request at the end of the function, because we don't know if we'll have a response by then or not. By having the function processReqChange check for the state changing, we can tell when the process has finished and carry on only if it has been successful.

With this in mind, a skeleton processReqChange function needs to check for two things. The first is the state changing to a value of 4, indicating the process complete. The second is to check the HTTP status code. You'll be familiar with common status codes like 404 (file not found) and 500 (internal server error), but the status code we're looking for is good old 200 (ok), which means everything went well. If we get both a state of 4 and an HTTP status code of 200, we can go right ahead and start processing the response. Optionally, of course, we can attempt to handle any errors at this point, if, for example, the HTTP status code was something other than 200.

function processReqChange()
{
    // only if req shows "complete"
    if (req.readyState == 4) {
        // only if "OK"
        if (req.status == 200) {
            // ...processing statements go here...
        } else {
            alert("There was a problem retrieving the XML data:\n" + req.statusText);
        }
    }
}
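The two checks can be read as a small classification of the request object. As a sketch only (requestOutcome is a hypothetical name, and the three labels are an assumption made for illustration), the logic might be separated from the handler like this:

```javascript
// Hypothetical helper: classifies a request object using the two checks
// described above. Returns "pending", "ok", or "error".
function requestOutcome(req) {
  if (req.readyState !== 4) {
    // states 0 to 3: the request has not completed yet,
    // so the status code is not inspected
    return "pending";
  }
  // complete: only HTTP 200 counts as success here
  return req.status === 200 ? "ok" : "error";
}
```

Keeping the status check behind the readyState check means the status code is only read once the request has completed.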
In Practice

I'm going to work up a practical example so we can get this going. Most web applications have some method of signing up users, and it's common to ask the user to pick a username to use for the site. Often, these need to be unique, and so a check is made against the database to see if any other user already has the username a new recruit is trying to sign up with. If you've ever signed up for a web mail account, you'll know how infuriating it is cycling around the process trying to find a username that isn't already taken. It would be really helpful if that check could be made without the user leaving the page.

The solution will involve four key elements: an XHTML form, a JavaScript function for handling the specifics of this case, our pair of generic functions (as above) for dealing with HTTP, and finally, a script on the server to search the database.

The Form

Here's the easy bit--a simple form field to collect the user's chosen username. An onblur event handler is used to fire the script. In order to display a friendly message to the user if the name is taken, I've embedded it in the form and hidden it with CSS. This should prove a little less violent than a standard JavaScript alert box.

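<input type="text" name="username" id="username"
    onblur="checkName(this.value,'')" />
<span class="hidden" id="nameCheckFailed">
    This name is in use, please try another.
</span>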

The CSS defines a class for hidden and also one for showing the error. Call that one error.

span.hidden{
display: none;
}

span.error{
display: inline;
color: black;
background-color: pink;
}
Handling the Input

The checkName function is used to handle the input from our form. Its job is to collect the input, decide which script on the server to present it to, invoke the HTTP functions to do the dirty work on its behalf, and then deal with the response. As such, this function has to operate in two modes. One mode receives input from the form, the other the response from the HTTP request. I'll explain the reason for this in the next section.

function checkName(input, response)
{
    if (response != ''){
        // Response mode
        var message = document.getElementById('nameCheckFailed');
        if (response == '1'){
            message.className = 'error';
        }else{
            message.className = 'hidden';
        }
    }else{
        // Input mode: escape the input so it is safe inside a query string
        var url =
            'http://localhost/xml/checkUserName.php?q=' + encodeURIComponent(input);
        loadXMLDoc(url);
    }
}
Our response is going to be easy to deal with--it'll be a string of either 1 or 0, with 1 indicating that the name is in use. Therefore, the function changes the class name of the error message so it gets displayed or hidden, depending. As you can see, the dirty work at the server is being done by a script called checkUserName.php.

HTTP Heavy Lifting

As we saw earlier, the HTTP work is being done by two functions, loadXMLDoc and processReqChange. The former can remain totally as-is for the purposes of this example, with the only modifications needed to the latter being a quick bit of DOM work.

You'll recall that by the time a successful response has been passed to processReqChange, we're no longer in a position to pass any sort of return value back up the chain. Because of this, it's going to be necessary to make an explicit function call to another bit of code in order to do anything useful with the response. This is why our checkName function has to run in two modes. Therefore, the main job of processReqChange is to parse the XML coming back from the server and pass the raw values back to checkName.

However, it is important that we keep these functions generic (we may have multiple items on the page that need to make use of XMLHttpRequest), and so hard-coding a reference to checkName at this point would be foolhardy. Instead, a better design is to have the server indicate the handling function as part of its response.

<?xml version="1.0" standalone="yes"?>
<response>
    <method>checkName</method>
    <result>1</result>
</response>

Parsing such a simple response should be no problem at all.

function processReqChange()
{
    // only if req shows "complete"
    if (req.readyState == 4) {
        // only if "OK"
        if (req.status == 200) {
            // root element of the XML document returned by the server
            var response = req.responseXML.documentElement;

            // name of the local function that should handle the result
            var method =
                response.getElementsByTagName('method')[0].firstChild.data;

            // the raw result value ('1' or '0' in this example)
            var result =
                response.getElementsByTagName('result')[0].firstChild.data;

            // call the named function in "response mode"
            eval(method + '(\'\', result)');
        } else {
            alert("There was a problem retrieving the XML data:\n" + req.statusText);
        }
    }
}
By using the responseXML property of the XMLHttpRequest object, we have a ready-made XML object we can traverse with the DOM. By grabbing content of the method element, we know which local function to execute along with the result. Once you've finished testing, it's probably a good idea to dump the else clause from the above code, enabling the function to fail silently.
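Because the name of the handling function arrives from the server, passing it through eval executes whatever string the response happens to contain. One possible alternative, sketched here under the assumption that every handler is registered in advance, is to look the name up in an explicit table. The handlers object and the dispatch function are illustrative names and are not part of the original example.

```javascript
// Hypothetical registry of the functions a server response may name.
var handlers = {};

// Calls handlers[method]('', result) if such a handler was registered.
// Returns true if a handler ran, false otherwise.
function dispatch(method, result) {
  if (Object.prototype.hasOwnProperty.call(handlers, method) &&
      typeof handlers[method] === "function") {
    handlers[method]("", result);
    return true;
  }
  return false;
}
```

With this in place, a page would register its handler once (for example handlers.checkName = checkName;) and processReqChange would call dispatch(method, result) where it currently calls eval. Names that were never registered are simply ignored.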

The Server Script

The final piece in our jigsaw is the script on the server to accept the request, process it, and return an XML document in response. For the purposes of our example, this script looks up usernames in a database table to determine whether a name is already in use. For brevity, my example PHP script below just checks against two hard-coded names, 'Drew' and 'Fred'.

<?php
header('Content-Type: text/xml');

function nameInUse($q)
{
    if (isset($q)){
        switch(strtolower($q))
        {
            case 'drew' :
                return '1';
            case 'fred' :
                return '1';
            default:
                return '0';
        }
    }else{
        return '0';
    }
}
?>
<?php echo '<?xml version="1.0" standalone="yes"?>'; ?>

<response>
    <method>checkName</method>
    <result><?php echo nameInUse(isset($_GET['q']) ? $_GET['q'] : null); ?></result>
</response>


Of course, the logic used to verify the availability of the username in this script can be reused after the form is submitted to recheck that the name is available. This is an important step, since if JavaScript was not available at the client, this check would not have yet taken place. Additionally, on a busy site, a username which checked out OK at the time the user was filling the form in may have been taken by the time the form is submitted.

Perhaps as a next step, if you're interested in playing with this some more, you could add the ability for the server to return a list of suggested alternative usernames if the suggested name is taken.
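As a starting point for that exercise, the suggestion logic itself might look something like the sketch below. It is written in JavaScript purely for illustration (the server script in this article is PHP, so the idea would need porting), and suggestAlternatives, its parameters, and the append-a-number strategy are all assumptions rather than anything from the original example.

```javascript
// Hypothetical sketch: propose up to `limit` unused variants of a name by
// appending a counter. `taken` is an array of names already in use;
// comparison is case-insensitive, as in the server script above.
function suggestAlternatives(name, taken, limit) {
  var inUse = Object.create(null);
  for (var i = 0; i < taken.length; i++) {
    inUse[taken[i].toLowerCase()] = true;
  }
  var suggestions = [];
  for (var n = 1; suggestions.length < limit; n++) {
    var candidate = name + n;
    if (!inUse[candidate.toLowerCase()]) {
      suggestions.push(candidate);
    }
  }
  return suggestions;
}
```

The list could then be returned to the page as additional elements in the XML response and shown alongside the error message.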

Extensible Stylesheet Language (XSL)

1 Introduction and Overview

This specification defines the Extensible Stylesheet Language (XSL). XSL is a language for expressing stylesheets. Given a class of arbitrarily structured XML [XML] or [XML 1.1] documents or data files, designers use an XSL stylesheet to express their intentions about how that structured content should be presented; that is, how the source content should be styled, laid out, and paginated onto some presentation medium, such as a window in a Web browser or a hand-held device, or a set of physical pages in a catalog, report, pamphlet, or book.

1.1 Processing a Stylesheet

An XSL stylesheet processor accepts a document or data in XML and an XSL stylesheet and produces the presentation of that XML source content that was intended by the designer of that stylesheet. There are two aspects of this presentation process: first, constructing a result tree from the XML source tree and second, interpreting the result tree to produce formatted results suitable for presentation on a display, on paper, in speech, or onto other media. The first aspect is called tree transformation and the second is called formatting. The process of formatting is performed by the formatter. This formatter may simply be a rendering engine inside a browser.

Tree transformation allows the structure of the result tree to be significantly different from the structure of the source tree. For example, one could add a table-of-contents as a filtered selection of an original source document, or one could rearrange source data into a sorted tabular presentation. In constructing the result tree, the tree transformation process also adds the information necessary to format that result tree.

Formatting is enabled by including formatting semantics in the result tree. Formatting semantics are expressed in terms of a catalog of classes of formatting objects. The nodes of the result tree are formatting objects. The classes of formatting objects denote typographic abstractions such as page, paragraph, table, and so forth. Finer control over the presentation of these abstractions is provided by a set of formatting properties, such as those controlling indents, word- and letter spacing, and widow, orphan, and hyphenation control. In XSL, the classes of formatting objects and formatting properties provide the vocabulary for expressing presentation intent.

The XSL processing model is intended to be conceptual only. An implementation is not mandated to provide these as separate processes. Furthermore, implementations are free to process the source document in any way that produces the same result as if it were processed using the conceptual XSL processing model. A diagram depicting the detailed conceptual model is shown below.

[Figure: XSL Two Processes: Transformation & Formatting]

1.1.1 Tree Transformations

Tree transformation constructs the result tree. In XSL, this tree is called the element and attribute tree, with objects primarily in the "formatting object" namespace. In this tree, a formatting object is represented as an XML element, with the properties represented by a set of XML attribute-value pairs. The content of the formatting object is the content of the XML element. Tree transformation is defined in the XSLT Recommendation [XSLT]. A diagram depicting this conceptual process is shown below.

[Figure: Transform to Another Vocabulary]

The XSL stylesheet is used in tree transformation. A stylesheet contains a set of tree construction rules. The tree construction rules have two parts: a pattern that is matched against elements in the source tree and a template that constructs a portion of the result tree. This allows a stylesheet to be applicable to a wide class of documents that have similar source tree structures.
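The pattern-plus-template idea can be illustrated with a toy model. The sketch below is not XSLT and is not taken from this specification: the node shape ({ name, children } objects and plain strings for text), the function names, and the choice to match on element name only are all simplifying assumptions.

```javascript
// Toy source tree: a node is { name: string, children: array } or a
// string (text). Each rule pairs a pattern (here just an element name)
// with a template that constructs a portion of the result tree.
function transform(node, rules) {
  if (typeof node === "string") {
    return node; // text is copied through unchanged
  }
  for (var i = 0; i < rules.length; i++) {
    if (rules[i].match === node.name) {
      // the template receives the node and a callback that applies
      // the same rule set to the node's children
      return rules[i].template(node, function () {
        return transformChildren(node, rules);
      });
    }
  }
  // no rule matched: process the children and drop the element itself
  return transformChildren(node, rules);
}

function transformChildren(node, rules) {
  var out = [];
  for (var i = 0; i < node.children.length; i++) {
    out = out.concat(transform(node.children[i], rules));
  }
  return out;
}
```

Because the rules match on patterns rather than on a fixed document, the same rule set can be applied to any source tree with a similar structure.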

In some implementations of XSL/XSLT, the result of tree construction can be output as an XML document. This would allow an XML document which contains formatting objects and formatting properties to be output. This capability is neither necessary for an XSL processor nor is it encouraged. There are, however, cases where this is important, such as a server preparing input for a known client; for example, the way that a WAP (http://www.wapforum.org/faqs/index.htm) server prepares specialized input for a WAP capable hand held device. To preserve accessibility, designers of Web systems should not develop architectures that require (or use) the transmission of documents containing formatting objects and properties unless either the transmitter knows that the client can accept formatting objects and properties or the transmitted document contains a reference to the source document(s) used in the construction of the document with the formatting objects and properties.

1.1.2 Formatting

Formatting interprets the result tree in its formatting object tree form to produce the presentation intended by the designer of the stylesheet from which the XML element and attribute tree in the "fo" namespace was constructed.

The vocabulary of formatting objects supported by XSL - the set of fo: element types - represents the set of typographic abstractions available to the designer. Semantically, each formatting object represents a specification for a part of the pagination, layout, and styling information that will be applied to the content of that formatting object as a result of formatting the whole result tree. Each formatting object class represents a particular kind of formatting behavior. For example, the block formatting object class represents the breaking of the content of a paragraph into lines. Other parts of the specification may come from other formatting objects; for example, the formatting of a paragraph (block formatting object) depends on both the specification of properties on the block formatting object and the specification of the layout structure into which the block is placed by the formatter.

The properties associated with an instance of a formatting object control the formatting of that object. Some of the properties, for example "color", directly specify the formatted result. Other properties, for example 'space-before', only constrain the set of possible formatted results without specifying any particular formatted result. The formatter may make choices among other possible considerations such as esthetics.

Formatting consists of the generation of a tree of geometric areas, called the area tree. The geometric areas are positioned on a sequence of one or more pages (a browser typically uses a single page). Each geometric area has a position on the page, a specification of what to display in that area and may have a background, padding, and borders. For example, formatting a single character generates an area sufficiently large enough to hold the glyph that is used to present the character visually and the glyph is what is displayed in this area. These areas may be nested. For example, the glyph may be positioned within a line, within a block, within a page.

Rendering takes the area tree, the abstract model of the presentation (in terms of pages and their collections of areas), and causes a presentation to appear on the relevant medium, such as a browser window on a computer display screen or sheets of paper. The semantics of rendering are not described in detail in this specification.

The first step in formatting is to "objectify" the element and attribute tree obtained via an XSLT transformation. Objectifying the tree basically consists of turning the elements in the tree into formatting object nodes and the attributes into property specifications. The result of this step is the formatting object tree.

[Figure: Build the XSL Formatting Object Tree]

As part of the step of objectifying, the characters that occur in the result tree are replaced by fo:character nodes. Characters in text nodes which consist solely of white space characters and which are children of elements whose corresponding formatting objects do not permit fo:character nodes as children are ignored. Other characters within elements whose corresponding formatting objects do not permit fo:character nodes as children are errors.

The content of the fo:instream-foreign-object is not objectified; instead the object representing the fo:instream-foreign-object element points to the appropriate node in the element and attribute tree. Similarly any non-XSL namespace child element of fo:declarations is not objectified; instead the object representing the fo:declarations element points to the appropriate node in the element and attribute tree.

The second phase in formatting is to refine the formatting object tree to produce the refined formatting object tree. The refinement process handles the mapping from properties to traits. This consists of: (1) shorthand expansion into individual properties, (2) mapping of corresponding properties, (3) determining computed values (may include expression evaluation), (4) handling white-space-treatment and linefeed-treatment property effects, and (5) inheritance. Details on refinement are found in 5 Property Refinement / Resolution.

The refinement step is depicted in the diagram below.

[D]

XML

Technical Introduction to XML 


 It is somewhat remarkable to think that this article, which appeared initially in the Winter 1997 edition of the World Wide Web Journal was out of date by the time the final XML Recommendation was approved in February. And even as this update brings the article back into line with the final spec, a new series of recommendations are under development. When finished, these will bring namespaces, linking, schemas, stylesheets, and more to the table.

This introduction to XML presents the Extensible Markup Language at a reasonably technical level for anyone interested in learning more about structured documents. In addition to covering the XML 1.0 Specification, this article outlines related XML specifications, which are evolving. The article is organized in four main sections plus an appendix.


What Do XML Documents Look Like?

If you are conversant with HTML or SGML, XML documents will look familiar. A simple XML document is presented in Example 1.

Example 1. A Simple XML Document

<?xml version="1.0"?>
<oldjoke>
<burns>Say <quote>goodnight</quote>, Gracie.</burns>
<allen><quote>Goodnight, Gracie.</quote></allen>
<applause/>
</oldjoke>

A few things may stand out to you:

  • The document begins with a processing instruction: <?xml ...?>. This is the XML declaration [Section 2.8]. While it is not required, its presence explicitly identifies the document as an XML document and indicates the version of XML to which it was authored.
  • There's no document type declaration. Unlike SGML, XML does not require a document type declaration. However, a document type declaration can be supplied, and some documents will require one in order to be understood unambiguously.
  • Empty elements (<applause/> in this example) have a modified syntax [Section 3.1]. While most elements in a document are wrappers around some content, empty elements are simply markers where something occurs (a horizontal rule for HTML's <hr> tag, for example, or a cross reference for DocBook's <xref> tag). The trailing /> in the modified syntax indicates to a program processing the XML document that the element is empty and no matching end-tag should be sought. Since XML documents do not require a document type declaration, without this clue it could be impossible for an XML parser to determine which tags were intentionally empty and which had been left empty by mistake.
    XML has softened the distinction between elements which are declared as EMPTY and elements which merely have no content. In XML, it is legal to use the empty-element tag syntax in either case. It's also legal to use a start-tag/end-tag pair for empty elements: <applause></applause>. If interoperability is of any concern, it's best to reserve empty-element tag syntax for elements which are declared as EMPTY and to only use the empty-element tag form for those elements.

XML documents are composed of markup and content. There are six kinds of markup that can occur in an XML document: elements, entity references, comments, processing instructions, marked sections, and document type declarations. The following sections introduce each of these markup concepts.

Elements

Elements are the most common form of markup. Delimited by angle brackets, most elements identify the nature of the content they surround. Some elements may be empty, as seen above, in which case they have no content. If an element is not empty, it begins with a start-tag, <element>, and ends with an end-tag, </element>.

Attributes

Attributes are name-value pairs that occur inside start-tags after the element name. For example,

    <div class="preface">

is a div element with the attribute class having the value preface. In XML, all attribute values must be quoted.
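The quoting rule can be sketched in C++. This is only an illustration under stated assumptions: the function names escape_attribute and start_tag are hypothetical, and the sketch always uses double quotes, escaping the characters that may not appear literally inside a double-quoted value.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: escape the characters that may not appear
// literally inside a double-quoted attribute value.
std::string escape_attribute(const std::string& value) {
    std::string out;
    for (char c : value) {
        switch (c) {
            case '&': out += "&amp;";  break;
            case '<': out += "&lt;";   break;
            case '"': out += "&quot;"; break;
            default:  out += c;        break;
        }
    }
    return out;
}

// Hypothetical helper: build a start-tag from an element name and a
// list of name-value pairs, quoting every attribute value.
std::string start_tag(
    const std::string& name,
    const std::vector<std::pair<std::string, std::string>>& attrs) {
    assert(!name.empty());  // an element must have a name
    std::string tag = "<" + name;
    for (const auto& attr : attrs) {
        tag += " " + attr.first + "=\"" + escape_attribute(attr.second) + "\"";
    }
    tag += ">";
    return tag;
}
```

Called with the name div and the single pair (class, preface), start_tag() produces the start-tag discussed above.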

Entity References

In order to introduce markup into a document, some characters have been reserved to identify the start of markup. The left angle bracket, < , for instance, identifies the beginning of an element start- or end-tag. In order to insert these characters into your document as content, there must be an alternative way to represent them. In XML, entities are used to represent these special characters. Entities are also used to refer to often repeated or varying text and to include the content of external files.

Every entity must have a unique name. Defining your own entity names is discussed in the section on entity declarations. In order to use an entity, you simply reference it by name. Entity references begin with the ampersand and end with a semicolon.

For example, the lt entity inserts a literal < into a document. So the string <element> can be represented in an XML document as &lt;element&gt;.
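A program that writes XML usually applies this substitution automatically. The following C++ sketch is one possible way to do it; the function name escape_xml is hypothetical, and the sketch covers the five entities that XML predefines (lt, gt, amp, quot, apos), of which only lt is discussed above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: replace each character that has a predefined
// XML entity with the corresponding entity reference.
std::string escape_xml(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '&':  out += "&amp;";  break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;        break;
        }
    }
    // Escaping only ever replaces one character with several.
    assert(out.size() >= text.size());
    return out;
}
```

The ampersand has to be escaped as well, because it introduces every entity reference.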

A special form of entity reference, called a character reference [Section 4.1], can be used to insert arbitrary Unicode characters into your document. This is a mechanism for inserting characters that cannot be typed directly on your keyboard.

Character references take one of two forms: decimal references, &#8478;, and hexadecimal references, &#x211E;. Both of these refer to character number U+211E from Unicode (which is the standard Rx prescription symbol, in case you were wondering).
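The two forms differ only in the number base used to write the code point. As a minimal sketch, with the hypothetical function names decimal_char_ref and hex_char_ref, both can be generated from the same number:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: decimal character reference for a code point.
std::string decimal_char_ref(std::uint32_t code_point) {
    assert(code_point <= 0x10FFFF);  // upper limit of the Unicode range
    return "&#" + std::to_string(code_point) + ";";
}

// Hypothetical helper: hexadecimal character reference for a code point.
std::string hex_char_ref(std::uint32_t code_point) {
    assert(code_point <= 0x10FFFF);
    char digits[16];
    std::snprintf(digits, sizeof digits, "%X",
                  static_cast<unsigned>(code_point));
    return std::string("&#x") + digits + ";";
}
```

For the code point 0x211E these return the decimal and hexadecimal references to U+211E.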

Comments

Comments begin with <!-- and end with -->. Comments can contain any data except the literal string --. You can place comments between markup anywhere in your document.

Comments are not part of the textual content of an XML document. An XML processor is not required to pass them along to an application.
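The restriction on comment content is easy to check before a comment is written out. The sketch below uses a hypothetical function name, is_valid_comment_text, and also rejects text ending in a hyphen, since that hyphen would run into the closing delimiter.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns true when `text` may be placed between
// the comment delimiters.
bool is_valid_comment_text(const std::string& text) {
    // The literal string "--" may not occur inside a comment.
    if (text.find("--") != std::string::npos) return false;
    // A trailing hyphen would combine with the closing "-->" delimiter.
    if (!text.empty() && text.back() == '-') return false;
    assert(text.find("--") == std::string::npos);
    return true;
}
```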

Processing Instructions

Processing instructions (PIs) are an escape hatch to provide information to an application. Like comments, they are not textually part of the XML document, but the XML processor is required to pass them to an application.

Processing instructions have the form: <?name pidata?>. The name, called the PI target, identifies the PI to the application. Applications should process only the targets they recognize and ignore all other PIs. Any data that follows the PI target is optional; it is for the application that recognizes the target. The names used in PIs may be declared as notations in order to formally identify them.

PI names beginning with xml are reserved for XML standardization.
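An application that generates its own processing instructions can guard against the reserved names with a simple test. This is a sketch only: the function name is_reserved_pi_target is hypothetical, and the comparison ignores case on the assumption that names such as XML and Xml are reserved in the same way.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: true when a PI target starts with "xml" in any
// mix of upper and lower case.
bool is_reserved_pi_target(const std::string& target) {
    assert(!target.empty());  // a PI must have a target name
    if (target.size() < 3) return false;
    auto lower = [](char c) {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    };
    return lower(target[0]) == 'x' &&
           lower(target[1]) == 'm' &&
           lower(target[2]) == 'l';
}
```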

CDATA Sections

In a document, a CDATA section instructs the parser to ignore most markup characters.

Consider a source code listing in an XML document. It might contain characters that the XML parser would ordinarily recognize as markup (< and &, for example). In order to prevent this, a CDATA section can be used.

    <![CDATA[
    *p = &q;
    b = (i <= 3);
    ]]>

Between the start of the section, <![CDATA[, and the end of the section, ]]>, all character data is passed directly to the application, without interpretation. Elements, entity references, comments, and processing instructions are all unrecognized and the characters that comprise them are passed literally to the application.

The only string that cannot occur in a CDATA section is ]]>.
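When arbitrary text must be wrapped in a CDATA section, the forbidden string has to be handled somehow. One common workaround, sketched below with the hypothetical function name wrap_cdata, is to end the section in the middle of each ]]> and immediately open a new one, so that the terminator never appears whole inside a single section.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: wrap text in one or more CDATA sections,
// splitting every occurrence of "]]>" across two sections.
std::string wrap_cdata(const std::string& text) {
    const std::string open = "<![CDATA[";
    const std::string close = "]]>";
    std::string out = open;
    std::size_t pos = 0;
    while (true) {
        const std::size_t hit = text.find(close, pos);
        if (hit == std::string::npos) {
            // No more terminators: copy the rest of the text.
            out.append(text, pos, std::string::npos);
            break;
        }
        // Copy up to and including "]]", end this section, start a new
        // one, and continue with the ">" that was left over.
        out.append(text, pos, hit - pos + 2);
        out += close;
        out += open;
        pos = hit + 2;
    }
    out += close;
    assert(out.size() >= text.size() + open.size() + close.size());
    return out;
}
```

A parser reading the result delivers the original text to the application, because adjacent CDATA sections simply contribute consecutive character data.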

Document Type Declarations

A large percentage of the XML specification deals with various sorts of declarations that are allowed in XML. If you have experience with SGML, you will recognize these declarations from SGML DTDs (Document Type Definitions). If you have never seen them before, their significance may not be immediately obvious.

One of the greatest strengths of XML is that it allows you to create your own tag names. But for any given application, it is probably not meaningful for tags to occur in a completely arbitrary order. Consider the old joke example introduced earlier. Would this be meaningful?

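    <gracie><quote><oldjoke>Goodnight,
    <applause/>Gracie</oldjoke></quote>
    <burns><gracie>Say <quote>goodnight</quote>,
    </gracie>Gracie.</burns></gracie>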

It's so far outside the bounds of what we normally expect that it's nonsensical. It just doesn't mean anything.

However, from a strictly syntactic point of view, there's nothing wrong with that XML document. So, if the document is to have meaning, and certainly if you're writing a stylesheet or application to process it, there must be some constraint on the sequence and nesting of tags. Declarations are where these constraints can be expressed.

More generally, declarations allow a document to communicate meta-information to the parser about its content. Meta-information includes the allowed sequence and nesting of tags, attribute values and their types and defaults, the names of external files that may be referenced and whether or not they contain XML, the formats of some external (non-XML) data that may be referenced, and the entities that may be encountered.

There are four kinds of declarations in XML: element type declarations, attribute list declarations, entity declarations, and notation declarations.

Element Type Declarations

Element type declarations [Section 3.2] identify the names of elements and the nature of their content. A typical element type declaration looks like this:

    <!ELEMENT oldjoke (burns+, allen, applause?)>

This declaration identifies the element named oldjoke. Its content model follows the element name. The content model defines what an element may contain. In this case, an oldjoke must contain burns and allen and may contain applause. The commas between element names indicate that they must occur in succession. The plus after burns indicates that it may be repeated more than once but must occur at least once. The question mark after applause indicates that it is optional (it may be absent, or it may occur exactly once). A name with no punctuation, such as allen, must occur exactly once.
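To make the rules concrete, the content model just described (one or more burns, then exactly one allen, then an optional applause) can be checked by hand against a list of child element names. The C++ sketch below is hard-wired to this one model and is not how a validating parser is implemented in general; the function name matches_oldjoke_model is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true when the child element names, in document
// order, satisfy the content model (burns+, allen, applause?).
bool matches_oldjoke_model(const std::vector<std::string>& children) {
    std::size_t i = 0;

    // burns+ : at least one, possibly several
    std::size_t burns_count = 0;
    while (i < children.size() && children[i] == "burns") {
        ++i;
        ++burns_count;
    }
    if (burns_count == 0) return false;

    // allen : exactly once
    if (i >= children.size() || children[i] != "allen") return false;
    ++i;

    // applause? : absent, or present exactly once
    if (i < children.size() && children[i] == "applause") ++i;

    // Nothing else may follow.
    assert(i <= children.size());
    return i == children.size();
}
```

A list holding burns, burns, allen is accepted, while a list that starts with allen, or that contains applause twice, is rejected.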

Declarations for burns, allen, applause, and all other elements used in any content model must also be present for an XML processor to check the validity of a document.

In addition to element names, the special symbol #PCDATA is reserved to indicate character data. The moniker PCDATA stands for parseable character data.

Elements that contain only other elements are said to have element content [Section 3.2.1]. Elements that contain both other elements and #PCDATA are said to have mixed content [Section 3.2.2].

For example, the definition for burns might be

    <!ELEMENT burns (#PCDATA | quote)*>

The vertical bar indicates an or relationship, the asterisk indicates that the content is optional (may occur zero or more times); therefore, by this definition, burns may contain zero or more characters and quote tags, mixed in any order. All mixed content models must have this form: #PCDATA must come first, all of the elements must be separated by vertical bars, and the entire group must be optional.

Two other content models are possible: EMPTY indicates that the element has no content (and consequently no end-tag), and ANY indicates that any content is allowed. The ANY content model is sometimes useful during document conversion, but should be avoided at almost any cost in a production environment because it disables all content checking in that element.


Visual C++

Introduction to MFC

Introduction to Applications

The Microsoft Foundation Class (MFC) library is a set of data types, functions, classes, and constants used to create applications for the Microsoft Windows family of operating systems.

The first thing you should do to start a program is to create an application. In Win32, an application is created by a call to the WinMain() function and building a WNDCLASS or WNDCLASSEX structure. In MFC, this process has been wrapped up in a class called CWinApp (Class-For-A-Windows-Application). Based on this, to create an application, you must derive your own class from CWinApp.

An application by itself is an empty thing that only lets the operating system know that you are creating a program that will execute on the computer. It doesn't display anything on the screen. If you want to display something, the CWinApp class provides the InitInstance() method that you must override in your class. InitInstance() is a Boolean method. If it succeeds in creating the application, it returns TRUE. If something goes wrong when trying to create the application, it returns FALSE. The minimum skeleton of an application would appear as follows:

    class CExerciseApp : public CWinApp
    {
    public:
        virtual BOOL InitInstance();
    };

    BOOL CExerciseApp::InitInstance()
    {
        return TRUE;
    }

After creating the application, to make it available to other parts of the program, you must declare a global variable of your class. It is usually called theApp, but you can call it anything you want.

The fundamental classes of MFC are declared in the afxwin.h header file. Therefore, this is the primary header you may have to add to each one of your applications. Based on this, a basic application can be created as follows:

    #include <afxwin.h>

    class CExerciseApp : public CWinApp
    {
    public:
        virtual BOOL InitInstance();
    };

    BOOL CExerciseApp::InitInstance()
    {
        return TRUE;
    }

    CExerciseApp theApp;
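The role of the global object can be illustrated without MFC. The sketch below is a portable stand-in and not MFC code: AppBase, ExerciseApp, its initialized flag, and run_framework() are all hypothetical names. It only assumes the general pattern in which the base class records the one application object when it is constructed, so that a framework-supplied entry point can find it later and call the overridden InitInstance().

```cpp
#include <cassert>

// Hypothetical stand-in for the framework's application base class.
// Its constructor records the object so the entry point can find it.
class AppBase {
public:
    AppBase() { instance() = this; }
    virtual ~AppBase() = default;
    virtual bool InitInstance() { return true; }

    // Access to the single registered application object.
    static AppBase*& instance() {
        static AppBase* app = nullptr;
        return app;
    }
};

// The programmer's own application class overrides InitInstance().
class ExerciseApp : public AppBase {
public:
    bool initialized = false;
    bool InitInstance() override {
        initialized = true;
        return true;
    }
};

// The global object: it is constructed before the entry point runs.
ExerciseApp theApp;

// Hypothetical stand-in for the framework-supplied entry point.
int run_framework() {
    AppBase* app = AppBase::instance();
    assert(app != nullptr);  // a global application object must exist
    return app->InitInstance() ? 0 : 1;
}
```

If the global object were missing, the entry point would have nothing to call, which is why the declaration of theApp is not optional.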

Introduction to Frames

As its name implies, a frame of a window includes the borders, the location, and the dimensions of a window. There are two types of MFC applications: those that use a frame and those that don't. A frame-based application uses a concept known as the Document/View Architecture. This allows the frame to serve as a place holder for other parts of an application (such as the document and the view).

To create a frame, the MFC library provides various classes. One of these is called CFrameWnd and it is the most commonly used frame class. To use a frame, you can derive your own class from CFrameWnd as follows:

class CApplicationFrame : public CFrameWnd { }; 

Because there can be many frames or various types of frames in an application, the first or main frame is usually called CMainFrame. We will follow the same habit, but you can call your frame class anything you want:

class CMainFrame : public CFrameWnd { }; 

The skeleton of this frame only serves as a foundation for your class. You must actually create a window frame that would display to the user. To create a window frame, the CFrameWnd class provides the Create() method. Its syntax is:

    BOOL Create(LPCTSTR lpszClassName,
                LPCTSTR lpszWindowName,
                DWORD dwStyle = WS_OVERLAPPEDWINDOW,
                const RECT& rect = rectDefault,
                CWnd* pParentWnd = NULL,
                LPCTSTR lpszMenuName = NULL,
                DWORD dwExStyle = 0,
                CCreateContext* pContext = NULL);

As you can see, the only two required arguments are the class name and the window name. We will come back to all these arguments when we study window classes in more detail. For now, a minimum frame can be created by simply passing the class name as NULL and the window name with a null-terminated string. Here is an example:

    class CMainFrame : public CFrameWnd
    {
    public:
        CMainFrame();
    };

    CMainFrame::CMainFrame()
    {
        // _T() lets the literal compile in both ANSI and Unicode builds
        Create(NULL, _T("MFC Fundamentals"));
    }

In order to provide a window to the application, you must associate the frame with the application's thread. A thread is represented by the CWinThread class. To make this a little easier, CWinThread is equipped with a public member variable called m_pMainWnd. This variable holds a pointer to the main window of the thread. One of its advantages is that it makes sure that your application terminates smoothly when the user decides to close that window. CWinThread is the base class of CWinApp, and therefore m_pMainWnd is available to any CWinThread-derived class, such as your own CWinApp-derived class. Based on this, to give the application a main window to display, you can assign a pointer to an object of your frame class to m_pMainWnd. After this assignment, m_pMainWnd can be used as the window object to display the frame, which is usually done by calling the ShowWindow() method. This would be done as follows:

    BOOL CExerciseApp::InitInstance()
    {
        m_pMainWnd = new CMainFrame;
        m_pMainWnd->ShowWindow(SW_NORMAL);

        return TRUE;
    }

This application was created as follows:

  1. Start Microsoft Visual C++ or Visual Studio.
  2. On the main menu, click either File -> New... or File -> New Project...
  3. In the New dialog box, click Projects or, in the New Project dialog box, click Visual C++ Projects.
  4. Click either Win32 Application or Win32 Project.
  5. Type a name for the application in the Name edit box. An example would be MFCFundamentals1.
  6. Click OK.
  7. Specify that you want to create a Windows Application as an Empty Project and click Finish.
  8. To use MFC in MSVC 6, click Project -> Settings... In MSVC 7, in Solution Explorer, right-click the project name (MFCFundamentals1) and click Properties.
  9. In the Microsoft Foundation Classes combo box or in the Use of MFC combo box, select Use MFC In A Shared DLL.
  10. Click OK.
  11. To add a file to create the application, on the main menu of MSVC 6, click File -> New... or, for MSVC 7, on the main menu, click Project -> Add New Item...
  12. Click C++ (Source) File and, in the Name edit box, type a name for the file. An example would be Exercise.
  13. Click OK or Open.
  14. In the empty file, type the following code:

        #include <afxwin.h>

        class CExerciseApp : public CWinApp
        {
        public:
            virtual BOOL InitInstance();
        };

        class CMainFrame : public CFrameWnd
        {
        public:
            CMainFrame();
        };

        CMainFrame::CMainFrame()
        {
            // _T() lets the literal compile in both ANSI and Unicode builds
            Create(NULL, _T("MFC Fundamentals"));
        }

        BOOL CExerciseApp::InitInstance()
        {
            m_pMainWnd = new CMainFrame;
            m_pMainWnd->ShowWindow(SW_NORMAL);

            return TRUE;
        }

        CExerciseApp theApp;

  15. Execute the application.
  16. Close it and return to MSVC.