SQL Dialects

Many database vendors provide their own dialect of SQL to supplement the standard.  For example, conditional processing, error handling, and flow-of-control statements are often provided by database vendors as an element of their own dialect of the SQL standard.

The ANSI SQL standard SQL/PSM (Persistent Stored Module) package describes many features associated with programming stored procedures and incorporates many of the extensions offered by these dialects.  However, the implementation of SQL/PSM is not consistent or reliable across all database vendors.

Some popular dialects of SQL include:

PL/SQL
Found in Oracle. PL/SQL stands for Procedural Language/SQL and contains many similarities to the language Ada.

Transact-SQL
Used by both Microsoft SQL Server and Sybase Adaptive Server. As Microsoft and Sybase have moved away from the common platform they shared early in the 1990s, their implementations of Transact-SQL have also diverged.

PL/pgSQL
The SQL dialect and extensions implemented in PostgreSQL. The acronym stands for Procedural Language/postgreSQL.

SQL/PL
The procedural language extensions found in IBM’s DB2 UDB database platform.
Google Author

SQL History and Implementation

In the early 1970′s, the seminal work of IBM research fellow Dr. E. F. Codd led to the development of a relational database product called SEQUEL, or Structured English Query Language. SEQUEL due to some legal hurdles was changed to ultimately SQL, or the Structured Query Language.

IBM, along with other database vendors, wanted a standardized method for accessing and manipulating data in a relational database. Although IBM was the first to develop relational database theory, Oracle is widely credited with being first to market and popularize the technology.

Over time, SQL proved popular enough in the marketplace to attract the attention of the American National Standards Institute (ANSI).  The standards body released standards for SQL in 1986, 1989, 1992, 1999, 2003, and 2006. The ANSI 2003 release is the most popular version of the standard.  More recent releases, such as  2006 standard, deal with specific sub-elements of SQL, many times outside of the scope of the normal relational database behavior.  For example, the SQL2006 standard describes how XML would be used in SQL.

Since 1986, various competing languages have allowed programmers and developers to access and to manipulate relational data. However, few were as easy to learn nor as universally accepted as SQL.  In fact as new, so-called NoSQL database platforms emerge, many quickly implement SQL-like dialects to ride the widespread momentum of global community of users.  A good example of this is PIG, the SQL-like dialect used by Hadoop/HBase databases.

Programmers and administrators now have the benefit of being able to learn a single language that, with minor adjustments, is applicable to a wide variety of database platforms, applications, and products.

 

 

Introduction to the Relational Model and ANSI SQL

Relational database management systems (RDBMS) such as those covered on this website are the principal engines of structured data systems worldwide. They enable a multitude of users to quickly and simultaneously access, create, edit, and manipulate data without impacting other users. And they enable developers to write useful applications to access their data while providing administrators with the consistent and reliable capabilities they need to maintain, secure, and optimize organizational data resources.
An RDBMS may be defined as a system whose users view data as a collection of tables related to each other through common data values. Data is stored in tables, which are composed of rows and columns. Tables of independent data can be linked (or related) to one another if they each have unique, identifying columns of data (called keys) that represent data values held in common between the tables.
E. F. Codd first described relational database theory in his landmark paper “A Relational Model of Data for Large Shared Data Banks,” published in the Communications of the ACM (Association for Computing Machinery) in June, 1970. Under Codd’s new relational data model, data was structured (into tables of rows and columns); manageable using operations such as selections, projections, and joins; and consistent as the result of integrity rules such as keys and referential integrity.
Codd also articulated rules that governed how a relational database should be designed. The process for applying these rules is now known as normalization.

Google Author

ALL Operator

The ALL operator returns True when all of the records in query result set pass the Boolean condition.

A bit more explanation.  The ALL statement gives a TRUE value when: 1) the subquery returns no records, or 2) all records meet the comparison.  ALL will return false if even one record fails the Boolean condition.  If any of the values are NULL, then the operation value will be NULL, not TRUE.

For example, we want to see which authors currently have no titles:

SELECT au_id
FROM authors
WHERE au_id <> ALL(SELECT au_id FROM titleauthor)

Syntax

SELECT …
WHERE expression comparison {ALL | ANY | SOME} ( subquery )

Explained

WHERE expression

Tests a single value, such as a column, against all other values in the subquery. All rows need to match the expression to return a Boolean TRUE value for the ALL operator to return a Boolean TRUE value.

comparison

Compares the expression to the subquery.  The comparison must be a standard comparison operator like =, <>, !=, >, >=, <, or <=.

Advice

ALL is unique in that it guarantees that all records in the result set evaluate to TRUE.

DBMS Variations

All the major commercial database platforms support ALL.

Oracle supports a minor variation in that you can supply a list of values instead of a subquery.  For example, we can find all employees who have a job_lvl value equal to 9 or 14:

SELECT * FROM employee
WHERE job_lvl = ALL(9, 14);

DB2 and SQL Server supports some additional comparison operators with ALL, including not greater than (!>) and not less than (!<).

Related Entries

BETWEEN

EXISTS

IN

Operators

SELECT

SOME

WHERE

Google Author

Codd’s Rules for Relational Database Systems

E.F. Codd applied rigorous mathematical theories (primarily set theory and relational algebra) to the management of data, and he compiled a list of criteria a database must meet to be considered relational. At its core, the relational database concept centers around storing data in tables. This concept is now so common as to seem trivial; however, not long ago the goal of designing a system capable of sustaining the relational model was considered a long shot with limited usefulness.

Following are Codd’s Twelve Principles of Relational Databases:

1.     Information is represented logically in tables.
2.     Data must be logically accessible by table, primary key, and column.
3.     Null values must be uniformly treated as “missing information,” not as empty strings, blanks, or zeros.
4.     Metadata (data about the database) must be stored in the database just as regular data is.
5.     A single language must be able to define data, views, integrity constraints, authorization, transactions, and data manipulation.
6.     Views must show the updates of their base tables and vice versa.
7.     A single operation must be available to do each of the following operations: retrieve data, insert data, update data, or delete data.
8.     Batch and end-user operations are logically separate from physical storage and access methods.
9.     Batch and end-user operations can change the database schema without having to recreate it or the applications built upon it.
10.     Integrity constraints must be available and stored in the metadata, not in an application program.
11.     The data manipulation language of the relational system should not care where or how the physical data is distributed and should not require alteration if the physical data is centralized or distributed.
12.     Any row processing done in the system must obey the same integrity rules and constraints that set-processing operations do.
These principles continue to be the litmus test used to validate the “relational” characteristics of a database platform; a database that does not meet all of these rules is not fully relational. While these rules do not apply to applications development, they do determine whether the database engine itself can be considered truly “relational.” Currently, most commercial RDBMS products pass Codd’s test.
Understanding Codd’s principles assists programmers and developers in the proper development and design of relational databases.

Google Author

ANY / SOME operators

The ANY and SOME operators, which are synonyms, perform a Boolean test returning true if even one record in the result set matches the Boolean condition. ANY/SOME is semantically equivalent to IN and EXISTS, but much less widespread.  To ensure easy maintenance of your code, use IN or EXISTS rather than ANY/SOME wherever possible.

A bit more explanation,  ANY/SOME operators return TRUE when at least one record matches and return FALSE when none match or when there’s an empty result set.  If any of the values are NULL, then the operation value will be NULL, not TRUE.

ANY/SOME can be used for filtering checks similar to an IN or EXISTS subquery. In this example, we will get any employee records that also exist in the employees_archive table with a city of Anchorage.

SELECT *
FROM employees
WHERE job_lvl = ANY(SELECT job_lvl FROM employees_archive
                    WHERE city = ‘Anchorage’)

 

Syntax

SELECT …
WHERE expression comparison {ANY | SOME} ( subquery )

Explained

WHERE expression

Tests a single value, such as a column, against every value until a match is found for ANY and SOME.  One or more rows must match the expression, for ANY and SOME, to return a Boolean TRUE value.

comparison

Compares the expression to the subquery.  The comparison must be a standard comparison operator like =, <>, !=, >, >=, <, or <=.

Related Entries

BETWEEN

EXISTS

IN

Operators

SELECT

SOME

WHERE

Google Author

SQL Statement Classes

Long-standing classes for SQL statements originated back in the days of the SQL92 standard.  These classifications are still common:

Data Manipulation Language (DML): SQL statements which manipulate data, such as SELECT, INSERT, UPDATE, DELETE and MERGE.

Data Definition Language (DDL): SQL statements control the creation, modification, and removal of database objects, such as CREATE, ALTER and DROP.

Data Control Language (DCL): SQL statements related to permissions to execute and access database objects and data, such as GRANT and REVOKE.

 Starting with SQL3, seven core categories define the elements of the SQL standard, as shown in the following table:

Category Description Examples
SQL connection statements Starts and ends a client connection CONNECT, DISCONNECT
SQL control statements Controls the execution of a set of SQL statements CALL, RETURN
SQL data statements  Manipulates or alters data SELECT, INSERT, UPDATE, DELETE, MERGE
SQL diagnostic statements Provides diagnostic information, exceptions and errors GET DIAGNOSTICS  
SQL schema statements  Manipulates or alters database schemas and objects within that schema ALTER, CREATE, DROP
SQL session statements  Controls default behavior for a user session SET
SQL transaction statements  Sets the start and end point of a transaction COMMIT, ROLLBACK


If you work regularly with SQL, then you should become familiar with both the old (SQL92) and the new (SQL3) statement classes, since both nomenclatures are frequently used to refer to SQL features and statements.

Google Author

What Is a Relational Database?

Relational?!? Move On, Geezer!

Maybe you’re thinking that relational databases management systems (RDBMSs), like Microsoft SQL Server and Oracle, are going the way of punched cards and rotary phones.  After all, there’s been a lot of hype these days in the IT media about the rise of so-called NoSQL (Not Only SQL) databases.  Many new and upcoming CS and MIS graduates who like working with data might think that relational databases are, at best, soon-to-be legacy systems and, at worst, are a career dead-end.

Wrong!!!

It’s true that all the cool-cat computing services (Amazon, Facebook, Google, Pinterest, etc) are indeed making heavy use of NoSQL technology.  They’re also making heavy use of traditional RDBMS’es too.  In fact, some of the world’s biggest users of SQL databases are hand-in-hand the biggest users of NoSQL databases.  The reason for that is that both types of data platforms are exceeding good at specific types of data storage and data processing.  They also have their own unique weaknesses too.  Meaning, each platform has a sweet spot and a weak spot, and that none are a 100% panacea for all imaginable data processing scenarios.  Take a look at this article by my friend and former colleague, Guy Harrison - 10 Things You Should Know About NoSQL Databases, for a good discussion on the pros and cons of NoSQL in comparison to SQL data platforms.

Timewarp! Let’s Take a Look Back at Why Relational Databases Were Needed.

These days, relational database management systems (RDBMSs) like Microsoft SQL Server and Oracle are the primary engines of information systems everywhere, particularly for enterprise computing systems and web applications. Though RDBMSs are now common enough to trip over, it wasn’t always that way. Not too long ago, you would probably trip over hierarchical database systems, or network database systems, or flat-file systems (heck, that still happens in many government IT shops who still use COBOL).  A quick-and-dirty definition for a relation database might be: a system whose users view data as a collection of tables related to each other through common data values.

Perhaps you are interested in more than a quick-and-dirty definition for the term relational database?  Here goes.  The whole basis for the relational model follows this train of thought: data is stored in tables, which are composed of rows and columns.  Tables of independent data can be linked, or related, to one another if they each have columns of data that represent the same data value, called keys.  This concept is so common as to seem trivial; however, it was not so long ago that achieving and programming a system capable of sustaining the relational model was considered a longshot with limited usefulness.

Relational data theory was first proposed by E.F. Codd in his 1970 paper to the ACM entitled “A Relational Model of Data for Large Shared Data Banks”.  Soon after, Codd clarified his position in the 1974 paper to the Texas Conference on Computing Systems entitled “The Relational Approach to Data Base Management: An Overview”.  It was in this paper that Codd proposed the now legendary 12 Principles of Relational Databases.

If a vendor’s database product didn’t meet Codd’s 12 item litmus tests, then it was not a member of the club.  Note that the rules do not apply to applications development.  Instead, these rules determine whether the database engine itself can be considered truly “relational”.  These rules were constructed to support a data model that would ensure the ACID properties of transactions and also eliminate a variety of data manipulation anomalies that frequently occurred on non-relation database platforms (and  still do occur on non-relational database platforms). (As an aside, the transactional paradigm was conceived by my hero, Gray, Jim in 1981 while at Tandem Computer and presented in the paper “The Transaction Concept: Virtues and Limitations“).

Codd’s 12 Rules for a Truly Relational Database System

Are you curious about Codd’s 12 Principles of Relational Databases? Don’t be ashamed that you don’t know them by heart; few technology professionals do, and no one on the marketing staff of technology companies do.  However, the few folks who do know these principles by heart treat them like religious doctrine, and would likely be mortified by their “lightweight” treatment here.  Nevertheless, I’ll give them to you in my own paraphrasing:

  1. Information is represented logically in tables.
  2. Data must be logically accessible by table, primary key, and column.
  3. Null values must be uniformly treated as “missing information” not as empty strings, blanks, or zeros.
  4. Metadata (data about the database) must be stored in the database just as regular data is.
  5. A single language must be able to define data, views, integrity constraints, authorization, transactions, and data manipulation.
  6. Views must show the updates of their base tables and vice versa.
  7. A single operation must be able to retrieve, insert, update, or delete data.
  8. Batch and end-user operations are logically separate from physical storage and access methods.
  9. Batch and end-user operations can change the database schema without having to recreate it or applications built upon it.
  10. Integrity constraints must be available and stored in the metadata, not in an application program.
  11. The data manipulation language of the relational system should not care where or how the physical data is distributed and should not require alteration if the physical data is centralized or distributed.
  12. Any row-processing done in the system must obey the same integrity rules and constraints that set-processing operations do.

If you know much about SQL, then you probably recognize immediately that SQL ended up fulfilling rules #5, #7, #11 and possibly more.  Others of the rule are manifest in the system tables of a relational database, such as DMVs in Microsoft SQL Server and V$ and X$ views in Oracle.

Relational Rises

There is some debate about why relational database systems won out over hierarchical and network database systems back in the late 1980′s and early 1990′s, but a couple of reasons seem self-evident.  First, the high-level language interface (SQL)  is much simpler to learn and more intuitive than that mishmash of languages supporting non-relational databases.  (In fact, the lack of something like SQL is a hindrance to adoption of many NoSQL database platforms).  Second, relational databases provide efficient and intuitive data structures that easily accommodate ad-hoc queries and reporting.  People just intuitively understand the value of storing data in tables.  From phone books to hotel registries, relational databases (of the paper sort) are second nature to most people. Third, relational databases provide powerful integrity controls such as check constraints and referential integrity – thus providing higher quality data.  And high quality data is near and dear to the heart of CFOs around the world.

In fact, the strength that relational databases demonstrate with data quality, consistency, and durability are the same reasons that they’ll be with us – quite possibly – forever.  So were NoSQL databases excel at storing data that is  moderately important and requires  eventual consistency, SQL database excel at storing data that is of  paramount importance and requires  immediate consistency.  As long as we’re exchanging money, there’s a need for relational database technology and ACID transactions.

And, just my opinion here, but database administration is currently, and will continue for decades to be, an excellent career choice.  Why?  First, although databases are widespread, good databases are not.  So there’s always need for those who can tune, troubleshoot, and optimize what is currently in the marketplace.  Second, just because database are widespread doesn’t mean that they’re everywhere they need to be.  Some estimates gauge that only half of the enterprises that need SQL databases actually  use SQL databases.  Imagine if only half of the citizenry wore shoes, and of the half that wore shoes, only half of them wore both shoes and consistently tied them.  It’d be a good time to be a maker of loafers!  Well, that’s where we’re at today with relational databases.

So what do you think?  Am I off the mark on the longevity of relational database?  Do you think the sun has set on them?  Will they be smashed, degraded, and humiliated by NoSQL database platforms?  Or will they stand shoulder-to-shoulder with a variety of data platforms in the years to come?