Andrew Clarke

Notes from the Tooting Bec UndergroundAPC

The use and misuse of %TYPE and %ROWTYPE attributes in PL/SQL APIs

Fri, 2020-09-18 08:34
PL/SQL provides two attributes which allow us to declare a data structure with its datatype derived from a database table or a previously declared variable.

We can use the %type attribute for defining a constant, a variable, a collection element, a record field or PL/SQL program parameters. While we can reference a previously declared variable, the most common use case is to tie the declaration to a table column. The following snippet declares a variable with the same datatype and characteristics (length, scale, precision) as the SAL column of the EMP table.

l_salary emp.sal%type;
We can use the %rowtype attribute to declare a record variable which matches the projection of a database table or view, or a cursor variable. The following snippet declares a variable with the same projection as the preceding cursor.

cursor get_emp_dets is
select emp.empno
, emp.ename
, emp.sal
, dept.dname
from emp
inner join dept on dept.deptno = emp.deptno;
l_emp_dets get_emp_dets%rowtype;
Using these attributes is considered good practice. PL/SQL development standards will often mandate their use. They deliver these benefits:
  1. self-documenting code: if we see a variable with a definition which references emp.sal%type we can be reasonably confident this variable will be used to store data from the SAL column of the EMP table.
  2. datatype conformance: if we change the scale or precision of the SAL column of the EMP table all variables which use the %type attribute will pick up the change automatically. If we add a new column to the EMP table, all variables defined with the %rowtype attribute will be able to handle that column without us needing to change those programs.
That last point comes with an amber warning: the automatic conformance only works when the %rowtype variable is populated by SELECT * FROM queries. If we are using an explicit projection with named columns then we have now broken our code and we need to fix it. More generally, this silent propagation of changes to our data structures means we need to pay more attention to impact analysis. Is it right that we can just change a column's datatype or amend a table's projection without changing the code which depends on them? Maybe it's okay, maybe not. By shielding us from the immediate impact of broken code, using these attributes also withholds the necessity to revisit our programs: so we have to remember to do it.
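To make that amber warning concrete, here is a minimal sketch rather than a definitive test case. The table ROWTYPE_DEMO and the functions DEMO_STAR and DEMO_NAMED are hypothetical names invented for this illustration; they are not part of the EMP and DEPT schema used elsewhere in this post. Both functions populate a %rowtype record, one with SELECT * and one with named columns, and then we add a column to the table.

```sql
-- hypothetical demo table, invented for this sketch
create table rowtype_demo
( id   number(4,0) primary key
, name varchar2(10) );

insert into rowtype_demo values (1, 'one');
commit;

create or replace function demo_star return varchar2 is
  l_row rowtype_demo%rowtype;
begin
  -- SELECT * always matches the table's current projection
  select * into l_row from rowtype_demo where id = 1;
  return l_row.name;
end demo_star;
/

create or replace function demo_named return varchar2 is
  l_row rowtype_demo%rowtype;
begin
  -- named columns must line up with the record field for field
  select id, name into l_row from rowtype_demo where id = 1;
  return l_row.name;
end demo_named;
/

-- widen the table: the %rowtype record now has three fields
alter table rowtype_demo add (created date);

-- demo_star should recompile cleanly;
-- demo_named should fail to compile until its projection is fixed
alter function demo_star compile;
alter function demo_named compile;
```

The point of the sketch is that nothing warned us at the moment we ran the ALTER TABLE: the breakage only surfaces when the dependent program is next compiled or called.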

Overall I think the benefits listed above outweigh the risks, and I think we should always use these attributes whenever it is appropriate, for the definition of local variables and constants. However, complications arise if we use them to declare PL/SQL program parameters, specifically for procedures in package specs and standalone program units. It's not so bad if we're writing an internal API but it becomes a proper headache when we are dealing with a public API, one which will be called by programs owned by another user, one whose developers are in another team or outside our organisation, or even using Java, dotNet or whatever. So why is the use of these attributes so bad for those people?

  1. obfuscated code: these attributes are only self-documenting when we have a familiarity with the underlying schema, or have easy access to it. This will frequently not be the case for developers in other teams (or outside the organisation) who need to call our API. They may be able to guess at the datatype of SALARY or HIREDATE, but they really shouldn't have to. And, of course, a reference to emp%rowtype is about as unhelpful as it could be. Particularly when we consider ...
  2. loss of encapsulation: one purpose of an API is to shield consumers of our application from the gnarly details of its implementation. However, the use of %type and %rowtype is actually exposing those details. Furthermore, a calling program cannot define their own variables using these attributes unless we grant them SELECT on the tables. Otherwise the declaration will hurl PLS-00201. This is particularly problematic for handling %rowtype, because we need to define a record variable which matches the row structure.
  3. breaking the contract: an interface is an agreement between the provider and the calling program. The API defines input criteria and in return guarantees outcomes. It forms a contract, which allows the consumer to write code against stable definitions. Automatically propagating changes in the underlying data structures to parameter definitions creates unstable dependencies. It is not simply that the use of %type and %rowtype attributes will cause the interface to change automatically, the issue is that there is no mechanism for signalling the change to an API's consumers. Interfaces demand stable dependencies: we must manage any changes to our schema in a way which ideally allows the consumers to continue to use the interface without needing to change their code, but at the very least tells them that the interface has changed.
Defining parameters for public APIs

The simplest solution is to use PL/SQL datatypes in procedural signatures. These seem straightforward. Anybody can look at this function and understand that the input parameter is numeric and the returned value is a string.

function get_dept_manager (p_deptno in number) return varchar2;
So clear but not safe. How long is the returned string? The calling program needs to know, so it can define an appropriately sized variable to receive it. Likewise, in this call, how long can a message be?

procedure log_message (p_text in varchar2);
Notoriously we cannot specify length, scale or precision for PL/SQL parameters. But the calling code and the called code will write values to concretely defined types. The interface needs to communicate those definitions. Fortunately PL/SQL offers a solution: subtypes. Here we have a subtype which explicitly defines the datatype to be used for passing messages:

subtype st_message_text is varchar2(256);

procedure log_message (p_text in st_message_text);
Now the calling program knows the maximum permitted length of a message and can trim its value accordingly. (Incidentally, the parameter is still not constrained in the called program so we can pass a larger value to the log_message() procedure: the declared length is only enforced when we assign the parameter to something concrete such as a local variable.)
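The parenthetical point about enforcement can be sketched as follows. The package MESSAGE_DEMO and its two functions are hypothetical, written only for this illustration, and returning -1 to signal an overflow is a shortcut for the demo rather than a recommended error-handling style.

```sql
create or replace package message_demo as
  subtype st_message_text is varchar2(256);

  -- reports how many characters arrived through the parameter
  function chars_received (p_text in st_message_text) return pls_integer;

  -- reports how many characters survive assignment to a local
  -- st_message_text variable, or -1 when the value does not fit
  function chars_stored (p_text in st_message_text) return pls_integer;
end message_demo;
/

create or replace package body message_demo as

  function chars_received (p_text in st_message_text) return pls_integer is
  begin
    -- the formal parameter does not inherit the subtype's declared size
    return length(p_text);
  end chars_received;

  function chars_stored (p_text in st_message_text) return pls_integer is
    l_text st_message_text;
  begin
    -- the declared size is enforced here, on assignment to a variable
    l_text := p_text;
    return length(l_text);
  exception
    when value_error then
      return -1; -- demo-only signal that the value was too long
  end chars_stored;

end message_demo;
/
```

If the behaviour is as described above, calling message_demo.chars_received(rpad('x', 400, 'x')) should report all 400 characters, while message_demo.chars_stored with the same argument should return -1.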

We can replace %rowtype definitions with explicit RECORD definitions. So a function which retrieves the employee records for a department will look something like this:

subtype st_deptno is number(2,0);

type r_emp is record(
empno number(4,0),
ename varchar2(10),
job varchar2(9),
mgr number(4,0),
hiredate date,
sal number(7,2),
comm number(7,2),
deptno st_deptno
);

type t_emp is table of r_emp;

function get_dept_employees (p_deptno in st_deptno) return t_emp;
We do this for all our public functions.

subtype st_manager_name is varchar2(30);

function get_dept_manager (p_deptno in st_deptno) return st_manager_name;
Now the API clearly documents the datatypes which calling programs need to pass and which they will receive as output. Crucially, this approach offers stability: the datatype of a parameter cannot be changed invisibly, as any change must be implemented in a new version of the publicly available package specification. Inevitably this imposes a brake on our ability to change the API but we ought not to be changing public APIs frequently. Any such change should arise from either new knowledge about the requirements or a bug in the data model. Wherever possible we should try to handle bugs internally within the schema. But if we have to alter the signature of a procedure we need to communicate the change to our consumers as far ahead of time as possible. Ideally we should shield them from the need to change their code at all. One way to achieve that is Edition-Based Redefinition. Other ways would be to deploy the change with overloaded procedures or even using a different procedure name, and deprecate the old procedure. Occasionally we might have no choice but to apply the change and break the API: sometimes with public interfaces the best we can do is try to annoy the fewest number of people.

Transitioning from a private to a public interface

There is a difference between internal and public packages. When we have procedures which are intended for internal usage (i.e. only called by other programs in the same schema) we can define their parameters with %type and %rowtype attributes. We have access and - it is to be hoped! - familiarity with the schema's objects, so the datatype anchoring supports safer coding. But what happens when we have a package which we wrote as an internal package but now we need to expose its functionality to a wider audience? Should we re-write the spec to use subtypes instead?

No. The correct thing to do is to write a wrapper package which acts as a facade over the internal one, and grant EXECUTE privileges on the wrapper. The wrapper package will obviously have the requisite subtype definitions in the spec, and procedures declared with those subtypes. The package body will likely consist of nothing more than those procedures, which simply call their equivalents in the internal package. There may be some affordances for translating data structures, such as populating a table %rowtype variable from the public record type, but those will usually be necessary only for the purposes of documentation (this publicly defined subtype maps to this internally defined table column). There is an obvious overhead to writing another package, especially one which is really just a pass-through to the real functionality, but there are clear benefits which justify the overhead:

  • Stability. Not re-writing an existing package is always a good thing. Even if we are mechanically just replacing one set of datatype definitions with a different set which have the same characteristics we are still changing the core system, and that's a chunk of regression testing we've just added to the task.
  • Least privilege escalation. Even if the internal package has been written with a firm eye on the SOLID principles, the chances are it contains more functionality than we need to expose to other consumers. Writing a wrapper package gives us the opportunity to grant access to only the required procedures.
  • Composition. It is also likely that the internal package doesn't have the exact procedure the other team needs. Perhaps there are actually two procedures they need to call, or there's one procedure but it has some confusing internal flags in its signature. Instead of violating the Law of Demeter we can define one simple procedure in the wrapper package spec and handle the internal complexity in the body.
  • Future proofing. Writing a wrapper package gives us an affordance where we can handle subsequent changes in the internal data model or functionality without affecting other consumers. By definition a violation of YAGNI, but as it's not the main reason why we're doing this I'm allowing this as a benefit.
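The facade approach described above might look something like the following sketch. Everything named here is an assumption made for illustration: the table DEMO_DEPT, the packages DEPT_INTERNAL and DEPT_API, and the grantee in the final comment are all hypothetical, and a real internal package would be anchored to the schema's actual tables.

```sql
-- hypothetical table standing in for the schema's real data
create table demo_dept
( deptno       number(2,0) primary key
, manager_name varchar2(30) );

insert into demo_dept values (10, 'CLARK');
commit;

-- internal package: parameters anchored to the table, fine inside the schema
create or replace package dept_internal as
  function get_dept_manager (p_deptno in demo_dept.deptno%type)
    return demo_dept.manager_name%type;
end dept_internal;
/

create or replace package body dept_internal as
  function get_dept_manager (p_deptno in demo_dept.deptno%type)
    return demo_dept.manager_name%type
  is
    l_name demo_dept.manager_name%type;
  begin
    select manager_name into l_name
    from   demo_dept
    where  deptno = p_deptno;
    return l_name;
  end get_dept_manager;
end dept_internal;
/

-- public facade: explicit subtypes in the spec, a pass-through in the body
create or replace package dept_api as
  subtype st_deptno is number(2,0);
  subtype st_manager_name is varchar2(30);

  function get_dept_manager (p_deptno in st_deptno) return st_manager_name;
end dept_api;
/

create or replace package body dept_api as
  function get_dept_manager (p_deptno in st_deptno) return st_manager_name is
  begin
    -- simply delegates to the internal equivalent
    return dept_internal.get_dept_manager(p_deptno);
  end get_dept_manager;
end dept_api;
/

-- only the facade would be granted; the grantee name is hypothetical
-- grant execute on dept_api to other_team;
```

A consumer can then declare its own variables as dept_api.st_deptno and dept_api.st_manager_name without needing any privilege on the underlying table.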
Design is always a trade off

The use of these attributes is an example of the nuances which Coding Standards often lack. In many situations their use is good practice, and we should employ them in those cases. But we also need to know when their use is a bad practice, and why, so we can do something better instead.

Part of the Designing PL/SQL Programs series

Ten character classes your project team needs

Wed, 2020-09-09 10:13
A dungeon-crawling party requires a good mix of character classes to be successful. If everyone is a wizard there's nobody who can fight off the orc warband. Similarly, a software development team needs a range of character traits and aspects to successfully deliver working software which meets the project's goals. Here's my take.

Scavenger

The Scavenger understands the importance of not re-inventing the wheel. To this end they acquire an encyclopaedic understanding of our languages' built-in libraries, the existing features of our system and other systems in the wider organisation, and open-source libraries.

Unless given a precise list of the project's wants a Scavenger will become a Mutant Renegade, scouring the post-atomic wasteland for useless relics, which are broken or undocumented or both.

Aspect: "Here's one we made earlier"
Traits: Focused laziness, Unfocused research, GitHub

Rat King

The Rat King understands that software development is a communal task. Consequently they work to forge a collection of disparate individuals into a team. Their remit includes facilitating meetings and arranging after-work socials. Despite their fearsome appearance and collectivist instincts the Rat King is extremely sensitive to what each person brings to the party, and strives to ensure that introverts and teetotallers are included without feeling pressurized.

Unless met with a smidgeon of friendly scepticism a Rat King will become a Facebook.

Aspect: "We must hang together or we will surely hang separately"
Traits: Teamwork, Communications, Contacts

Paladin

The Paladin is the defender of the project but is also committed to the ideal of a project which is worth defending. They ensure everybody follows best practice, adheres to coding standards and observes the agile ceremonies.

Unless there's a Rogue to balance them a Paladin will become a Grand Inquisitor (although maybe without the thumbscrews).

Aspect: "Just do it right"
Traits: Rigour, Weird inner light

Rogue

The Rogue is pragmatic where the Paladin is dogmatic. They have a swashbuckling approach to getting things done. They understand the concept of technical debt, they just tip the trade-off toward delivering stuff over following the rules. Very fond of observing that there's no such thing as "best practice".

Unless kept in line a Rogue will become a Cowboy. Yee-hah!

Aspect: "Let's do the show right here"
Traits: Resourcefulness, Acute bullshit detector, Cynefin

Mad Scientist

The Mad Scientist has a deep technical understanding of software development, both practice and theory. They are obsessed with innovative and extremely clever solutions to business problems.

Unless your business problem actually requires an extremely clever solution a Mad Scientist will become an Evil Supervillain, who will derail the project (but, to be fair, will not destroy the entire planet. Probably).

Aspect: "My monster lives!"
Traits: Single-mindedness, Visionary

Lab Assistant

The Lab Assistant is vital to delivering the work of a Mad Scientist. They document APIs on wiki pages, they write build scripts and unit tests, they productionize the PoC code. In short, they undertake all the tedious essential tasks which would distract a Mad Scientist from their creation.

A Lab Assistant to an Evil Supervillain is still a Lab Assistant, but the wiki pages are half-complete, the unit tests don't run and the code isn't fit to be checked into source control.

Aspect: "Here is the brain you wanted"
Traits: Flexibility, Service to the higher cause

Major-domo

The Major-domo helps the project run smoothly by taking care of all the little things everybody else forgets. They clean the whiteboard before a meeting starts, they bring Sharpies and Post-It notes to the retrospective, they write Jiras for the stories we just agreed we needed and they circulate minutes after decision-making meetings.

Unless other people occasionally do some of these tasks a Major-domo will become a Resentful Skivvy.

Aspect: "I'll add that to my To-Do list"
Traits: Well-stocked stationery cupboard, Scrivener

Court Jester

The Court Jester says out loud the things everybody else is thinking. They aren't afraid to appear ridiculous in order to make a point. Their role is to speak truth to power.

If they go too far a Jester becomes an Angry Ranter, ignored and shunned by everybody.

Aspect: "The true fool stays silent in the face of foolishness"
Traits: Humour, Insight, Lack of inhibition

Bounty Hunter

The Bounty Hunter lives for finding and fixing bugs. They are never happier than when writing test cases to reproduce a bug or stepping through lines of code in debug mode. They understand that fixing production code is more important than delivering a new feature.

Unless kept on a tight leash a Bounty Hunter will become a Mindless Delver or a Tinkerer.

Aspect: “To defeat the bug, we must understand the bug”
Traits: Sense of purpose, Perseverance, Debugging

Druid

The Druid has an understanding of the ecosystem beyond our project's bounded context. They know what the business seeks to achieve and how our project furthers those goals. They also know about other projects in the organisation, and work to ensure our project integrates with them harmoniously.

Unless given a clear sense of our project's priorities and direction a Druid is still a Druid, just servicing the needs of other projects.

Aspect: "Listen to the trees, dude"
Traits: Awareness, Empathy, Balance

Multi-faceted characters

Obviously these aren't main character classes. A project team comprises base classes such as Developers, Testers, Analysts, Architects, heck maybe even a Project Manager. What I list here are ancillary classes, which modify a base class. An Architect, a Developer, an Analyst or a Project Manager can benefit from having a touch of the Druid about them. Any Developer should spend some time being a Bounty Hunter or Scavenger. Different circumstances demand different class behaviours. When there's a major outage in Production we need Rogues to fix it, not Paladins muttering about process and sign-off. But after Production is back it is the Paladins who make sure the problem and its resolution are properly documented, and appropriate preventative measures put in place. Most people on the team will flow through several of these class behaviours, even over the course of a single sprint.

When we're forming a new team to deliver some piece of software we focus on the hard skills, the main character classes. We need this many Developers, this many Analysts, a UX expert, an SEO specialist, and so forth. These are the easy things to define. But the success of the project will in large part depend on the soft skills and temperaments of the individuals in the team. This is a lot harder to measure. It's why personality tests like Myers-Briggs and Insights exist: some people think they're hokum but they provide a framework for assessing the make-up of a team in an age when we're uncomfortable casting horoscopes or taking auguries from the liver of a freshly-slaughtered goose. Using RPG character classes as metaphors for desirable behaviours has the advantage of jokiness. There is a categorical absence of psychological research underpinning this article. Also it doesn't require us to obtain live waterfowl.

One last thing. The next time you find yourself at a retrospective with no marker pens and nothing to write on, look around for a Major-domo. And if you can't spot one why not appoint yourself to the role?

Epilogue

That final paragraph makes me sad for the times when retrospectives happened in a room with other people, with a whiteboard covered in Post-It notes. Let's hope we can do them like that again.

How PL/SQL Development Standards work

Tue, 2020-08-25 09:10
I have been gigging at a place which has documented PL/SQL Development Standards. This is not so unusual: most Oracle shops have such a document. What makes it unusual is that they enforce the standards. With code reviews. And I mean properly enforce: programs fail QA for egregious breaches of the standards or a sufficient accumulation of minor breaches. This is less common than it ought to be.

Many coders are sceptical about development standards; I have been in the past. Standards generally focus on things which are easy to standardise (indentation, case, naming conventions) rather than functional correctness or design principles. They frequently codify arbitrary or outdated practices (mandating explicit cursors is a particular bugbear of mine). They either go into so much detail that they are unreadably long (and dull) or are so sketchy that they operate as easy-to-ignore guidelines. But I think many experienced developers' objections boil down to: I don't like being told how to write my code; my style is the best style; my code is clean, clear and readable.

The catch is, readability is not simply a function of personal style: it emerges from consistency across the entire codebase. Just because I find my personal coding style clear doesn't mean everybody else will. At the very least a colleague reading my program will have to invest time in understanding how I name my variables, how I use table aliases, and a dozen other things, none of them important individually but all together adding friction to the crucial task of understanding how a program works (or does not work).

This particular set of standards certainly had a lot to say about layout. Many strictures fitted with my natural coding style (all lower case, one column per line in a SQL projection, comma before the column name rather than after it). Others were rather tiresome: the rules for clause alignment entail a lot of spacing and backspacing to ensure elements line up. There are a few strictures I actively disagree with (notably mandatory use of SQL-89 syntax i.e. implicit joins). But here's the rub: I didn't get to pick and choose which of the standards I followed. I just had to knuckle down and follow them all. Because the discipline of the code review meant my programs failed QA when I hadn't applied the standards.

There's more to consistency than just layout and naming conventions. There's also functional consistency: use of SQL and PL/SQL idioms, how to organise programs within a package, and so forth. Too many things to cover in a single document. But again, code reviews enforce standardisation of these aspects, by applying undocumented conventions with the same rigour as documented standards. A couple of times I tripped over such an undocumented convention and it didn't feel fair: my code failed the review because I wrote something which was wrong even though not explicitly covered by the standards. One of these times it was something awry in the layout. "That's wrong", the reviewer said. It was a difference I hadn't even noticed, and probably you wouldn't have noticed either, and even if I had noticed it I wouldn't have thought it was wrong. But it was different from what everybody else was doing. That made it wrong.

Everybody undertakes code reviews and everybody's code is reviewed. Thus code reviews shape the codebase, by enforcing documented standards and undocumented conventions. As a result this is the most readable codebase I have ever worked on. It's almost impossible to tell who wrote any given program, because all programs look the same. It's easy to reason about a piece of code because it follows rigorous naming conventions and consistent architectural principles. The code is habitable. A colleague can read a program I wrote and feel comfortable doing so. The layout, the naming conventions, the consistent selection of one approach in situations where PL/SQL offers more than one way of doing something, all these factors mean my program looks just like the program anybody else would have written. So the reader is freer to understand what the program actually does and how it works. Standardisation reduces friction.

It is a virtuous circle. Code reviews enforce a consistent programming style, which eliminates trivial (i.e. non-functional) differences in the program. In turn this makes the program easier to review: all the programs look basically the same which highlights the things which need to be different, the business logic and the data structures.

Readability is a feature

Readability is a feature. It's a feature our code must have. We all know readability makes code easier to maintain, easier to re-use, easier to debug. Yet still many developers bridle at the suggestion that their PL/SQL must look like everybody else's PL/SQL. I get this. It's not that I think the way I write PL/SQL is intrinsically correct, it just looks the way I have evolved to write it over the years. A new set of coding standards, rigorously applied, disrupts my flow. I must slow down to correct the variable names or fix the layout. It's tedious.

Tedious but also necessary. A software system is a shared enterprise. It's not "my" code, it's the project's code; I am just the person checking it into source control. As a discipline, programming is a craft not an art. PL/SQL is simply a device for turning data into business value. It's more important that other people on the team can work with our code than that it has our signature style. So let's not be precious about appearance. We must follow the rules, and save our self-expression for our poems and our tweets.

Above all, know this: there are no development standards without code reviews.

Minimal declaration of foreign key columns

Thu, 2020-07-30 08:21
Here is the full declaration of an inline foreign key constraint (referencing a primary key column on a table called PARENT):

, parent_id number(12,0) constraint chd_par_fk references parent(parent_id)
But what is the fewest number of words required to implement the same constraint? Two. This does exactly the same thing:

, parent_id references parent
The neat thing about this minimalist declaration is the child column inherits the datatype of the referenced primary key column. Here's what it looks like (with an odd primary key declaration, just to prove the point):
SQL> create table parent1
2 (parent_id number(15,3) primary key)
3 /

Table PARENT1 created.

SQL> create table child1
2 ( id number(12,0) primary key
3 ,parent_id references parent1)
4 /

Table CHILD1 created.

SQL> desc child1
Name Null? Type
--------- -------- ------------
If we want to specify a name for the foreign key we need to include the constraint keyword:
SQL> create table parent2
2 (parent_id number(15,3) constraint par1_pk primary key)
3 /

Table PARENT2 created.

SQL> create table child2
2 ( id number(12,0) constraint chd2_pk primary key
3 ,parent_id constraint chd2_par2_fk references parent2)
4 /

Table CHILD2 created.

SQL> desc child2
Name Null? Type
--------- -------- ------------
This minimal declaration always references the parent table's primary key. Suppose we want to reference a unique key rather than the primary key. (I would regard this as a data model smell, but sometimes we need to do it.) To make this work we need merely explicitly reference the unique key column:

SQL> create table parent3
2 ( parent_id number(15,3) constraint par3_pk primary key
3 ,parent_ref varchar2(16) not null constraint par3_uk unique
4 )
5 /

Table PARENT3 created.

SQL> create table child3
2 ( id number(12,0) constraint chd3_pk primary key
3 ,parent_ref constraint chd3_par3_fk references parent3(parent_ref))
4 /

Table CHILD3 created.

SQL> desc child3
Name Null? Type
---------- -------- ------------
Hmmm, neat. What if we have a compound primary key? Well, that's another data model smell but it still works. Because we're constraining multiple columns we need to use a table level constraint and so the syntax becomes more verbose; we need to include the magic words foreign key:

SQL> create table parent4
2 ( parent_id number(15,3)
3 ,parent_ref varchar2(16)
4 ,constraint par4_pk primary key (parent_id, parent_ref)
5 )
6 /

Table PARENT4 created.

SQL> create table child4
2 ( id number(12,0) constraint chd4_pk primary key
3 ,parent_id
4 ,parent_ref
5 ,constraint chd4_par4_fk foreign key (parent_id, parent_ref) references parent4)
6 /

Table CHILD4 created.

SQL> desc child4
Name Null? Type
---------- -------- ------------
Okay, but supposing we change the declaration of the parent column, does Oracle ripple the change to the child table?
SQL> alter table parent4 modify parent_ref varchar2(24);

Table PARENT4 altered.

SQL> desc child4
Name Null? Type
---------- -------- ------------
Nope. And rightly so. This minimal syntax is a convenience when we're creating a table, but there's no object-style inheritance mechanism.

Generally I prefer a verbose declaration over minimalism, because clarity trumps concision. I appreciate the rigour of enforcing the same datatype on both ends of a foreign key constraint. However, I hope that in most cases our CREATE TABLE statements have been generated from a data modelling tool. So I think this syntactical brevity is a neat thing to know about, but of limited practical use.

UKOUG London Development and Middleware event - free!

Mon, 2018-09-03 02:56
The Oracle development landscape is an extremely broad and complicated one these days. It covers such a wide range of tools, technologies and practices it is hard to keep up.

The UKOUG is presenting a day of sessions which can bring you up to speed. It's a joint initiative between the Development and Middleware SIGs - a composite if you will - at the Oracle City Office on Thursday 6th September. This event is free. If you are a UKOUG member attending it won't count against your allotment of SIG delegates; if you're not a UKOUG member there's no charge so come along and get a taste of what the UKOUG has to offer.

The day covers a broad spectrum. Martin Beeby is a popular speaker; his talk covers how Oracle is embracing new cool technologies such as Blockchain, Docker and chatbots. There are talks from Oracle ACE Director Simon Haslam on mobile applications and Oracle ACE Director Mark Simpson on real-life uses for AI. There are also sessions on API design, building bots and JavaScript frameworks.

Even last year these things might have been considered cutting edge, certainly in the enterprise realm. But most organisations of whatever size are at least thinking about or running Proof of Concept projects in AI or blockchain. Some already have these technologies active in Production. These things will affect everybody working in IT, and probably sooner rather than later. It's always good to know what's coming.

Check out the full agenda here.
Register here.

Oh, and did I mention it's free? Treat yourself to a day out from the present and get a glimpse of the future.

The Single Responsibility principle

Thu, 2018-05-31 16:14
The Single Responsibility principle is the foundation of modular programming, and is probably the most important principle in the SOLID set. Many of the other principles flow from it.

It is quite simple: a program unit should do only one thing. A procedure should implement a single task; a package should gather together procedures which solve a set of related tasks. Consider the Oracle library package UTL_FILE. Its responsibility is quite clear: it is for working with external files. It implements all the operations necessary to work with OS files: opening them, closing them, reading and writing, etc. It defines a bespoke suite of exceptions too.

Each procedure in the package has a clear responsibility too. For instance, fclose() closes a single referenced file whereas fclose_all() closes all open files. Now, the package designers could have implemented that functionality as a single procedure, with different behaviours depending on whether the file parameter was populated or unpopulated. This might seem a simpler implementation, because it would be one fewer procedure. But the interface has actually become more complicated: essentially we have a flag parameter, which means we need to know a little bit more about the internal processing of fclose(). It would have made the package just a little bit harder to work with without saving any actual code.
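The trade-off between two focused procedures and one procedure with a flag can be sketched as follows. This is not how UTL_FILE is implemented: the package HANDLE_DEMO is entirely hypothetical, and it uses an in-memory collection as a stand-in for open file handles so that the example stays self-contained.

```sql
create or replace package handle_demo as
  -- hypothetical stand-in for a set of open file handles
  procedure open_handle (p_id in pls_integer);
  function open_count return pls_integer;

  -- two procedures, one responsibility each
  procedure close_handle (p_id in pls_integer);
  procedure close_all;

  -- the single-procedure alternative: NULL acts as a hidden flag
  -- meaning "close everything"
  procedure close_handles (p_id in pls_integer default null);
end handle_demo;
/

create or replace package body handle_demo as
  type t_handles is table of boolean index by pls_integer;
  g_open t_handles;

  procedure open_handle (p_id in pls_integer) is
  begin
    g_open(p_id) := true;
  end open_handle;

  function open_count return pls_integer is
  begin
    return g_open.count;
  end open_count;

  procedure close_handle (p_id in pls_integer) is
  begin
    g_open.delete(p_id);
  end close_handle;

  procedure close_all is
  begin
    g_open.delete;
  end close_all;

  procedure close_handles (p_id in pls_integer default null) is
  begin
    -- the caller has to know about this branch to predict what a call will do
    if p_id is null then
      close_all;
    else
      close_handle(p_id);
    end if;
  end close_handles;
end handle_demo;
/
```

With the flag version, a caller who passes an uninitialised variable to close_handles() closes every handle by accident; with close_handle() and close_all() the intent is stated in the name of the procedure being called.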

Of course, it's pretty easy to define the Single Responsibility of a low level feature like file handling. We might think there are some superficial similarities with displaying information to the screen but it's fairly obvious that these are unrelated and so we need two packages, UTL_FILE and DBMS_OUTPUT. When it comes to our own code, especially higher level packages, it can be harder to define the boundaries. At the broadest level we can define domains - SALES, HR, etc. But we need more than one package per domain: how do we decide the responsibilities of individual packages?

Robert C Martin defines the Single Responsibility principle as: "A class should have only one reason to change." Reasons for change can be many and various. In database applications dependence on tables is a primary one. So procedures which work on a common set of tables may well belong together. But there are at least two sets of privileges for data: reading and manipulating. So it's likely we will need a package which gathers together reporting type queries which can be granted to read-only users and a package which executes DML statements which can be granted to more privileged users. Maybe our domain requires special processing, such as handling sensitive data; procedures for implementing that business logic will belong in separate packages.

Single responsibility becomes a matrix, with dependencies along one axis and audience of users along another.
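As a rough sketch of that matrix, assuming the familiar EMP table: two package specifications over the same table, split by audience. The package names, the procedure signatures and the roles HR_READER and HR_CLERK are all hypothetical.

```sql
-- Hypothetical read-only package: reporting queries over EMP.
create or replace package emp_reports as
    function get_headcount   (p_deptno in emp.deptno%type) return pls_integer;
    function get_salary_bill (p_deptno in emp.deptno%type) return number;
end emp_reports;
/
-- Hypothetical DML package: same table dependency, more privileged audience.
create or replace package emp_maintenance as
    procedure hire_employee
        ( p_ename  in emp.ename%type
        , p_job    in emp.job%type
        , p_deptno in emp.deptno%type
        , p_sal    in emp.sal%type );
    procedure award_pay_rise
        ( p_empno in emp.empno%type
        , p_pct   in number );
end emp_maintenance;
/
-- Assumed roles: readers get the queries only, clerks get both packages.
grant execute on emp_reports     to hr_reader;
grant execute on emp_reports     to hr_clerk;
grant execute on emp_maintenance to hr_clerk;
```

Both packages change when EMP changes, but only one of them changes when the rules for a pay rise change.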

The advantages of Single Responsibility should be obvious. It allows us to define a cohesive package, collecting together all the related functionality which makes it easy for others to reuse it. It also allows us to define private routines in a package body, which reduces the amount of code we have to maintain while giving us a mechanism for preventing other developers from using it. Restricting the features to a single responsibility means unrelated functions are not coupled together. This gives a better granularity for granting the least privileges necessary to users of our code.

Part of the Designing PL/SQL Programs series

UKOUG Northern Technology Summit 2018

Sun, 2018-04-15 02:55
The UKOUG has run something called the Northern Server Day for several years. Northern because they were held in a northern part of England (but south of Scotland) and Server because the focus was the database server. Over the last couple of years the day has had several streams, covering Database, High Availability and Engineered Systems. So primarily a day for DBAs and their ilk.

This year the event has expanded to let in the developers. Yay!

The Northern Technology Summit 2018 is effectively a mini-conference: in total there are five streams - Database, RAC Cloud Infrastructure & Availability, Systems, APEX and Development. But for registration it counts as a SIG. So it's free for UKOUG members to attend. What astonishingly good value! [1] And it doesn't affect your entitlement to attend the annual conference in December.

The Development stream

The Development stream covers a broad range of topics. Application development in 2018 is a tangled hedge with new technologies like Cloud, AI and NoSQL expanding the ecosystem but not displacing the traditional database and practices. The Development stream presents a mix of sessions from the new-fangled and Old Skool ends of the spectrum.

  • The New Frontier: Mobile and AI Powered Conversational Apps. Oracle are doing interesting work with AI and Grant Ronald is king of the chatbots. This is an opportunity to find out what modern day apps can do.
  • Building a Real-Time Streaming Platform with Oracle, Apache Kafka, and KSQL. No single technology is a good fit for all enterprise problems. Robin Moffatt of Confluent will explain how we can use Apache Kafka to handle event-based data processing.
  • Modernising Oracle Forms Applications with Oracle Jet. Oracle Forms was - still is - a highly-productive tool for building OLTP front-ends. There are countless organisations still running Forms applications. But perhaps the UX looks a little jaded in 2018. So here's Mark Waite from Griffiths Waite to show how we can use Oracle's JET JavaScript library to write new UIs without having to re-code the whole Forms application.
  • 18(ish) Things Developers Will Love about Oracle Database 18c. Oracle's jump to year-based release numbers doesn't make life easier for presenters: ten things about 10c was hard enough. But game for a challenge, Oracle's Chris Saxon attempts to squeeze as many new features as possible into his talk.
  • Modernize Your Development Experience With Oracle Cloud. Cloud isn't just something for the sysadmins: there's a cloud for developers too. Sai Janakiram Penumuru from DXC Technology will explain how Oracle Developer Cloud might revolutionise your development practices.
  • Designing for Deployment. As Joel Spolsky says, shipping is a feature. But it's a feature which is hard to retrofit. In this talk I will discuss some design principles which make it easier to build, deploy and ship database applications.

Everything else

So I hope the Development stream offers a day of varied and useful ideas. There are things you might be able to use right now or in the next couple of months, and things which might shape what you'll be doing next year. But it doesn't matter if not everything floats your boat. The cool thing about the day is that delegates can attend any of the streams. [2]

So you can listen to Nigel Bayliss talking about Optimisation in the Database stream, Vipul Sharma talking about DevOps in the Availability stream, Anthony talking about Kubernetes in the Systems stream and John Scott talking about using Docker with Oracle in the Apex stream. There are sessions on infrastructure as code, upgrading Oracle 12cR1 to 12cR2, GDPR (the new EU data protection law), the Apex Interactive grid, Apache Impala, and Cloud, lots of Cloud. Oh my!

The full agenda is here (pdf).

Register now

So if you're working with Oracle technology and you want to attend this is what you need to know:
  • Date: 16th May 2018
  • Location: Park Plaza Hotel, Leeds
  • Cost: Free for UKOUG members. There is a fee for non-members but frankly you might as well buy a bronze membership package and get a whole year's worth of access to UKOUG events (including the annual conference). It's a bargain.
1. The exact number of SIG passes depends on the membership package you have.
2. The registration process requires you to pick a stream but that is just for administrative purposes. It's not a lock-in.

Data Access Layer vs Table APIs

Sun, 2017-12-31 11:59
One of the underlying benefits of PL/SQL APIs is the enabling of data governance. Table owners can shield their tables behind a layer of PL/SQL. Other users have no access to the tables directly but only through stored procedures. This confers many benefits:
  • Calling programs code against a programmatic interface. This frees the table owner to change the table's structure whenever it's necessary without affecting its consumers.
  • Likewise the calling programs get access to the data they need without having to know the details of the table structure, such as technical keys.
  • The table owner can use code to enforce complicated business rules when data is changed.
  • The table owner can enforce sophisticated data access policies (especially for applications using Standard Edition without DBMS_RLS).
So naturally the question arises, is this the same as Table APIs?

Table APIs used to be a popular approach to encapsulating tables. The typical Table API comprised two packages per table; one package provided methods for inserting, updating and deleting records, and the other package provided query methods. The big attraction of Table APIs was that they could be 100% generated from the data dictionary - both Oracle Designer and Steven Feuerstein's QNXO library provided TAPI generators. And they felt like good practice because, y'know, access to the tables was shielded by a PL/SQL layer.

But there are several problems with Table APIs.

The first is that they entrench row-by-agonising-row processing. Table APIs have their roots in early versions of Oracle so the DML methods only worked with a single record. Even after Oracle 8 introduced PL/SQL collection types, TAPI code in the wild tended to be RBAR: there seems to be something in the brain of the average programmer which predisposes them to prefer loops executing procedural code rather than set operations.

The second is that they prevent SQL joins. Individual records have to be selected from one table to provide keys for looking up records in a second table. Quite often this leads to loops within loops. So-called PL/SQL joins prevent the optimizer from choosing good access paths when handling larger amounts of data.
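A minimal sketch of the difference, using the EMP and DEPT tables. A real Table API would hide the two SELECT statements of the first block behind package calls; they are written inline here only to keep the example short. Output needs SERVEROUTPUT switched on.

```sql
-- The "PL/SQL join": a loop within a loop, one EMP lookup per DEPT row.
-- Shown as the shape to avoid.
begin
    for d in (select deptno, dname from dept) loop
        for e in (select ename, sal from emp where emp.deptno = d.deptno) loop
            dbms_output.put_line(d.dname || ': ' || e.ename || ' ' || e.sal);
        end loop;
    end loop;
end;
/
-- The SQL join: one statement, so the optimizer is free to choose
-- the join method and access paths for the data volumes involved.
begin
    for r in ( select dept.dname, emp.ename, emp.sal
               from   emp
                      inner join dept on dept.deptno = emp.deptno ) loop
        dbms_output.put_line(r.dname || ': ' || r.ename || ' ' || r.sal);
    end loop;
end;
/
```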

The third issue is that it is pretty hard to generate methods for all conceivable access paths. Consequently the generated packages had a few standard access paths (primary key, indexed columns) and provided a dynamic SQL method which accepted a free text WHERE clause. Besides opening the package to SQL injection this also broke the Law of Demeter: in order to pass a dynamic WHERE clause the calling program needed to know the structure of the underlying table, which defeats the whole objective of encapsulation.

Which leads on to the fourth, more philosophical problem with Table APIs: there is minimal abstraction. Each package is generated so it fits very closely to the structure of the Table. If the table structure changes we have to regenerate the TAPI packages: the fact that this can be done automatically is scant recompense for the tight coupling between the Table and the API.

So although Table APIs could be mistaken for good practice, in actuality they provide no real benefit. The interface is 1:1 with the table structure so it has no advantage over granting privileges on the table. Combine that with the impact of RBAR processing and PL/SQL joins on performance and the net effect of Table APIs is disastrous.

We cannot generate good Data Access APIs: we need to write them. This is because the APIs should be built around business functions rather than tables. The API packages granted to other users should comprise procedures for executing transactions. A Unit Of Work is likely to touch more than one table. These have to be written by domain experts who understand the data model and the business rules.
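By way of illustration, here is a sketch of an API built around a business function rather than a table. Everything named here is hypothetical: the STAFF_TRANSFERS package, the TRANSFER_EMPLOYEE procedure and the EMP_TRANSFER_LOG table are assumptions made for the example, on top of the usual EMP and DEPT tables.

```sql
create or replace package staff_transfers as
    -- One business function: move an employee to another department.
    procedure transfer_employee
        ( p_empno      in emp.empno%type
        , p_new_deptno in dept.deptno%type );
end staff_transfers;
/
create or replace package body staff_transfers as
    procedure transfer_employee
        ( p_empno      in emp.empno%type
        , p_new_deptno in dept.deptno%type )
    is
        l_old_deptno emp.deptno%type;
    begin
        -- Lock the row and remember where the employee is moving from.
        select deptno into l_old_deptno
        from   emp
        where  empno = p_empno
        for update;

        update emp
        set    deptno = p_new_deptno
        where  empno = p_empno;

        -- Assumed audit table; the unit of work touches more than one table.
        insert into emp_transfer_log (empno, old_deptno, new_deptno, transferred_on)
        values (p_empno, l_old_deptno, p_new_deptno, sysdate);
        -- No COMMIT here: in this sketch the caller decides when the transaction ends.
    end transfer_employee;
end staff_transfers;
/
```

The caller needs to know an employee number and a department number, nothing about how the tables are laid out.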

Part of the Designing PL/SQL Programs series

On hitting 100K on StackOverflow

Fri, 2017-12-29 12:14
100,000 is just another number. It's one more than 99,999. And yet, and yet. We live in a decimal culture. We love to see those zeroes roll up. Order of magnitude baby! It's the excitement of being a child, going on a journey in the family car when the odometer reads 99994, knowing you'll see 100000. Of course everybody got distracted by the journey and next time you look at the dial it reads 100002.

Earlier this year my StackOverflow reputation passed 100,000. Like the car journey I missed the actual moment. My rep had been 99,986 when I last checked the previous evening and 100,011 the next day. Hey ho.

Reputation is a big deal on StackOverflow because it is the prime measure of contribution. As a Q&A site (not a forum - that confuses a lot of people) it needs content, it needs good questions and good answers. Reputation points are the reward for good posts. In this context good is determined democratically: people vote up good questions and good answers, and - crucially - vote down poor questions and answers. Votes are the main way of gaining reputation points: +5 for an upvoted question, +10 for an upvoted answer and +15 for an accepted answer. (There are other ways of gaining - and losing - rep but posting is the main one.)

"Reputation is a rough measurement of how much the community trusts you; it is earned by convincing your peers that you know what you’re talking about." - Meta Stack Exchange FAQ

So is reputation just a way of keeping score? Nope: it is gamification but there is more to it than that. Reputation means points and what do points make? Not prizes: privileges. StackOverflow is largely a self-policing community. There are full-on (elected) moderators but most moderation is actually carried out by regular SO users with sufficient rep. Somebody has asked an unclear question: once you have 50 rep you can post a comment asking for clarification. Got a user who doesn't know how to turn off the CAPSLOCK key? With 2000 rep you can just edit their post and apply sentence case. And so on.

Hmmm, so StackOverflow rewards its keenest contributors by allowing them to do chores around the site. Yes and it works. One of the big problems with forums is other users. Not griefers as such but there are a lot of low-level irritations: users who don't know how to search the site, or how to format their posts, or just generally fail to understand etiquette. Granting increasing moderation privileges at reputation milestones allows committed users to smooth away some of those irritations.

But still, getting to 100,000 took eight years and almost 3000 answers. Was it worth it? Well, there are no prizes but when you get to 100,000 you do get swag. A big box of swag:

Here is the box with a standard reference pear so you can see just how big it is.

Inside there is - a pen ....

Some stickers ....

A StackOverflow T-shirt (I have negotiated with my better half to keep this one) ...

And an over-sized coffee mug...

One more thing. There are also badges. Badges are nudges to encourage desirable behaviour such as editing posts, voting in moderator elections, reviewing posts, offering bounties, being awesome. Because let's face it, badges are cool. More badges = more flair. And who doesn't want more flair?

Got flair? Heck yeah!

profile for APC at Stack Overflow, Q&A for professional and enthusiast programmers

Avoiding Coincidental Cohesion

Wed, 2017-05-31 17:10
Given that Coincidental Cohesion is bad for our code base, obviously we want to avoid writing utilities packages. Fortunately it is mostly quite easy to do so. It requires vigilance on our part. Utilities packages are rarely planned. More often we are writing a piece of business functionality when we find ourselves in need of some low level functionality. It doesn't fit in the application package we're working on, perhaps we suspect that it might be more generally useful, so we need somewhere to put it.

The important thing is to recognise and resist the temptation of the Utilities package. The name itself (and similarly vague synonyms like helper or utils) should be a red flag. When we find ourselves about to type create or replace package utilities we need to stop and think: what would be a better name for this package? Consider whether there are related functions we might end up needing. Suppose we're about to write a function to convert a date into a Unix epoch string. It doesn't take much imagination to think we might need a similar function to convert a Unix timestamp into a date. We don't need to write that function now but let's start a package dedicated to Time functions instead of a miscellaneous utils package.
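A minimal sketch of what that dedicated package might look like. The package name TIME_FUNCTIONS and both function names are assumptions, the epoch is returned as a number of seconds rather than a string, and the date is simply treated as UTC with no time zone handling. The reverse function is included only to show where it would live once we need it.

```sql
create or replace package time_functions as
    -- Both functions treat the DATE as UTC; no time zone conversion is attempted.
    function date_to_epoch (p_date  in date)   return number;
    function epoch_to_date (p_epoch in number) return date;
end time_functions;
/
create or replace package body time_functions as
    c_epoch_start  constant date   := date '1970-01-01';
    c_secs_per_day constant number := 86400;

    function date_to_epoch (p_date in date) return number is
    begin
        -- DATE arithmetic yields days; convert to whole seconds.
        return round((p_date - c_epoch_start) * c_secs_per_day);
    end date_to_epoch;

    function epoch_to_date (p_epoch in number) return date is
    begin
        return c_epoch_start + (p_epoch / c_secs_per_day);
    end epoch_to_date;
end time_functions;
/
```

The name tells the next developer what belongs here and, just as usefully, what does not.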

Looking closely at the programs which comprise the DBMS_UTILITY package it is obviously unfair to describe them as a random selection. In fact there are seven or eight groups of related procedures.

DB Info

  • DBLINK_ARRAY Table Type
  • DB_VERSION Procedure
  • PORT_STRING Function
Runtime Messages
Object Management
  • COMMA_TO_TABLE Procedures
  • COMPILE_SCHEMA Procedure
  • INVALIDATE Procedure
  • TABLE_TO_COMMA Procedures
  • VALIDATE Procedure
Object Info (Object Management?)
  • LNAME_ARRAY Table Type
  • NAME_ARRAY Table Type
  • NUMBER_ARRAY Table Type
  • UNCL_ARRAY Table Type
  • CANONICALIZE Procedure
  • GET_DEPENDENCY Procedure
  • NAME_RESOLVE Procedure
  • NAME_TOKENIZE Procedure
Session Info
SQL Manipulation
  • EXPAND_SQL_TEXT Procedure
  • GET_SQL_HASH Function
Statistics (deprecated)
  • ANALYZE_SCHEMA Procedure
  • GET_CPU_TIME Function
  • GET_TIME Function
  • GET_HASH_VALUE Function
  • IS_BIT_SET Function

We can see an alternative PL/SQL code suite, with several highly cohesive packages. But there will be some procedures which are genuinely unrelated to anything else. The four procedures in the Unclassified section above are examples. But writing a miscellaneous utils package for these programs is still wrong. There are better options.

  1. Find a home. It's worth considering whether we already have a package which would fit the new function. Perhaps WAIT_ON_PENDING_DML() should have gone in DBMS_TRANSACTION; perhaps IS_BIT_SET() properly belongs in UTL_RAW.
  2. A package of their own. Why not? It may seem extravagant to have a package with a single procedure but consider DBMS_DG with its lone procedure INITIATE_FS_FAILOVER(). The package delivers the usual architectural benefits plus it provides a natural home for related procedures we might discover a need for in the future.
  3. Standalone procedure. Again, why not? We are so conditioned to think of a PL/SQL program as a package that we forget it can be just a Procedure or Function. Some programs are suited to standalone implementation.

So avoiding the Utilities package requires vigilance. Code reviews can help here. Preventing the Utilities package becoming entrenched is crucial: once we have a number of packages dependent on a Utilities package it is pretty hard to get rid of it. And once it becomes a fixture in the code base developers will consider it more acceptable to add procedures to it.

Part of the Designing PL/SQL Programs series

Utilities - the Coincidental Cohesion anti-pattern

Wed, 2017-05-31 15:46
One way to understand the importance of cohesion is to examine an example of a non-cohesive package, one exhibiting a random level of cohesion. The poster child for Coincidental Cohesion is the utility or helper package. Most applications will have one or more of these, and Oracle's PL/SQL library is no exception. DBMS_UTILITY has 37 distinct procedures and functions (i.e. not counting overloaded signatures) in 11gR2 and 38 in 12cR1 (and R2). Does DBMS_UTILITY deliver any of the benefits the PL/SQL Reference says packages deliver?

Easier Application Design?

One of the characteristics of utilities packages is that they aren't designed in advance. They are the place where functionality ends up because there is no apparently better place for it. Utilities occur when we are working on some other piece of application code; we discover a gap in the available functionality such as hashing a string. When this happens we generally need the functionality now: there's little benefit to deferring the implementation until later. So we write a GET_HASH_VALUE() function, stick it in our utilities package and proceed with the task at hand.

The benefit of this approach is we keep our focus on the main job, delivering business functionality. The problem is, we never go back and re-evaluate the utilities. Indeed, now there is business functionality which depends on them: refactoring utilities introduces risk. Thus the size of the utilities package slowly increases, one tactical implementation at a time.

Hidden Implementation Details?

Another characteristic of utility functions is that they tend not to share concrete implementations. Often a utilities package beyond a certain size will have groups of procedures with related functionality. It seems probable that DBMS_UTILITY.ANALYZE_DATABASE(), DBMS_UTILITY.ANALYZE_PART_OBJECT() and DBMS_UTILITY.ANALYZE_SCHEMA() share some code. So there are benefits to co-locating them in the same package. But it is unlikely that CANONICALIZE(), CREATE_ALTER_TYPE_ERROR_TABLE() and GET_CPU_TIME() have much code in common.

Added Functionality?

Utility functions are rarely part of a specific business process. They are usually called on a one-off basis rather than being chained together. So there is no state to be maintained across different function calls.

Better Performance?

For the same reason there is no performance benefit from a utilities package. Quite the opposite. When there is no relationship between the functions we cannot make predictions about usage. We are not likely to call EXPAND_SQL_TEXT() right after calling PORT_STRING(). So there is no benefit in loading the former into memory when we call the latter. In fact the performance of EXPAND_SQL_TEXT() is impaired because we have to load the whole DBMS_UTILITY package into the shared pool, plus it uses up a larger chunk of memory until it gets aged out. Although to be fair, in these days of abundant RAM, some unused code in the library cache need not be our greatest concern. But whichever way we bounce it, it's not a boon.

Grants?

Privileges on utility packages are a neutral concern. Often utilities won't be used outside the owning schema. In cases where we do need to make them more widely available we're probably granting access on some procedures that the grantee will never use.

Modularity?

From an architectural perspective, modularity is the prime benefit of cohesion. A well-designed library should be frictionless and painless to navigate. The problem with random assemblages like DBMS_UTILITY is that it's not obvious what functions they may contain. Sometimes we write a piece of code we didn't need to.

The costs of utility packages

Perhaps your PL/SQL code base has a procedure like this:

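create or replace procedure run_ddl
    ( p_stmt in varchar2)
is
    pragma autonomous_transaction;
    v_cursor number := dbms_sql.open_cursor;
    n pls_integer;
begin
    -- for DDL the statement is executed when it is parsed
    dbms_sql.parse(v_cursor, p_stmt, dbms_sql.native);
    n := dbms_sql.execute(v_cursor);
    dbms_sql.close_cursor(v_cursor);
exception
    when others then
        -- tidy up the cursor, then let the caller see the error
        if dbms_sql.is_open(v_cursor) then
            dbms_sql.close_cursor(v_cursor);
        end if;
        raise;
end run_ddl;
/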

It is a nice piece of code for executing DDL statements. The autonomous_transaction pragma prevents the execution of arbitrary DML statements (by throwing ORA-06519), so it's quite safe. The only problem is, it re-implements DBMS_UTILITY.EXEC_DDL_STATEMENT().
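For comparison, a minimal sketch of calling the built-in directly. The table name DDL_DEMO is made up for illustration, and the session is assumed to have the privilege to create tables.

```sql
begin
    -- DDL_DEMO is a purely illustrative table name.
    dbms_utility.exec_ddl_statement('create table ddl_demo (id number)');
end;
/
```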

Code duplication like this is a common side effect of utility packages. Discovery is hard because their program units are clumped together accidentally. Nobody sets out to deliberately re-write DBMS_UTILITY.EXEC_DDL_STATEMENT(): it happens because not enough people know to look in that package before they start coding a helper function. Redundant code is a nasty cost of Coincidental Cohesion. Besides the initial wasted effort of writing an unnecessary program there are the incurred costs of maintaining it and testing it, and the risk of introducing bugs or security holes. Plus each additional duplicated program makes our code base a little harder to navigate.

Fortunately there are tactics for avoiding or dealing with this. Find out more.

Part of the Designing PL/SQL Programs series

UKOUG Tech 2016 - Super Sunday

Mon, 2016-12-05 07:48
UKOUG 2016 is underway. This year I'm staying at the Jury's Inn hotel, one of a clutch of hotels within a stone's throw of the ICC. Proximity is the greatest luxury. My room is on the thirteenth floor, so I have a great view across Birmingham; a view, which in the words of Telly Savalas "almost takes your breath away".

Although the conference proper - with keynotes, exhibition hall and so on - opens today, Monday, the pre-conference Super Sunday has already delivered some cracking talks. For the second year on the trot we have had a stream devoted to database development, which is great for Old Skool developers like me.

Fighting Bad PL/SQL, Philipp Salvisberg

The first talk in the stream discussed various metrics for assessing the quality of PL/SQL code: McCabe Cyclomatic Complexity, Halstead Volume, Maintainability Index. Cyclomatic Complexity evaluates the number of paths through a piece of code; the more paths the harder it is to understand what the code does under any given circumstance. The volume approach assesses information density (the number of distinct words/total number of words); a higher number means more concepts, and so more to understand. The Maintainability Index takes both measures and throws in some extra calculations based on LoC and comments.

All these measures are interesting, and often insightful, but none are wholly satisfactory. Philipp showed how easy it is to game the MI by putting all the code of a function on a single line: the idea that such a layout makes our code more maintainable is laughable. More worryingly, none of these measures evaluate what the code actually does. The presented example of better PL/SQL (according to the MI measure) replaced several lines of PL/SQL with a single REGEXP_LIKE call. Regular expressions are notorious for getting complicated and hard to maintain. Also there are performance considerations. Metrics won't replace wise human judgement just yet. In the end I agree with Philipp that the most useful metric remains WTFs per minute.

REST enabling Oracle tables with Oracle REST Data Services, Jeff Smith

It was standing room only for That Jeff Smith, who coped well with jetlag and sleep deprivation. ORDS is the new name for the APEX listener, a misleading name because it is used for more than just APEX calls, and APEX doesn't need it. ORDS is a Java application which brokers JSON calls between a web client and the database: going one way it converts JSON payload into SQL statements, going the other way it converts result sets into JSON messages. Apparently Oracle is going to REST enable the entire database - Jeff showed us the set of REST commands for managing DataGuard. ORDS is the backbone of Oracle Cloud.

Most of the talk centred on Oracle's capabilities for auto-enabling REST access to tables (and PL/SQL with the next release of ORDS). This is quite impressive and certainly I can see the appeal of standing up a REST web service to the database without all the tedious pfaffing in Hibernate or whatever Java framework is in place. However I think auto-enabling is the wrong approach. REST calls are stateless and cannot be assembled to form transactions; basically each one auto-commits. It's Table APIs all over again. TAPI 2.0, if you will. It's a recipe for bad applications.

But I definitely like this vision of the future: an MVC implementation with JavaScript clients (V) passing JSON payloads to ORDS (C) with PL/SQL APIs doing all the business logic (M). The nineties revival starts here.

Meet your match: advanced row pattern matching, Stew Ashton

Stew's talk was one of those ones which are hard to pull off: Oracle 12c's MATCH_RECOGNIZE clause is a topic more suited to an article with a database on hand so we can work through the examples. Stew succeeded in making it work as a talk because he's a good speaker with a nice style and a knack for lucid explanation. He made a very good case for the importance of understanding this arcane new syntax.

MATCH_RECOGNIZE is lifted from event processing. It allows us to define arbitrary sets of data which we can iterate over in a SELECT statement. This allows us to solve several classes of problems relating to bin fitting, positive and negative sequencing, and hierarchical summaries. The most impressive example showed how to code an inequality (i.e. range) join that performs as well as an equality join. I will certainly be downloading this presentation and learning the syntax when I get back home.
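To give a flavour of the syntax, here is a small sketch of my own rather than one of Stew's examples. It is intended to collapse runs of consecutive integers into ranges; the inline data set is invented for the purpose.

```sql
with t (n) as (
    select 1  from dual union all
    select 2  from dual union all
    select 3  from dual union all
    select 7  from dual union all
    select 8  from dual union all
    select 12 from dual
)
select first_n, last_n
from   t
match_recognize (
    order by n
    measures first(n) as first_n   -- first value in the matched run
           , last(n)  as last_n    -- last value in the matched run
    one row per match
    pattern ( a b* )               -- any row, then zero or more followers
    define b as n = prev(n) + 1    -- a follower is one more than the previous row
)
order by first_n;
```

The PATTERN is a regular expression over rows, and DEFINE says what each pattern variable means; A is left undefined so it matches any row.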

If only Stew had done a talk on the MODEL clause several years ago.

SQL for change history with Temporal Validity and Flash Back Data Archive, Chris Saxon

Chris Saxon tackled the tricky concept of time travel in the database, as a mechanism for handling change. The first type of change is change in transactional data. For instance, when a customer moves house we need to retain a record of their former address as well as their new one. We've all implemented history like this, with START_DATE and END_DATE columns. The snag has always been how to formulate the query to establish which record applies at a given point in time. Oracle 12c solves this with Temporal Validity, a syntax for defining a PERIOD using those start and end dates. Then we can query the history using a simple AS OF PERIOD clause. It doesn't solve all the problems in this area (primary keys remain tricky) but at least the queries are solved.
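A minimal sketch of the Temporal Validity syntax, assuming Oracle 12c. The table CUSTOMER_ADDRESSES, its columns and the period name ADDRESS_VALID are invented for the example.

```sql
-- Hypothetical table: the PERIOD ties the two date columns together.
create table customer_addresses
( customer_id number        not null
, address     varchar2(200) not null
, start_date  date          not null
, end_date    date                     -- left NULL for the current address
, period for address_valid (start_date, end_date)
);

-- Which address applied to customer 42 on 1st June 2016?
select customer_id, address
from   customer_addresses
       as of period for address_valid date '2016-06-01'
where  customer_id = 42;
```

The query no longer spells out the START_DATE and END_DATE comparisons itself; the period definition carries that logic.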

The other type of change is change in metadata: when was a particular change applied? what are all the states of a record over the last year? etc. These are familiar auditing requirements, which are usually addressed through triggers and journalling tables. That approach carries an ongoing burden of maintenance and is too easy to get wrong. Oracle has had a built-in solution for several years now, Flashback Data Archive. Not enough people use it, probably because in 11g it was called Total Recall and a chargeable extra. In 12C Flashback Data Archive is free; shorn of the data optimization (which requires the Advanced Compression package) it is available in Standard Edition not just Enterprise. And it's been back-ported to The syntax is simple: to get a historical version of the data we simply use AS OF TIMESTAMP. No separate query for a journalling table, no more nasty triggers to maintain... I honestly don't know why everybody isn't using it.
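A short sketch of what that looks like in practice, using the EMP table. The archive name FDA_ONE_YEAR is an assumption: it stands for a flashback archive which a DBA has already created with a suitable retention period.

```sql
-- Assumption: a flashback archive called FDA_ONE_YEAR already exists.
alter table emp flashback archive fda_one_year;

-- The record as it stood a week ago.
-- (Only works for points in time after tracking was enabled.)
select empno, ename, sal
from   emp as of timestamp (systimestamp - interval '7' day)
where  empno = 7839;

-- Every state of the record over the last week.
select empno, sal, versions_starttime, versions_endtime
from   emp versions between timestamp (systimestamp - interval '7' day)
                        and systimestamp
where  empno = 7839;
```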

So that was Super Sunday. Roll on Not-So-Mundane Monday.

UKOUG Conference 2016 coming up fast

Thu, 2016-11-24 02:19
The weather has turned cold, the lights are twinkling in windows and Starbucks is selling pumpkin lattes. Yes, it's starting to look a lot like Christmas. But first there's the wonder-filled advent calendar that is the UKOUG Annual Conference in Birmingham, UK. So many doors to choose from!

The Conference is the premier event for Oracle users in the UK (and beyond). This year has another cracker of an agenda: check it out.

The session I'm anticipating most is Monday's double header with Bryn Llewellyn and Toon Koppelaars, A Real-World Comparison of the NoPLSQL & Thick Database Paradigms. Will they come down on the side of implementing business logic in stored procedures or won't they? It'll be tense :) But it will definitely be insightful and elegantly argued.

Oracle's bailiwick has expanded vastly over the years, and it's become increasingly hard to cover everything. Even so, it's fair to say in recent years older technologies such as Forms have been neglected in favour of shinier baubles. Not this year. There's a good representation of Forms sessions this year, including a talk from Michael Ferrante, the Forms Product Manager. These sessions are all scheduled for the Wednesday, in a day targeted at database developers. If you're an Old Skool developer, especially if you're a Forms developer, and your boss will allow you only one day at the conference, then Wednesday is the day to pick.

Hope to see you there

Designing PL/SQL Programs: Series home page

Wed, 2016-04-20 01:57
Designing PL/SQL Programs is a succession of articles published in a nonlinear fashion. Eventually it will evolve into a coherent series. In the meantime this page serves as a map and navigation aid. I will add articles to it as and when I publish them.
Introduction
  • Designing PL/SQL Programs
  • It's all about the interface
Principles and Patterns
  • Introducing the SOLID principles
  • Introducing the RCCASS principles
  • Three more principles
  • The Dependency Inversion Principle: a practical example
  • Working with the Interface Segregation Principle
Software Architecture
  • The importance of cohesion
Interface design
Tools and Techniques

The importance of cohesion

Wed, 2016-04-20 01:56
"Come on, come on, let's stick together" - Bryan Ferry

There's more to PL/SQL programs than packages, but most of our code will live in packages. The PL/SQL Reference offers the following benefits of organising our code into packages:

Modularity - we encapsulate logically related components into an easy to understand structure.

Easier Application Design - we can start with the interface in the package specification and code the implementation later.

Hidden Implementation Details - the package body is private so we can prevent application users having direct access to certain functionality.

Added Functionality - we can share the state of Package public variables and cursors for the life of a session.

Better Performance - Oracle Database loads the whole package into memory the first time you invoke a package subprogram, which makes subsequent invocations of any other subprogram quicker. Also packages prevent cascading dependencies and unnecessary recompilation.

Grants - we can grant permission on a single package instead of a whole bunch of objects.

However, we can only realise these benefits if the packaged components belong together: in other words, if our package is cohesive.  

The ever reliable Wikipedia defines cohesion like this: "the degree to which the elements of a module belong together"; in other words, it's a measure of the strength of the relationship between components. It's common to think of cohesion as a binary state - either a package is cohesive or it isn't - but actually it's a spectrum. (Perhaps computer science should use "cohesiveness", which is more expressive, but cohesion it is.)

Cohesion

Cohesion owes its origin as a Comp Sci term to Stevens, Myers, and Constantine. Back in the Seventies they used the terms "module" and "processing elements", but we're discussing PL/SQL so let's use Package and Procedure instead. They defined seven levels of cohesion, with each level being better - more usefully cohesive - than its predecessor.

Coincidental

The package comprises an arbitrary selection of procedures and functions which are not related in any way. This obviously seems like a daft thing to do, but most packages with "Utility" in their name fall into this category.

Logical

The package contains procedures which all belong to the same logical class of functions. For instance, we might have a package to collect all the procedures which act as endpoints for REST Data Services.

Temporal

The package consists of procedures which are executed at the same system event. So we might have a package of procedures executed when a user logs on - authentication, auditing, session initialisation - and a similar package for tidying up when the user logs off. Other than the triggering event the packaged functions are unrelated to each other.

Procedural

The package consists of procedures which are executed as part of the same business event. For instance, in an auction application there is a set of actions to follow whenever a bid is made: compare to asking price, evaluate against existing maximum bid, update lot's status, update bidder's history, send an email to the bidder, send an email to the user who's been outbid, etc.

Communicational

The package contains procedures which share common inputs or outputs. For example a payroll package may have procedures to calculate base salary, overtime, sick pay, commission, bonuses and produce the overall remuneration for an employee.

Sequential

The package comprises procedures which are executed as a chain, so that the output of one procedure becomes the input for another procedure. A classic example of this is an ETL package with procedures for loading data into a staging area, validating and transforming the data, and then loading records into the target table(s).

Functional

The package comprises procedures which are focused on a single task. Not only are all the procedures strongly related to each other but they are fitted to user roles too. So procedures for power users are in a separate package from procedures for normal users. The Oracle built-in packages for Advanced Queuing are a good model of Functional cohesion.

How cohesive is cohesive enough?

The grades of cohesion, with Coincidental as the worst and Functional as the best, are guidelines. Not every package needs to have Functional cohesion. In a software architecture we will have modules at different levels. The higher modules will tend to be composed of calls to lower level modules. The low level modules are the concrete implementations and they should aspire to Sequential or Functional cohesion.

The higher level modules can be organised to other levels. For instance we might want to build packages around user roles - Sales, Production, HR, IT - because Procedural cohesion makes it easier for the UI teams to develop screens, especially if they need to skin them for various different technologies (desktop, web, mobile). Likewise we wouldn't want to have Temporally cohesive packages with concrete code for managing user logon or logoff. But there is a value in organising a package which bundles up all the low level calls into a single abstract call for use in schema level AFTER LOGON triggers.    

Cohesion is not an easily evaluated condition. We need cohesion with a purpose, a reason to stick those procedures together. It's not enough to say "this package is cohesive". We must take into consideration how cohesive the package needs to be: how will it be used? what are its relationships with the other packages?

Applying design principles such as Single Responsibility, Common Reuse, Common Closure and Interface Segregation can help us to build cohesive packages. Getting the balance right requires an understanding of the purpose of the package and its place within the overall software architecture.  

Part of the Designing PL/SQL Programs series

Three more principles

Sun, 2016-04-03 13:00
Here are some more principles which can help us design better programs. These principles aren't part of an organized theory, and they aren't particularly related to any programming paradigm. But each is part of the canon, and each is about the relationship between a program's interface and its implementation.

The Principle Of Least Astonishment

Also known as the Principle of Least Surprise, the rule is simple: programs should do what we expect them to do. This is more than simply honouring the contract of the interface. It means complying with accepted conventions of our programming. In PL/SQL programming there is a convention that functions are read-only, or at least do not change database state. Another such convention is that low-level routines do not execute COMMIT statements; transaction management is the prerogative of the program at the top of the call stack, which may be interacting directly with a user or may be an autonomous batch process.

Perhaps the most common flouting of the Principle Of Least Astonishment is this:

when others then

It is reasonable to expect that a program will hurl an exception if something has gone awry. Unfortunately, we are not as astonished as we should be when we find a procedure with an exception handler which swallows any and every exception.
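
To make the contrast concrete, here is a minimal sketch of the two behaviours. Both function names are hypothetical and invented for illustration; the only facts relied upon are that dividing by zero raises ORA-01476 and that a bare RAISE in a handler re-raises the current exception.

```plsql
declare
    l_result number;

    -- Hypothetical routine whose handler swallows every exception:
    -- the caller cannot tell a failure from a legitimate null result.
    function astonishing_divide
         ( p_numerator   in number
         , p_denominator in number )
         return number
    is
    begin
        return p_numerator / p_denominator;
    exception
        when others then
            return null;
    end astonishing_divide;

    -- Same logic, but the handler does its local work and then
    -- re-raises, so the failure stays visible up the call stack.
    function unastonishing_divide
         ( p_numerator   in number
         , p_denominator in number )
         return number
    is
    begin
        return p_numerator / p_denominator;
    exception
        when others then
            dbms_output.put_line('divide failed: ' || sqlerrm);
            raise;
    end unastonishing_divide;
begin
    l_result := astonishing_divide(1, 0);    -- completes silently, l_result is null
    l_result := unastonishing_divide(1, 0);  -- propagates ORA-01476 to the caller
end;
/
```

Run as written, the block ends with the unhandled ORA-01476 from the second call, which is the point: the first call hid exactly the same problem.
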

Information Hiding Principle

Another venerable principle, this one was expounded by David Parnas in 1972. It requires that a calling program should not need to know anything about the implementation of a called program. The definition of the interface should be sufficient. It is the cornerstone of black-box programming. The virtue of Information Hiding is that knowledge of internal details inevitably leads to coupling between the called and calling routines: when we change the called program we need to change the caller too. We honour this principle any time we call a procedure in a package owned by another schema, because the EXECUTE privilege grants visibility of the package specification (the interface) but not the body (the implementation).

The Law Of Leaky Abstractions

Joel Spolsky coined this one: "All non-trivial abstractions, to some degree, are leaky." No matter how hard we try, some details of the implementation of a called program will be exposed to the calling program, and will need to be acknowledged. Let's consider this interface again:

function get_employee_recs
     ( p_deptno in number )
     return emp_refcursor;

We know it returns a result set of employee records. But in what order? Sorting by EMPNO would be pretty useless, given that it is a surrogate key (and hence without meaning). Other candidates - HIREDATE, SAL - will be helpful for some cases and irrelevant for others. One approach is to always return an unsorted set and leave it to the caller to sort the results; but it is usually more efficient to sort records in a query rather than a collection. Another approach would be to write several functions - get_employee_recs_sorted_hiredate(), get_employee_recs_sorted_sal() - but that leads to a bloated interface which is hard to understand. Tricky.

Conclusion

Principles are guidelines. There are tensions between them. Good design is a matter of trade-offs. We cannot blindly follow Information Hiding and ignore the Leaky Abstractions. We need to exercise our professional judgement (which is a good thing).

Part of the Designing PL/SQL Programs series

It's all about the interface

Sun, 2016-04-03 12:59
When we talk about program design we're mainly talking about interface design. The interface is the part of our program that the users interact with. Normally discussion of UI focuses on GUI or UX, that is, the interface with the end user of our application.

But developers are users too.

Another developer writing a program which calls a routine in my program is a user of my code (and, I must remember, six months after I last touched the program, I am that other developer). A well-designed interface is frictionless: it can be slotted into a calling program without too much effort. A poor interface breaks the flow: it takes time and thought to figure it out. In the worst case we have to scramble around in the documentation or the source code.

Formally, an interface is the mechanism which allows the environment (the user or agent) to interact with the system (the program). What the system actually does is the implementation: the interface provides access to the implementation without the environment needing to understand the details. In PL/SQL programs the implementation will usually contain a hefty chunk of SQL. The interface mediates access to data.

An interface is a contract. It specifies what the caller must do and what the called program will do in return. Take this example:

function get_employee_recs
     ( p_deptno in number )
     return emp_refcursor;

The contract says, if the calling program passes a valid DEPTNO the function will return records for all the employees in that department, as a strongly-typed ref cursor. Unfortunately the contract doesn't say what will happen if the calling program passes an invalid DEPTNO. Does the function return an empty set or throw an exception? The short answer is we can't tell. We must rely on convention or the documentation, which is an unfortunate gap in the PL/SQL language; the Java keyword throws is quite neat in this respect.
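
One way to narrow that gap, sketched here under assumptions, is to put the convention into the package specification itself. The package name employee_api, the exception invalid_deptno and the error number -20001 are all hypothetical, and the sketch assumes the familiar EMP table. The compiler will not enforce any of this the way Java enforces throws; it simply puts the documentation where the caller is already looking.

```plsql
create or replace package employee_api is

    -- Strongly-typed ref cursor, assumed to match the EMP table.
    type emp_refcursor is ref cursor return emp%rowtype;

    -- Hypothetical named exception. Declaring it in the specification
    -- tells callers what they may need to handle.
    invalid_deptno exception;
    pragma exception_init(invalid_deptno, -20001);

    -- Returns all the employees in the department.
    -- Raises invalid_deptno if p_deptno is not a known department.
    function get_employee_recs
         ( p_deptno in number )
         return emp_refcursor;

end employee_api;
/
```
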

The interface is here to help

The interface presents an implementation of business logic. The interface is a curated interpretation, and doesn't enable unfettered access. Rather, a well-designed interface helps a developer use the business logic in a sensible fashion. Dan Lockton calls this Design With Intent: Good design expresses how a product should be used. It doesn't have to be complicated. We can use simple control mechanisms to help other developers use our code properly.

Restriction of access

Simply, the interface restricts access to certain functions or denies it altogether. Only certain users are allowed to view salaries, and even fewer to modify them. The interface to Employee records should separate salary functions from more widely-available functions. Access restriction can be implemented in a hard fashion, using architectural constructs (views, packages, schemas) or in a soft fashion (using VPD or Database Vault). The hard approach benefits from clarity, the soft approach offers flexibility.

Forcing functions

If certain things must be done in a specific order then the interface should only offer a method which enforces the correct order. For instance, if we need to insert records into a parent table and a child table in the same transaction (perhaps a super-type/sub-type implementation of a foreign key arc) a helpful interface will only expose a procedure which inserts both records in the correct order.
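
A forcing function of this kind might look like the following sketch. Everything in it is an assumption made for illustration: a super-type table PARTIES and a sub-type table PERSONS, their columns, and a PARTY_ID that is populated automatically (by an identity column or a trigger). The package exposes one procedure and no way to insert the child row on its own.

```plsql
create or replace package party_api is

    procedure create_person
         ( p_party_name in varchar2
         , p_birth_date in date );

end party_api;
/

create or replace package body party_api is

    procedure create_person
         ( p_party_name in varchar2
         , p_birth_date in date )
    is
        l_party_id parties.party_id%type;
    begin
        -- Parent first: the super-type row supplies the key.
        insert into parties (party_type, party_name)
        values ('PERSON', p_party_name)
        returning party_id into l_party_id;

        -- Child second, using the key we have just been given.
        insert into persons (party_id, birth_date)
        values (l_party_id, p_birth_date);

        -- No COMMIT here: transaction control belongs to the caller.
    end create_person;

end party_api;
/
```
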

Mistake-proofing

A well-designed interface prevents its users from making obvious mistakes. The signature of a procedure should be clear and unambiguous. Naming is important. If a parameter represents a table attribute the parameter name should echo the column name: p_empno is better than p_id. Default values for parameters should lead developers to sensible and safe choices. If several parameters have default values they must play nicely together: accepting all the defaults should not generate an error condition.
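
As a sketch of what such a signature could look like, assuming the EMP table and a hypothetical package and procedure invented for the purpose:

```plsql
create or replace package employee_maintenance is

    -- The parameter name echoes the EMP column it represents, and the
    -- anchored type keeps its datatype in step with the table.
    -- The defaults are deliberately harmless and compatible with each
    -- other: a call which passes only p_empno asks for a zero per cent
    -- change to both salary and commission.
    procedure adjust_pay
         ( p_empno    in emp.empno%type
         , p_sal_pct  in number default 0
         , p_comm_pct in number default 0 );

end employee_maintenance;
/
```

A caller using named notation, for example adjust_pay(p_empno => 7369, p_sal_pct => 5), then reads almost like a sentence, which is itself a form of mistake-proofing.
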

Abstraction

Abstraction is just another word for interface. It allows us to focus on the details of our own code without needing to understand the concrete details of the other code we depend upon. That's why good interfaces are the key to managing large codebases.

Part of the Designing PL/SQL Programs series

Working with the Interface Segregation Principle

Sun, 2016-04-03 12:55
Obviously Interface Segregation is crucial for implementing restricted access. For any given set of data there are three broad categories of access:

  • reporting 
  • manipulation 
  • administration and governance 

So we need to define at least one interface - packages - for each category in order that we can grant the appropriate access to different groups of users: read-only users, regular users, power users.
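
A minimal sketch of that arrangement follows. The package names, the subprograms and the three roles are hypothetical and the package bodies are omitted; the point is only that each category of access becomes a separately grantable unit.

```plsql
-- Reporting: read-only access to the data.
create or replace package emp_reporting is
    function get_employee_recs
         ( p_deptno in number )
         return sys_refcursor;
end emp_reporting;
/

-- Manipulation: everyday changes to the data.
create or replace package emp_maintenance is
    procedure update_salary
         ( p_empno in number
         , p_sal   in number );
end emp_maintenance;
/

-- Administration and governance.
create or replace package emp_admin is
    procedure purge_leavers
         ( p_left_before in date );
end emp_admin;
/

-- Hypothetical roles standing for the three groups of users.
grant execute on emp_reporting   to read_only_role;
grant execute on emp_reporting   to regular_role;
grant execute on emp_maintenance to regular_role;
grant execute on emp_reporting   to power_role;
grant execute on emp_maintenance to power_role;
grant execute on emp_admin       to power_role;
```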

But there's more to Interface Segregation. This example is based on a procedure posted on a programming forum. Its purpose is to maintain medical records relating to a patient's drug treatments. The procedure has some business logic (which I've redacted) but its overall structure is defined by the split between the Verification task and the De-verification task, and flow is controlled by the value of the p_verify_mode parameter.
procedure rx_verification
    (p_drh_id in number,
     p_patient_name in varchar2,
     p_verify_mode in varchar2)
is
    new_rxh_id number;
    rxh_count number;
    rxl_count number;
    drh_rec drug_admin_history%rowtype;
begin
    select * into drh_rec ....;
    select count(*) into rxh_count ....;

    if p_verify_mode = 'VERIFY' then

        update drug_admin_history ....;
        if drh_rec.pp_id <> 0 then
            update patient_prescription ....;
        end if;
        if rxh_count = 0 then
            insert into prescription_header ....;
            select rxh_id into new_rxh_id ....;
        end if;
        insert into prescription_line ....;
        if drh_rec.threshhold > 0 then
            insert into prescription_line ....;
        end if;

    elsif p_verify_mode = 'DEVERIFY' then

        update drug_admin_history ....;
        if drh_rec.pp_id <> 0 then
            update patient_prescription ....;
        end if;
        select rxl_rxh_id into new_rxh_id ....;
        delete prescription_line ....;
        delete prescription_header ....;

    end if;
end rx_verification;
Does this procedure have a Single Responsibility?  Hmmm. It conforms to Common Reuse - users who can verify can also de-verify. It doesn't break Common Closure, because both tasks work with the same tables. But there is a nagging doubt. It appears to be doing two things: Verification and De-verification.

So, how does this procedure work as an interface? There is a definite problem when it comes to calling the procedure: how do I as a developer know what value to pass to p_verify_mode?

rx_verification
    (p_drh_id => 1234,
     p_patient_name => 'John Yaya',
     p_verify_mode => ???);
The only way to know is to inspect the source code of the procedure. That breaks the Information Hiding principle, and it might not be viable (if the procedure is owned by a different schema). Clearly the interface could benefit from a redesign. One approach would be to declare constants for the acceptable values; while we're at it, why not define a PL/SQL subtype for verification mode and tweak the procedure's signature to make it clear that's what's expected:         

create or replace package rx_management is

subtype verification_mode_subt is varchar2(10);
c_verify constant verification_mode_subt := 'VERIFY';
c_deverify constant verification_mode_subt := 'DEVERIFY';

procedure rx_verification
(p_drh_id in number,
p_patient_name in varchar2,
p_verify_mode in verification_mode_subt);

end rx_management;
Nevertheless it is still possible for a calling program to pass a wrong value:

rx_management.rx_verification
    (p_drh_id => 1234,
     p_patient_name => 'John Yaya',
     p_verify_mode => 'Verify');
What happens then? Literally nothing. The value drops through the control structure without satisfying any condition. It's an unsatisfactory outcome. We could change the implementation of rx_verification() to validate the parameter value and raise an exception. Or we could add an ELSE branch and raise an exception. But those are runtime exceptions. It would be better to mistake-proof the interface so that it is not possible to pass an invalid value in the first place.
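
For comparison, the ELSE branch option might look like this sketch of a body for the rx_management specification shown earlier. The business logic is reduced to placeholders, and the error number and message are my assumptions rather than anything from the original procedure.

```plsql
create or replace package body rx_management is

    procedure rx_verification
        (p_drh_id in number,
         p_patient_name in varchar2,
         p_verify_mode in verification_mode_subt)
    is
    begin
        if p_verify_mode = c_verify then
            null;   -- verification logic omitted from this sketch
        elsif p_verify_mode = c_deverify then
            null;   -- de-verification logic omitted from this sketch
        else
            -- 'Verify', a misspelling or a null all land here instead
            -- of dropping silently through the control structure.
            raise_application_error
                (-20000,
                 'Unrecognised verification mode: ' || p_verify_mode);
        end if;
    end rx_verification;

end rx_management;
/
```

The caller now finds out about the mistake, but only when the code runs.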

Which leads us to a Segregated Interface:
create or replace package rx_management is

procedure rx_verification
(p_drh_id in number,
p_patient_name in varchar2);

procedure rx_deverification
(p_drh_id in number);

end rx_management;
Suddenly it becomes clear that the original procedure was poorly named (I call rx_verification() to issue an RX de-verification?!)  We have two procedures but their usage is now straightforward and the signatures are cleaner (the p_patient_name is only used in the Verification branch so there's no need to pass it when issuing a De-verification).

Summary

Interface Segregation creates simpler and safer controls but more of them. This is a general effect of the Information Hiding principle. It is a trade-off. We need to be sensible. Also, this is not a proscription against flags. There will always be times when we need to pass instructions to called procedures to modify their behaviour. In those cases it is important that the interface includes a definition of acceptable values.

Part of the Designing PL/SQL Programs series

Introducing the SOLID design principles

Sun, 2016-04-03 12:55
PL/SQL programming standards tend to focus on layout (case of keywords, indentation, etc), naming conventions, and implementation details (such as use of cursors).  These are all important things, but they don't address questions of design. How easy is it to use the written code?  How easy is it to test? How easy will it be to maintain? Is it robust? Is it secure?

Simply put, there are no agreed design principles for PL/SQL. So it's hard to define what makes a well-designed PL/SQL program.

The SOLID principles

It's different for object-oriented programming. OOP has more design principles and paradigms and patterns than you can shake a stick at. Perhaps the most well-known are the SOLID principles, which were first mooted by Robert C. Martin, AKA Uncle Bob, back in 1995 (although it was Michael Feathers who coined the acronym).

Although Martin put these principles together for Object-Oriented code, they draw on a broader spectrum of programming practice. So they are transferable, or at least translatable, to the other forms of modular programming. For instance, PL/SQL.

Single Responsibility Principle

This is the foundation stone of modular programming: a program unit should do only one thing. Modules which do only one thing are easier to understand, easier to test and generally more versatile. Higher level procedures can be composed of lower level ones. Sometimes it can be hard to define what "one thing" means in a given context, but some of the other principles provide clarity. Martin's formulation is that there should be just one axis of change: there's just one set of requirements which, if modified or added to, would lead to a change in the package.

Open/closed Principle

The slightly obscure name conceals a straightforward proposal. It means program units are closed to modification but open to extension. If we need to add new functionality to a package, we create a new procedure rather than modifying an existing one. (Bertrand Meyer, the father of Design By Contract programming, originally proposed it; in OO programming this principle is implemented through inheritance or polymorphism.) Clearly we must fix bugs in existing code. Also it doesn't rule out refactoring: we can tune the implementation providing we don't change the behaviour. This principle mainly applies to published program units, ones referenced by other programs in Production. Also the principle can be looser when the code is being used within the same project, because we can negotiate changes with our colleagues.
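
In PL/SQL terms, extension rather than modification might look like the sketch below. The package name, the second function and the "filter by job" requirement are all hypothetical, and sys_refcursor is used only to keep the example self-contained.

```plsql
create or replace package employee_queries is

    -- Published and already called by other programs in Production:
    -- its signature and behaviour are closed to modification.
    function get_employee_recs
         ( p_deptno in number )
         return sys_refcursor;

    -- A new requirement (hypothetical): filter by job as well.
    -- It arrives as an additional function, so existing callers of
    -- get_employee_recs need neither code changes nor regression tests.
    function get_employee_recs_by_job
         ( p_deptno in number
         , p_job    in varchar2 )
         return sys_refcursor;

end employee_queries;
/
```
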

Liskov Substitution Principle

This is a real Computer Science-y one, good for dropping in code reviews. Named for Barbara Liskov, it defines rules for behavioural sub-typing. If a procedure has a parameter defined as a base type it must be able to take an instance of any sub-type without changing the behaviour of the program. So a procedure which tests the type of a passed parameter and does something different is violating Liskov Substitution. Obviously we don't make much use of Inheritance in PL/SQL programming, so this Principle is less relevant than in other programming paradigms.

Interface Segregation Principle

This principle is about designing fine-grained interfaces. It is an extension of the Single Responsibility Principle. Instead of building one huge package which contains all the functions relating to a domain, build several smaller, more cohesive packages. For example Oracle's Advanced Queuing subsystem comprises five packages, to manage different aspects of AQ. Users who write to or read from queues have one package; users who manage queues and subscribers have another.

Dependency Inversion Principle

Interactions between programs should be through abstract interfaces rather than concrete ones. Abstraction means the implementation of one side of the interface can change without changing the other side. PL/SQL doesn't support Abstract objects in the way that say Java does. To a certain extent Package Specifications provide a layer of abstraction but there can only be one concrete implementation. Using Types to pass data between Procedures is an interesting idea, which we can use to decouple data providers and data consumers in a useful fashion.

Applicability of SOLID principles in PL/SQL

So it seems like we can apply SOLID practices to PL/SQL. True, some Principles fit better than others. But we have something which we might use to distinguish good design from bad when it comes to PL/SQL interfaces.

The SOLID principles apply mainly to individual modules. Is there something similar we can use for designing module groups? Why, yes there is. I'm glad you asked.

Part of the Designing PL/SQL Programs series

Introducing the RCCASS design principles

Sun, 2016-04-03 12:54
Robert C. Martin actually defined eleven principles for OOP. The first five, the SOLID principles, relate to individual classes. The other six, the RCCASS principles, deal with the design of packages (in the C++ or Java sense, i.e. libraries). They are far less known than the first five. There are two reasons for this:

  • Unlike "SOLID", "RCCASS" is awkward to say and doesn't form a neat mnemonic. 
  • Programmers are far less interested in software architecture. 

Software architecture tends to be an alien concept in PL/SQL. Usually a codebase of packages simply accretes over the years, like a coral reef. Perhaps the RCCASS principles can help change that.

The RCCASS Principles

Reuse Release Equivalency Principle

The Reuse Release Equivalency Principle states that the unit of release matches the unit of reuse, that is, the parts of the program unit which are consumed by other programs. Basically the unit of release defines the scope of regression testing for consuming applications. It's an ill-mannered release which forces projects to undertake unnecessary regression testing. Cohesive program units allow consumers to do regression testing only for functionality they actually use. It's less of a problem for PL/SQL because (unlike C++ libraries or Java jars) the unit of release can have a very low level of granularity: individual packages or stored procedures.

Common Reuse Principle

The Common Reuse principle supports the definition of cohesive program units. Functions which share a dependency belong together, because they are likely to be used together. For instance, procedures which maintain the Employees table should be co-located in one package (or a group of related packages). They will share sub-routines, constants and exceptions. Packaging related procedures together makes the package easier to write and easier for calling programs to use.

Common Closure Principle

The Common Closure principle also supports the definition of cohesive program units. Functions which share a dependency belong together, because they have a common axis of change. Common Closure helps to minimise the number of program units affected by a change. For instance, programs which use the Employees table may need to change if the structure of the table changes. All the changes must be released together: table, PL/SQL, types, etc.

Acyclic Dependencies Principle

Avoid cyclic dependencies between program units: if package A depends on package B then B must not have a dependency on A. Cyclic dependencies make applications hard to use and harder to deploy. The dependency graph shows the order in which objects must be built. Designing a dependency graph upfront is futile, but we can keep to rough guidelines. Higher level packages implementing business rules tend to depend on generic routines which in turn tend to depend on low-level utilities. There should be no application logic in those lower-level routines. If SALES requires a special logging implementation then that should be handled in the SALES subsystem not in the standard logging package.
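
The data dictionary can help us spot the simplest violations. The following is a rough sketch, not a complete checker: it reads the USER_DEPENDENCIES view and reports pairs of objects in the current schema which reference each other directly. It matches on object name only, so a package body's dependency on its own specification is ignored, and it will not find longer cycles (A depends on B, B on C, C on A), which would need a hierarchical query.

```plsql
begin
    for r in ( select distinct
                      d1.name            as unit_a
                    , d1.referenced_name as unit_b
               from   user_dependencies d1
                      inner join user_dependencies d2
                         on  d2.name            = d1.referenced_name
                         and d2.referenced_name = d1.name
               where  d1.referenced_owner = user
               and    d2.referenced_owner = user
               -- report each pair once, and skip body-to-own-spec rows
               and    d1.name < d1.referenced_name
               order  by unit_a, unit_b )
    loop
        dbms_output.put_line(r.unit_a || ' <-> ' || r.unit_b);
    end loop;
end;
/
```
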

Stable Dependencies Principle

Any change to the implementation of a program unit which is widely used will generate regression tests for all the programs which call it. At the most extreme, a change to a logging routine could affect all the other programs in our application. As with the Open/Closed Principle we need to fix bugs. But new features should be introduced by extension not modification. And refactoring of low-level dependencies must not be done on a whim.

Stable Abstractions Principle

Abstractions are dependencies, especially when we're talking about PL/SQL. So this Principle is quite similar to the Stable Dependencies Principle. The key difference is that this relates to the definition of interfaces rather than implementation. A change to the signature of a logging routine could require code changes to all the other programs in the application. Obviously this is even more inconvenient than enforced regression testing. Avoid changing the signature of a public procedure or the projection of a public view. Again, extension rather than modification is the preferred approach.

Applicability of RCCASS principles in PL/SQL

The focus of these principles is the stability of a shared codebase, and minimising the impact of change on the consumers of our code. This is vital in large projects, where communication between teams is often convoluted. It is even more important for open source or proprietary libraries.

We can apply the Common Reuse Principle and the Common Closure Principle to define the scope of the Reuse Release Equivalency Principle, and hence define the boundaries of a sub-system (whisper it, schema). Likewise we can apply the Stable Dependencies Principle and the Stable Abstractions Principle to enforce the Acyclic Dependencies Principle and so build stable PL/SQL libraries. So the RCCASS principles offer some useful pointers towards a stable PL/SQL software architecture.

Part of the Designing PL/SQL Programs series