The Oracle Instructor

002 - License to dink
Updated: 8 hours 28 min ago

Pickleball 002: Getting started with doubles

13 hours 44 min ago

Doubles is certainly the most popular way to play pickleball; many players never play singles at all. This short article may help you get started. The best part first: pickleball doubles is just about the most inclusive form of sport imaginable. Men can play against women, the older generation against the younger, old hands against newcomers. In practically every possible combination, everyone still has fun on the court and gets their money's worth. The community is very open and appreciative towards newcomers. In my opinion, that and the inclusiveness are also the main reasons why pickleball is gaining popularity so rapidly.

Pickleball is decided at the front

In tennis I'm more of a baseline player, but that doesn't promise much success in pickleball. The short game at the non-volley zone (NVZ), also called dinking, and volleys usually decide the game here. So dinking and volleys are what you should practice in particular.

Practicing for ball control and consistency

Practice makes the pickleball master, because most games are not so much won as lost through unforced errors. Ball control is key, especially in doubles, where you don't want to burden your partner with many unforced errors. A good habit is to warm up before the actual game with a short-short game at the NVZ: serve diagonally, then at least 5 dinks must land in the NVZ before the ball is released for open play. Or, for instance, you have to manage 7 volleys in a row before the game starts.

Why 0-0-2?

Scoring in doubles can be confusing at first. I certainly struggled with it a bit initially, which is why I recorded this video:

It also helps in doubles to remember which side you stood on at the start. If, for example, I start on the right (even) side, then from then on I will be on the right whenever my team's score is even. So at 0, 2, 4, 6, 8 or 10 for my team, I should always be on the right. At 1, 3, 5, 7, 9 or 11, I should accordingly be on the left (odd) side, with my partner on the right. As a memory aid I use sweatbands: two if I start on the right and one if I start on the left.

Basic doubles strategy

I have briefly described the typical behavior of a doubles team here. Of course there are more subtleties, but as a starting aid this should do for now. Besides: pickleball really isn't that complicated.

So, I hope this was helpful and encouraging for getting started with playing pickleball. We are already looking forward to meeting you on the court!

Categories: DBA Blogs

Impact of proper column precision for analytical queries

Fri, 2021-06-11 04:33

Does it matter if your data warehouse tables have columns with much higher precision than needed? Probably: Yes.

But how do you know the precision of your columns is larger than required by the values stored in these columns? In Exasol, we have introduced the function MIN_SCALE to find out. I’m working on an Exasol 7 New Features course at the moment, and this article is kind of a sneak preview.

If there’s an impact, it will of course show only with a huge number of rows. A table generator that produces large test tables would be handy, and another Exasol 7 feature helps with that: the new clause VALUES BETWEEN.

SELECT CAST(1.123456789 AS DECIMAL(18,17)) AS testcol

This generates a table with 1000 million rows and takes only 30 seconds runtime on my modest VirtualBox VM. Obviously, the scale of the column is too large for the values stored there. But if it weren’t that obvious, here’s how I could find out:

SELECT MAX(a) FROM (SELECT MIN_SCALE(testcol) As a FROM scaletest.t1);

This comes back with the output 9 after 20 seconds runtime, telling me that the scale actually required by the values is 9 at most. I’ll create a second table for comparison with only the required scale:

SELECT CAST(1.123456789 AS DECIMAL(10,9)) AS testcol
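
Written out in full, the two generator statements might look like the sketch below. The table names scaletest.t1 and scaletest.t2 follow the queries used in this article; the CREATE TABLE ... AS wrapper and the upper bound of the range are assumptions, chosen to match the 1000 million rows mentioned above.

```sql
-- Sketch only: assumes the schema SCALETEST already exists.

-- Table with a scale far larger than the stored values need
CREATE OR REPLACE TABLE scaletest.t1 AS
SELECT CAST(1.123456789 AS DECIMAL(18,17)) AS testcol
FROM VALUES BETWEEN 1 AND 1000000000;

-- Comparison table with only the required scale
CREATE OR REPLACE TABLE scaletest.t2 AS
SELECT CAST(1.123456789 AS DECIMAL(10,9)) AS testcol
FROM VALUES BETWEEN 1 AND 1000000000;
```

A much smaller upper bound is probably the better choice for a first try on a small VM.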

So does it really matter? Is there a runtime difference for analytical queries?

SELECT COUNT(*),MAX(testcol) FROM t1; -- 16 secs runtime
SELECT COUNT(*),MAX(testcol) FROM t2; -- 7 secs runtime

My little experiment shows that the query running on the column with the appropriate scale is more than twice as fast as the one running on the column with the oversized scale!

In other words, it is beneficial to adjust the column precision and scale to what the stored values actually need, with statements like this:

ALTER TABLE t1 MODIFY (testcol DECIMAL(10,9));

After that change, the runtime goes down to 7 seconds as well for the first statement.
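
Before shrinking a column, it may be worth checking not only the scale but also the largest absolute value, so that the new type still leaves enough digits before the decimal point. A minimal sketch, assuming the table scaletest.t1 from above; the second query assumes the EXA_ALL_COLUMNS system view with its COLUMN_TYPE column.

```sql
-- Largest scale any stored value needs, plus the largest absolute value
-- (to judge how many digits are required before the decimal point)
SELECT MAX(MIN_SCALE(testcol)) AS needed_scale,
       MAX(ABS(testcol))       AS largest_value
FROM scaletest.t1;

-- After the ALTER TABLE ... MODIFY, confirm the declared column type
SELECT column_name, column_type
FROM exa_all_columns
WHERE column_schema = 'SCALETEST'
  AND column_table  = 'T1';
```

For the test data here, a needed scale of 9 and a largest value below 10 are consistent with DECIMAL(10,9).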

I was curious whether that effect also shows on other databases, so I prepared a similar test case for an Oracle database: the same tables, but with only 100 million rows. It just takes too long to export tables with 1000 million rows to Oracle using VMs on my notebook. And don’t even think about trying to generate 1000 million row tables on Oracle with the CONNECT BY LEVEL method; that will take forever or, more likely, break with an out-of-memory error.

The effect also shows with 100 million row tables on Oracle: 5 seconds runtime with too large precision and about 3 seconds with the appropriately scaled column.

Conclusion: Yes, it looks like it is indeed sensible to size table columns according to the actual requirements of the values stored in them, and it makes a difference performance-wise.
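
For reference, the CONNECT BY LEVEL row generator mentioned above typically looks like this on Oracle. This is a sketch with a deliberately small row count and a hypothetical table name t_small; as noted, the method is not a practical way to produce a billion rows.

```sql
-- Oracle: generate 1000 rows from DUAL with a hierarchical query
-- (t_small is a hypothetical table name used only for illustration)
CREATE TABLE t_small AS
SELECT CAST(1.123456789 AS NUMBER(18,17)) AS testcol
FROM dual
CONNECT BY LEVEL <= 1000;
```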

Categories: DBA Blogs

New free online course: #Exasol Virtual Schemas

Thu, 2021-01-28 04:05

Virtual Schemas integrate foreign data sources into the Exasol database. They enable Exasol to become the central source of truth in your data warehouse landscape.

We added another free online learning course to our curriculum that explains how to deal with Virtual Schemas. Like the others, it comes with many hands-on practices to support a good learning experience. It also contains many demo videos like this one:

I recorded this clip (like most of the others) myself, but this time we decided to do a voice-over by native professional speakers.

Certification exams are free, also for this new course. When you complete the majority of hands-on labs in the course, you get one free certification exam granted per person and per course.

What are you waiting for? Come and get it!

Categories: DBA Blogs

Eleven Table Tennis: Basics

Sat, 2021-01-23 08:05

Assuming you are an IRL player who wants to get as close to the real thing as possible, this is what I’d recommend:

Make sure you have enough space to play

The green box is your playing space. Ideally, it should be a square of 2.50 m × 2.50 m. Make sure to leave some space at the front, so you can reach balls close to the net and even a little across the net. Otherwise you may become a victim of ghost serves. Leave enough room at the sides, because some opponents play angled, just like IRL.

If you don’t have enough space for this setup, maybe you shouldn’t play multiplayer mode. You can still have fun playing against the ball machine or against the AI. Actually, I think it’s worth the money even in that case.

Use the Discord channel

The Eleven TT community is on this Discord channel:

I recommend you register there and use the same or a similar name as the one you have in the game. For example, I’m Uwe on Discord and uwe. in the game (because the name uwe was already taken). This is handy for getting advice from more experienced players, and the game developers are there as well. They are very responsive and keen to improve Eleven TT even more, according to your feedback.

There’s presently a preview version that has improved tracking functionality. You can just ask the developers there to get you this preview version. I did, and I find it better than the regular version, especially for fast forehand strokes.

Set up your paddle

When you have the Sanlaki paddle adapter (as recommended in the previous post), go to the menu and then to Paddle Settings:

Click on Paddle Position and select the Sanlaki Adapter:

As an IRL player, you may start with an Advanced Paddle Surface:

See how that works for you. Bounciness translates to the speed of your blade. An OFF ++ blade would be maximum bounciness. Spin is self-explanatory. You have no tackiness attribute, though. Throw Coefficient translates to the sponge thickness. The higher that value, the thicker the sponge.


This takes some time to get used to. You need to press the trigger on the left controller to first “produce” a ball, then you throw it up and press the trigger again to release the ball. Took me a while to practice that and still sometimes I fail to release the ball as smoothly as I would like to.

What I like very much: You have a built-in arbiter, who makes sure your serve is legal according to the ITTF rules. That is applied for matches in multiplayer mode as well as for matches in single player mode. But not in free hit mode! Check out the Serve Practice:

It tells you what went wrong, if anything:

Remove AI Spin Lock

I recommend you practice with the AI opponent in single player mode for a while. It has spin lock on by default, which means it will never produce any side spin. I find that unrealistic. After some practicing against the AI in single player mode, you’re ready for matches in multiplayer mode against other human opponents.

Categories: DBA Blogs

BARC Survey confirms: #Exasol dominates Analytical Database Peers

Thu, 2020-09-24 05:23

Exasol leads the categories Performance, Platform Reliability and Support Quality for Analytical Database products. And we get a 100% recommendation score from the 782 customers in the survey.

So it’s not one of the big names in the industry who comes out on top of this survey. Not Oracle, not Teradata, not Snowflake, not SAP Hana leads in Analytical Databases but Exasol!

Customer quote: “Unbelievable query performance with almost zero administration effort. You just have to experience it yourself. Once you see it for yourself, you won’t want to work with any other database.”




To summarize:

  1. Exasol is the world’s fastest analytical database
  2. Exasol is reliable and easy to maintain
  3. Exasol’s services and attitude towards customers are highly appreciated

Compare that with your legacy platform: It’s time to contact us now!

Categories: DBA Blogs

How to enlarge an #Exasol database by adding a node

Mon, 2019-09-30 08:46

Adding a cluster node will not only increase the available storage capacity but also the total compute power of your cluster. This scale-out is quite a common operation for Exasol customers.

My example shows how to change an existing 2+1 cluster into a 3+0 cluster. Before you can enlarge the database with an active node, this node has to be a reserve node first. See here how to add a reserve to a 2+0 cluster. Of course you can add another reserve node to change from 3+0 to 3+1 afterwards. See here if you wonder why you may want to have a reserve node at all.

Initial state – reserve node is present

I start with a 2+1 cluster – 2 active nodes and 1 reserve node:

For later comparison, let’s look at the distribution of rows of one of my tables:

The rows are roughly evenly distributed across the two active nodes.
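
A query along these lines shows how the rows of a table are spread across the active nodes. It is a sketch: retail.sales is only a placeholder table name, and iproc() returns the number of the node that holds each row.

```sql
-- Row count per cluster node (retail.sales is a placeholder table name)
SELECT iproc() AS node_number, COUNT(*) AS row_count
FROM retail.sales
GROUP BY 1
ORDER BY 1;
```

On a 2+1 cluster this returns one line per active node; the reserve node holds no rows.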

Before you continue, it would be a good idea to take a backup on a remote archive volume now – just in case.

Shutdown database before volume modification

A data volume used by a database cannot be modified while that database is up, so shut it down first:

After going to the Storage branch in EXAoperation, click on the data volume:

Then click on Edit:

Decrease volume redundancy to 1

Change the redundancy from 2 to 1, then click Apply:

Why is the redundancy reduced from 2 to 1 here? Let’s try to explain that. Initially, I had 2 active nodes with a volume using redundancy 2:

A and B are master segments while A’ and B’ are mirrored segments. If I could add a node to this volume keeping the existing segments, it would look like this:

Of course this would be a bad idea. The redundancy is reduced to 1 before the new node is added to the volume:

Only distributed master segments with no mirrors at first. Then the redundancy is again increased to 2:

This way, every master segment can be mirrored on a neighbor node. That’s why the redundancy needs to be reduced to 1.

Add new node to volume

After having decreased the volume redundancy to 1, click Edit on the volume detail page again and add n13 as a new master node to the volume and click Apply:

Increase redundancy to 2

Now click Edit again and increase the redundancy to 2:

The state of the volume shows now as RECOVERING – don’t worry, it just means that mirrored segments are now created.

Enlarge the database

Now click on the database link on the EXASolution screen:

Select the Action Enlarge and click Submit:

Enter 1 and click Apply:

The database detail page looks like this now:


Technically, this is a 3+0 cluster now – but the third node doesn’t contain any data yet. If we look at the same table as before, we see that no rows are on the new node:

To change that, a REORGANIZE needs to be done, either on the database level, the schema level or the table level. The easiest to perform is REORGANIZE DATABASE:

That took about 10 minutes on my tiny database. The command redistributes every table across all cluster nodes and can be time-consuming with high data volume. While a table is being reorganized, it is locked against DML. You can monitor the ongoing reorganization by selecting from EXA_DBA_PROFILE_RUNNING in another session.
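
The three scopes of the command and the monitoring query can be sketched as follows. The schema and table names are placeholders, and the REORGANIZE statements are alternatives, so only one of them is needed.

```sql
-- Option 1: redistribute every table in the database
REORGANIZE DATABASE;

-- Option 2: limit the work to one schema (placeholder name)
REORGANIZE SCHEMA retail;

-- Option 3: limit the work to a single table (placeholder name)
REORGANIZE TABLE retail.sales;

-- In a second session: see what is currently being executed
SELECT * FROM exa_dba_profile_running;
```

Reorganizing table by table keeps the DML lock confined to one table at a time, which may be preferable on a busy system.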

Final state

Let’s check the distribution of the previous table again:

As you can see above, now there are rows on the added node. Also EXAoperation confirms that the new node is not empty any more:

On a larger database, you would see that the volume usage per node is lower than before and that every node holds roughly the same amount of data. For fail-safety, you could add another reserve node now.

Summary of steps
  1. Add a reserve node (if not yet existing)
  2. Take a backup on a remote archive volume
  3. Shutdown database
  4. Decrease volume redundancy to 1
  5. Add former reserve node as new master node to the volume
  6. Increase redundancy to 2
  7. Enlarge database by 1 active node
  8. Reorganize
  9. Add another reserve node (optionally)
Categories: DBA Blogs

Getting started with Hyper-V on Windows 10

Fri, 2019-08-30 03:27

Microsoft Windows 10 comes with its own virtualization software called Hyper-V. Not for the Windows 10 Home edition, though.

Check if you fulfill the requirements by opening a CMD shell and typing in systeminfo:

The below part of the output from systeminfo should look like this:

If you see No there instead, you need to enable virtualization in your BIOS settings.

Next you go to Programs and Features and click on Turn Windows features on or off:

You need Administrator rights for that. Then tick the checkbox for Hyper-V:

That requires a restart at the end:

Afterwards you can use the Hyper-V Manager:

Hyper-V can do similar things to VMware or VirtualBox. It doesn’t play well together with VirtualBox in my experience, though: VirtualBox VMs refused to start with errors like “VT-x is not available” after I installed Hyper-V. I also found it a bit trickier to handle than VirtualBox, but that may just be because I am less familiar with it.

The reason I use it now is that one of our customers who wants to do an Exasol Administration training cannot use VirtualBox, but Hyper-V is okay for them. And it looks like that is indeed an option: my testing so far shows that our educational cluster installation and management labs also work with Hyper-V.

Categories: DBA Blogs

Using DbVisualizer to work with #Oracle, #PostgreSQL and #Exasol

Tue, 2019-07-02 09:01

As a Database Developer or Database Administrator, it becomes increasingly unlikely that you will work with only one platform.

It’s quite useful to have one single tool to handle multiple different database platforms. And that’s exactly the ambition of DbVisualizer.

As a hypothetical scenario, let’s assume you are a database admin who works on a project to migrate from Oracle to EDB Postgres and Exasol.

The goal might be to replace the corporate Oracle database landscape, moving the OLTP part to EDB Postgres and the DWH / Analytics part to Exasol.

Instead of having to switch constantly between say SQL Developer, psql and EXAplus, a more efficient approach would be using DbVisualizer for all three.

I created one connection for each of the three databases here for my demo:

Now let’s see if statements I run in Oracle also work in EDB Postgres and in Exasol:




Works the same for all three! The convenient thing here is that I just had to select the Database Connection from the pull down menu while leaving the statement as it is. No need to copy & paste even.

What about schemas and tables?


In EDB, I need to create a schema accordingly:



In Exasol, schema and table can be created in the same way:


Notice that the data types got silently translated into the proper Exasol data types:


There is no DBA_TABLES in Exasol, though:
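
Each platform exposes its table catalog under a different name, so a listing query has to be adapted per connection. A sketch, assuming sufficient privileges on each system; the PostgreSQL and Exasol view names are the standard catalog views of those products and are not taken from the screenshots of this post.

```sql
-- Oracle
SELECT owner, table_name FROM dba_tables;

-- PostgreSQL / EDB Postgres
SELECT table_schema, table_name FROM information_schema.tables;

-- Exasol
SELECT table_schema, table_name FROM exa_dba_tables;
```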


Of course, there’s much more to check and test during a migration, but I think you get the idea of how a universal SQL client like DbVisualizer might help for such purposes.


Categories: DBA Blogs

Comparison between #Oracle and #Exasol

Fri, 2019-03-08 04:41

After having worked with both databases for quite some time, this is what I consider to be the key differences between Oracle and Exasol. Of course the two have much in common: Both are relational databases with a transaction management system that supports the ACID model and both follow the ANSI SQL standard – both with some enhancements. Coming from Oracle as I do, much in Exasol looks quite familiar. But let’s focus on the differences:


Oracle is leading technology for Online Transaction Processing (OLTP). If you have a high data volume with many users doing concurrent changes, this is where Oracle shines particularly.

Exasol is leading technology for analytical workloads. If you want to do real-time ad hoc reporting on high data volume, this is where Exasol shines particularly.

Architecture

Data Format & In-Memory processing

Oracle uses a row-oriented data format, which is well suited for OLTP but not so much for analytical workloads. That’s why Hybrid Columnar Compression (only available on Engineered Systems or on Oracle proprietary storage) and the In-Memory Column Store (an extra-charged option) have been added in recent years.

Exasol uses natively a compressed columnar data format and processes this format in memory. That is very good for analytical queries but bad for OLTP because one session that does DML on a table locks that table against DML from other sessions. Read Consistent SELECT is possible for these other sessions, though.

Oracle was designed for OLTP at times when memory was scarce and expensive. Exasol was designed to process analytical workloads in memory.


Oracle started as a non-clustered (single instance) system. Real Application Clusters (RAC) were added much later. The majority of Oracle installations are still non-clustered; RAC (an extra-charged option) is the exception rather than the rule. Most RAC installations are 2-node clusters with availability as the prime reason, scalability being rather a side aspect.

Exasol was designed from the start to run on clustered commodity Intel servers. Prime reasons were MPP performance and scalability with availability being rather a side aspect.

Data Distribution

This doesn’t matter for most Oracle installations, only for RAC. Here, Oracle uses a shared disk architecture while Exasol uses a shared nothing architecture, which is optimal for performance because every Exasol cluster node can operate on a different part of the data in parallel. The drawback is that after adding nodes to an Exasol cluster, the data has to be redistributed.

With Exadata, Oracle tries to compensate for the performance disadvantage of the shared disk architecture by enabling the storage servers to filter data locally for analytical workloads. This approach leads to better performance than Oracle can deliver on other (non-proprietary) platforms.

Availability & Recoverability

Clearly, Oracle is better in this area. A non-clustered Oracle database running in archive log mode will enable you to recover every single committed transaction you did since you took the last backup. With Exasol, you can only restore the last backup and all changes since then are lost. You can safeguard an Oracle database against site failure with a standby database at large distance without performance impact. Exasol doesn’t have that. With RAC, you can protect an Oracle database against node failure. The database stays up (the Global Resource Directory is frozen for a couple of seconds, though) upon node failure with no data loss.

If an Exasol cluster node fails, this leads to a database restart. This means no availability for a couple of seconds, and all sessions get disconnected, but there is no data loss either. Optionally, Exasol can be configured as a Synchronous Dual Data Center, similar to Oracle’s Extended RAC.

Complexity & Manageability

I realized that there’s a big difference between Exasol and Oracle in this area when I was teaching an Exasol Admin class recently: Some seasoned Oracle DBAs in the audience kept asking questions like “We can do this and that in Oracle, how does that work with Exasol?” (e.g. creating Materialized Views or Bitmap Indexes or an extra Keep Cache) and my answer was always like “We don’t need that with Exasol to get good performance”.

Let’s face it, an Oracle database is probably one of the most complex commercial software products ever developed. You need years of experience to administer an Oracle database with confidence. See this recent Oracle Database Administration manual to get an impression. It has 1690 pages! And that’s not yet Real Application Clusters, which is additionally 492 pages. Over 2100 pages of documentation to dig through, and after having worked with Oracle for over 20 years, I can proudly say that I actually know most of it.

In comparison, Exasol is very easy to use and to manage, because the system largely takes care of itself. This is why our Admin class can be as short as two days and attendees still feel empowered to manage Exasol afterwards.

That was intentional from the start: Exasol customers are not supposed to study the database for years (or pay someone who did) in order to get great performance. Oracle realized that being complex and difficult to manage is an obstacle and came out with the Autonomous Database, but that is only available in the proprietary Oracle Cloud.


Using comparable hardware and processing the same (analytical) workload, Exasol outperforms any competitor. That includes Oracle on Exadata. Our Presales consultants regard Exadata as a sitting duck, waiting to get shot on a POC. I was personally shocked to learn that, after drinking the Oracle Kool-Aid myself for years.

In my opinion, these two points are most important: Exasol is faster and at the same time much easier to manage! I mean anything useless could be easy to manage, so that’s not an asset on its own. But together with delivering striking performance, that’s really a big deal.


This is and has always been a pain point for Oracle customers: The licensing of an Oracle database is so complex and fine-grained that you always wonder “Am I allowed to do this without violating my license? Do we really need these features that we paid for? Are we safe if Oracle does a License Audit?” With Exasol, all features are always included and the two most popular license types are easy to understand: You pay either for the data volume loaded into the cluster or for the amount of memory assigned to the database. No sleepless nights because of that!


This topic is becoming increasingly important as many of our new customers want to deploy Exasol in the cloud. And you may have noticed that Oracle has been pushing hard towards the cloud in recent years.

Exasol runs with all features enabled in the cloud: You can choose between Amazon Web Services (AWS), Microsoft Azure and ExaCloud.


AWS

This is presently the most popular way our customers run Exasol in the cloud. See here for more details.

MS Azure

Microsoft’s cloud can also be used to run Exasol, which gives you the option to choose between two major public cloud platforms. See here for more details.


ExaCloud

Hosted and managed by Exasol, ExaCloud is a full database-as-a-service offering. See here for more details.

Hybrid Exasol deployments that combine cloud with on-prem can also be used, just depending on customer requirements.

Oracle offers RAC only on the Oracle Cloud platform, not on public clouds. Various other features are also restricted to be available only in Oracle’s own cloud. The licensing model has been tweaked to favor the usage of Oracle’s own cloud over other public clouds generally.

Customer Experience

Customers love Exasol, as the recent Dresner report confirms. We get a perfect recommendation score. I can also tell that from personal encounters: Literally every customer I met is pleased with our product and our services!


Oracle is great for OLTP and okay for analytical workloads – especially if you pay extra for things like Partitioning, RAC, In-Memory Column Store and Exadata. Then the performance you get for your analytical workload might suit your present demand.

Exasol is totally bad for OLTP but best in the world for analytical workloads. Do you think your data volume and your analytic demands will grow?

Categories: DBA Blogs

Recover dropped tables with Virtual Access Restore in #Exasol

Fri, 2019-02-01 04:34

The technique to recover only certain objects from an ordinary backup is called Virtual Access Restore. This means you create a database from backup that contains only the minimum elements needed to access the objects you request. This database is removed again afterwards.

Let’s see an example. This is my initial setup:

EXAoperation Database page

One database in a 2+1 cluster. Yes it’s tiny because it lives on my notebook in VirtualBox. See here how you can get that too.

It uses the data volume v0000 and I took a backup into the archive volume v0002 already.

EXAoperation volumes

I have a schema named RETAIL there with the table SALES:


By mistake, that table gets dropped:

drop table

And I’m on AUTOCOMMIT, otherwise this could be rolled back in Exasol. Virtual Access Restore to the rescue!
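
With AUTOCOMMIT switched off, the accidental drop could have been undone in the same session, as stated above. A sketch, assuming the client session runs with AUTOCOMMIT disabled:

```sql
-- Assumes AUTOCOMMIT is off in this session
DROP TABLE retail.sales;

-- Nothing has been committed yet, so the DROP can still be undone
ROLLBACK;

-- The table is available again
SELECT COUNT(*) FROM retail.sales;
```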

First I need another data volume:

second data volume

Notice the size of the new volume: It is smaller than the overall size of the backup or of the “production database”! I did that to prove that space is not much of a concern here.

Then I add a second database to the cluster that uses that volume. The connection port (8564) must be different from the port used by the first database and the DB RAM in total must not exceed the licensed size, which is limited to 4 GB RAM in my case:

second database

I did not start that database because for the restore procedure it has to be down anyway. Clicking on the DB Name and then on the Backups button gets me here:

Foreign database backups

No backup shown yet because I didn’t take any backups with exa_db2. Clicking on Show foreign database backups:

Backup choice

The Expiration date must be empty for a Virtual Access Restore, so I just remove it and click Apply. Then I select the Restore Type as Virtual Access and click Restore:

Virtual Access Restore

This will automatically start the second database:

Two databases in one cluster

I connect to exa_db2 with EXAplus, where the Schema Browser gives me the DDL for the table SALES:

ExaPlus Schema Browser get DDL

I take that to exa_db1 and run it there, which gives me the table back, but empty. Next I create a connection from exa_db1 to exa_db2 and import the table:

create connection exa_db2 
to '' 
user 'sys' identified by 'exasol';

import into retail.sales 
from exa at exa_db2 
table retail.sales;

This took about 2 minutes:


The second database and then the second data volume can now be dropped. Problem solved!
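
Before removing exa_db2, it may be worth confirming on exa_db1 that the import brought back the expected rows and cleaning up the connection object. A minimal sketch using the names from this example:

```sql
-- Compare the result with the row count you expect for the table
SELECT COUNT(*) FROM retail.sales;

-- The connection object is no longer needed once the import is done
DROP CONNECTION exa_db2;
```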


Categories: DBA Blogs

Understanding Partitioning in #Exasol

Wed, 2018-12-12 08:40

Exasol introduced Partitioning in version 6.1. This feature helps to improve the performance of statements accessing large tables. As an example, let’s take these two tables:

Say t2 is too large to fit in memory and is therefore a candidate for partitioning.

In contrast to distribution, partitioning should be done on columns that are used for filtering:
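
A partitioning statement for this example might look like the following sketch, assuming the table t2 and its filter column WhereCol as described in this post:

```sql
-- Partition t2 on the column used in WHERE conditions
ALTER TABLE t2 PARTITION BY WhereCol;
```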


Now without taking distribution into account (on a one-node cluster), the table t2 looks like this:

Notice that partitioning changes the way the table is physically ordered on disk.

A statement like

SELECT * FROM t2 WHERE WhereCol='A';

would have to load only the red part of the table into memory. This may show benefits on a one-node cluster as well as on multi-node clusters. On a multi-node cluster, a large table like t2 is distributed across the active nodes. It can additionally be partitioned. Should the two tables reside on a three-node cluster with distribution on the JoinCol columns and the table t2 partitioned on the WhereCol column, they look like this:

That way, each node has to load a smaller portion of the table into memory if statements are executed that filter on the WhereCol column while joins on the JoinCol column are still local joins.

EXA_(USER|ALL|DBA)_TABLES shows both the distribution key and the partition key if any.
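
A catalog check could look like the sketch below. The column names TABLE_HAS_DISTRIBUTION_KEY, TABLE_HAS_PARTITION_KEY, COLUMN_IS_DISTRIBUTION_KEY and COLUMN_PARTITION_KEY_ORDINAL_POSITION are assumptions about the system views and may differ between Exasol versions.

```sql
-- Which of the example tables have a distribution or partition key?
SELECT table_name,
       table_has_distribution_key,
       table_has_partition_key
FROM exa_all_tables
WHERE table_name IN ('T1', 'T2');

-- Which columns make up those keys?
SELECT column_table,
       column_name,
       column_is_distribution_key,
       column_partition_key_ordinal_position
FROM exa_all_columns
WHERE column_table IN ('T1', 'T2')
ORDER BY column_table, column_ordinal_position;
```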

Notice that Exasol will automatically create an appropriate number of partitions – you don’t have to specify that.

Categories: DBA Blogs

Accelerate your #BI Performance with #Exasol

Tue, 2018-12-11 02:15

Your BI users complain about slow performance of their analytical queries? Is this your Status Quo?

Tableau was taken as a popular example for ad hoc analytics, but it might be any of the others like MicroStrategy, Looker, you name it. The good news is that this problem can be solved quite easily and without having to spend a fortune trying to speed up your legacy DWH to keep up with the BI demands:

Exasol High Performance Sidecar

Using Exasol as a High Performance Sidecar to take away the pain from your BI users is the easy and fast cure for your problem! This is actually the most common way Exasol arrives at companies. More often than not, this leads to a complete replacement of the legacy DWH by Exasol:

Exasol replaces legacy DWH

That’s what adidas, Otto and Zalando did, to name a few of our customers.

Don’t take our word for it, we are more than happy to do a PoC!

Categories: DBA Blogs

Understanding Distribution in #Exasol

Thu, 2018-10-04 04:12
Exasol doesn’t need much administration but getting distribution right matters

Exasol uses a clustered shared-nothing architecture with many sophisticated internal mechanisms to deliver outstanding performance without requiring much administration. Getting the distribution of rows between cluster nodes right is one of the few critical tasks left, though. To explain this, let’s say we have two tables t1 and t2:

The two tables are joined on the column JoinCol, while WHERE conditions for filtering are done with the column WhereCol. Other columns are not shown to keep the sketches small and simple. Now say these two tables are stored on a three-node cluster. Again, for simplicity only active nodes are on the sketch – no reserve nodes or license nodes. We also ignore the fact that small tables will be replicated across all active nodes.

Distribution will be random if no distribution key is specified

Without specifying a distribution key, the rows of the tables are distributed randomly across the nodes like this:

Absence of proper distribution keys: global joins

The two tables are then joined:

SELECT <something> FROM t1 JOIN t2 ON t1.JoinCol = t2.JoinCol;

Internally, this is processed as a global join which means network communication between the nodes on behalf of the join is required. This is the case because some rows do not find local join partners on the same node:

Distribution on join columns: local joins

If the two tables were distributed on their join columns with statements like these

ALTER TABLE t1 DISTRIBUTE BY JoinCol;
ALTER TABLE t2 DISTRIBUTE BY JoinCol;

then the same query can be processed internally as a local join:

Here every row finds a local join partner on the same node so no network communication between the nodes on behalf of the join is required. The performance with this local join is much better than with the global join although it’s the same statement as before.
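To make the difference between global and local joins tangible, here is a small Python sketch. It is a toy model under stated assumptions: three nodes, round-robin placement as a stand-in for random distribution, and a simple modulo in place of Exasol's internal hash function. The names home_node, distribute and remote_partners are made up for this illustration.

```python
# Toy model: rows are placed on nodes either round-robin or by hashing a key.
# The modulo "hash" is a stand-in, not Exasol's internal function.
NODES = 3

def home_node(value):
    """Map a distribution-key value to a node number (0..NODES-1)."""
    return value % NODES

def distribute(rows, key=None):
    """Place rows on nodes: by hash of `key`, or round-robin if key is None."""
    placement = {n: [] for n in range(NODES)}
    for i, row in enumerate(rows):
        node = home_node(row[key]) if key is not None else i % NODES
        placement[node].append(row)
    return placement

def remote_partners(p1, p2):
    """Count join pairs on JoinCol whose two rows sit on different nodes."""
    remote = 0
    for n1, rows1 in p1.items():
        for r1 in rows1:
            for n2, rows2 in p2.items():
                for r2 in rows2:
                    if r1["JoinCol"] == r2["JoinCol"] and n1 != n2:
                        remote += 1
    return remote

# Hypothetical sample data for t1 and t2
t1 = [{"JoinCol": j} for j in range(1, 7)]
t2 = [{"JoinCol": j} for j in (6, 5, 4, 3, 2, 1, 1)]

random_like = remote_partners(distribute(t1), distribute(t2))
on_join_col = remote_partners(distribute(t1, "JoinCol"), distribute(t2, "JoinCol"))
print("remote join pairs without distribution key:", random_like)
print("remote join pairs with DISTRIBUTE BY JoinCol:", on_join_col)
```

With the join column as distribution key, no join pair spans two nodes in this model, which is what makes the local join possible.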

Why you shouldn’t distribute on WHERE-columns

While it’s generally a good idea to distribute on JOIN-columns, it’s by contrast a bad idea to distribute on columns that are used for filtering with WHERE conditions. If both tables had been distributed on the WhereCol columns, it would look like this:

This distribution is actually worse than the initial random distribution! Not only does this cause global joins between the two tables as already explained, statements such as

<Any DQL or DML> WHERE t2.WhereCol='A';

will utilize only one node (the first with this WHERE condition), and that effectively disables one of Exasol’s greatest strengths, the Massively Parallel Processing (MPP) functionality. This distribution leads to poor performance because all other nodes in the cluster sit idle while one node has to do all the work alone.

Examine existing distribution with iproc()

The function iproc() helps to investigate the existing distribution of rows across cluster nodes. This statement shows the distribution of the table t1:

SELECT iproc(),COUNT(*) FROM t1 GROUP BY 1 ORDER BY 1;

Evaluate the effect of distribution keys with value2proc()

The function value2proc() can be used to display the effect that a (new) distribution key would have:

SELECT home_node,COUNT(*) FROM (SELECT value2proc(JoinCol) AS home_node FROM t1) GROUP BY 1 ORDER BY 1;
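The same kind of check can be mimicked outside the database. The Python sketch below counts how many rows each node would hold for two candidate distribution keys. It assumes three nodes and uses crc32 as a stand-in hash, which is certainly not what value2proc() does internally. The table contents and the helper names are hypothetical.

```python
import zlib
from collections import Counter

NODES = 3

def home_node(value):
    """Stand-in for value2proc(): hash a value to a node number.
    crc32 is an assumption for illustration, not Exasol's hash function."""
    return zlib.crc32(str(value).encode()) % NODES

def rows_per_node(rows, column):
    """Rows each node would hold if the table were distributed by `column`."""
    counts = Counter(home_node(row[column]) for row in rows)
    return [counts.get(n, 0) for n in range(NODES)]

# Hypothetical table: a high-cardinality join column and a two-valued filter column.
t1 = [{"JoinCol": j, "WhereCol": "A" if j % 2 else "B"} for j in range(1, 3001)]

by_join = rows_per_node(t1, "JoinCol")
by_where = rows_per_node(t1, "WhereCol")
print("by JoinCol :", by_join)
print("by WhereCol:", by_where)

# With WhereCol as distribution key, all rows with WhereCol='A' share one home node.
nodes_for_a = {home_node(r["WhereCol"]) for r in t1 if r["WhereCol"] == "A"}
print("nodes working on WhereCol='A':", len(nodes_for_a), "of", NODES)
```

A column with only two distinct values can occupy at most two nodes, and a filter on one value lands on a single node. That is the idle-cluster effect described above.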

Distribution on JOIN-columns leads to local joins which perform better than global joins: Do that!

Distribution on WHERE-columns leads to global joins and disables the MPP functionality, both causing poor performance: Don’t do that!

Categories: DBA Blogs

Using the Query Cache for good performance in #Exasol

Mon, 2018-10-01 08:52

The result of a query can be cached in Exasol, so that repeated identical queries complete in no time. This feature was introduced in version 5 and is enabled by default.

SQL_EXA> select session_value,system_value  from exa_parameters where parameter_name='QUERY_CACHE';
EXA: select session_value,system_value  from exa_parameters where parameter...

SESSION_VALUE        SYSTEM_VALUE
-------------------- --------------------
ON                   ON

1 row in resultset.

The Query Cache can be (de-)activated on the session level as well as on the system level.

SQL_EXA> alter session set query_cache='off';
EXA: alter session set query_cache='off';

Rows affected: 0
SQL_EXA> select object_name,mem_object_size/1024/1024 as mb from exa_user_object_sizes where object_name='T';
EXA: select object_name,mem_object_size/1024/1024 as mb from exa_user_objec...

OBJECT_NAME          MB
-------------------- ----------------------------------
T                             1537.49641990661621093750

1 row in resultset.

SQL_EXA> select count(*) from t;
EXA: select count(*) from t;


1 row in resultset.
SQL_EXA> set timing on;
SQL_EXA> select count(*) from t where numcol2=42;
EXA: select count(*) from t where numcol2=42;


1 row in resultset.

Timing element: 4
Elapsed: 00:00:03.022

SQL_EXA> select count(*) from t where numcol2=42;
EXA: select count(*) from t where numcol2=42;


1 row in resultset.

Timing element: 5
Elapsed: 00:00:02.620

SQL_EXA> select count(*) from t where numcol2=42;
EXA: select count(*) from t where numcol2=42;


1 row in resultset.

Timing element: 6
Elapsed: 00:00:02.724

Without using the Query Cache the repeated query takes roughly 3 seconds.

SQL_EXA> alter session set query_cache='on';
EXA: alter session set query_cache='on';

Rows affected: 0

Timing element: 7
Elapsed: 00:00:00.008

SQL_EXA> select count(*) from t where numcol2=42;
EXA: select count(*) from t where numcol2=42;


1 row in resultset.

Timing element: 8
Elapsed: 00:00:00.009

Lightning fast! If statement profiling is enabled, QUERY CACHE RESULT shows as PART_NAME in tables like EXA_USER_PROFILE_LAST_DAY.
Also EXECUTION_MODE from EXA_SQL_LAST_DAY shows the usage of the Query Cache:

SQL_EXA> set timing off;
SQL_EXA> select current_statement;
EXA: select current_statement;


1 row in resultset.

SQL_EXA> show autocommit;
SQL_EXA> select command_name,duration,row_count,execution_mode from exa_sql_last_day where stmt_id=33 and session_id=current_session;
EXA: select command_name,duration,row_count,execution_mode from exa_sql_las...

COMMAND_NAME                             DURATION    ROW_COUNT             EXECUTION_MODE
---------------------------------------- ----------- --------------------- --------------------
SELECT                                         0.005                     1 CACHED

1 row in resultset.

If DML changes the table, the result in the Query Cache is invalidated automatically:

SQL_EXA> update t set numcol2=1 where rowid in (select rowid from t limit 1);
EXA: update t set numcol2=1 where rowid in (select rowid from t limit 1);
Rows affected: 1

SQL_EXA> set timing on;
SQL_EXA> select count(*) from t where numcol2=42;
EXA: select count(*) from t where numcol2=42;


1 row in resultset.

Timing element: 10
Elapsed: 00:00:02.870

SQL_EXA> set timing off;
SQL_EXA> select current_statement;
EXA: select current_statement;


1 row in resultset.

SQL_EXA> select command_name,duration,row_count,execution_mode from exa_sql_last_day where stmt_id=49 and session_id=current_session;
EXA: select command_name,duration,row_count,execution_mode from exa_sql_las...

COMMAND_NAME                             DURATION    ROW_COUNT             EXECUTION_MODE
---------------------------------------- ----------- --------------------- --------------------

0 rows in resultset.

The statistics tables are synced at one-minute intervals, which is why the statement doesn’t show up yet. The sync can be triggered manually, though:

SQL_EXA> flush statistics;
EXA: flush statistics;

Rows affected: 0

SQL_EXA> commit;
EXA: commit;

Rows affected: 0

SQL_EXA> select command_name,duration,row_count,execution_mode from exa_sql_last_day where stmt_id=49 and session_id=current_session;
EXA: select command_name,duration,row_count,execution_mode from exa_sql_las...

COMMAND_NAME                             DURATION    ROW_COUNT             EXECUTION_MODE
---------------------------------------- ----------- --------------------- --------------------
SELECT                                         2.862                     1 EXECUTE

1 row in resultset.

The runtime and EXECUTION_MODE EXECUTE confirm that the Query Cache was invalidated by the UPDATE above. That execution refreshed the cache automatically, so the next run is fast again:

SQL_EXA> set timing on;
SQL_EXA> select count(*) from t where numcol2=42;
EXA: select count(*) from t where numcol2=42;


1 row in resultset.

Timing element: 11
Elapsed: 00:00:00.010
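
The observable behavior shown in this post can be mimicked with a few lines of Python. This is only a toy model and not how Exasol implements its Query Cache: it assumes the cache is keyed by the exact SQL text and that any DML simply drops all cached results. SQLite from the Python standard library stands in for the database, and the class name CachedConnection is made up.

```python
import sqlite3

class CachedConnection:
    """Toy result cache: keyed by SQL text, cleared whenever DML runs.
    This only mimics the observable behavior, not Exasol's implementation."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}
        self.enabled = True  # plays the role of the query_cache session parameter

    def query(self, sql):
        """Return (rows, execution_mode) for a SELECT statement."""
        if self.enabled and sql in self.cache:
            return self.cache[sql], "CACHED"
        rows = self.conn.execute(sql).fetchall()
        if self.enabled:
            self.cache[sql] = rows
        return rows, "EXECUTE"

    def dml(self, sql):
        """Run a changing statement and invalidate every cached result."""
        self.conn.execute(sql)
        self.cache.clear()

db = CachedConnection(sqlite3.connect(":memory:"))
db.dml("create table t (numcol2 integer)")
db.dml("insert into t values (42), (42), (7)")

q = "select count(*) from t where numcol2=42"
first = db.query(q)    # has to be executed
second = db.query(q)   # identical text: served from the cache
db.dml("update t set numcol2=1 where rowid = (select rowid from t limit 1)")
third = db.query(q)    # cache was invalidated by the update
print(first[1], second[1], third[1])
```

The three execution modes follow the same pattern as the EXECUTION_MODE values in the transcript: executed, cached, then executed again after the UPDATE.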
Categories: DBA Blogs

#Exasol Database whoami

Wed, 2018-09-19 02:03

This little script displays some useful meta-information:

SQL_EXA> create schema myschema;
EXA: create schema myschema;

Rows affected: 0

SQL_EXA> create or replace script whoami as
 output('Current User: '.. tostring(exa.meta.current_user))
 output('Current Schema: '.. tostring(exa.meta.current_schema))
 output('Session ID: '.. tostring(exa.meta.session_id))
 output('Database Version: '.. tostring(exa.meta.database_version))
 output('Number of Nodes: '.. tostring(exa.meta.node_count))
 /
EXA:create or replace script whoami as...

Rows affected: 0
SQL_EXA> col output for a40;
COLUMN   output ON
FORMAT   a40
SQL_EXA> execute script whoami with output;
EXA: execute script whoami with output;

Current User: SYS
Current Schema: MYSCHEMA
Session ID: 1612024483893367379
Database Version: 6.1.0-alpha1
Number of Nodes: 1

5 rows in resultset.

All available metadata is documented here (Chapter 3. Concepts -> 3.6 UDF scripts).

Categories: DBA Blogs

Scripts for Batch-Processing using the Data Dictionary in #Exasol

Mon, 2018-08-13 05:59

If you want to batch-process over a number of objects in Exasol, scripts that work with the Data Dictionary might do the trick. Let’s say I want to drop all of my tables that start with the letter P:


SQL_EXA> open schema fred;
EXA: open schema fred;

Rows affected: 0

SQL_EXA> create or replace script droptables as
          -- drop every table of the current user whose name starts with P
          res=query([[select * from exa_user_tables where table_name like 'P%']])
          for i=1, #res
          do
           query([[drop table ::t]],{t=res[i].TABLE_NAME})
          end
          /
EXA: create or replace script droptables...

Rows affected: 0

SQL_EXA> select table_name from exa_user_tables;
EXA: select table_name from exa_user_tables;


2 rows in resultset.

Amazingly enough, DROP TABLE is an operation that can be rolled back in Exasol. As a safeguard, I disable autocommit before running the script:

SQL_EXA> set autocommit off;
SQL_EXA> execute script droptables;
EXA: execute script droptables;

Rows affected: 0

SQL_EXA> select table_name from exa_user_tables;
EXA: select table_name from exa_user_tables;


0 rows in resultset.

SQL_EXA> rollback;
EXA: rollback;

Rows affected: 0

SQL_EXA> select table_name from exa_user_tables;
EXA: select table_name from exa_user_tables;


2 rows in resultset.
Categories: DBA Blogs

#Exasol Cluster Architecture

Mon, 2018-07-16 11:57

This article gives a more detailed view on the Exasol Cluster Architecture. A high level view is provided here.

Exasol Cluster Nodes: Hardware

An Exasol Cluster is built from commodity Intel servers without any particularly expensive components. SAS hard drives and Ethernet cards are sufficient. In particular, there is no need for an additional storage layer like a SAN.

See here for a list of Exasol Certified Servers.

Disk layout

As a best practice the hard drives of Exasol Cluster nodes are configured as RAID 1 pairs. Each cluster node holds four different areas on disk:

1. OS with 50 GB size containing CentOS Linux, EXAClusterOS and the Exasol database executables

2. Swap with 4 GB size

3. Data with 50 GB size containing log files, core dumps and BucketFS

4. Storage consuming the remaining capacity of the hard drives for the Data Volumes and Archive Volumes

The first three areas can be stored on dedicated disks, in which case these disks are also configured as RAID 1 pairs, usually with a smaller size than those that contain the volumes. More common than dedicated disks are servers with only one type of disk. These are configured as hardware RAID 1 pairs. On top of that, software RAID 0 partitions are striped across all disks to hold the OS, Swap and Data areas.

Exasol 4+1 Cluster: Software Layers

This popular multi-node cluster serves as an example to illustrate the concepts explained. It is called a 4+1 cluster because it has 4 Active nodes and 1 Reserve node. Active and Reserve nodes have the same layers of software available. The purpose of the Reserve node is explained here. Upon cluster installation, the License Server copies these layers as tarballs across the private network to the other nodes. The License Server is the only node in the cluster that boots from disk. Upon cluster startup, it provides the required software layers to the other cluster nodes.

Exasol License Essentials

There are three types of licenses available:

Database RAM License: This most commonly used model specifies the total amount of RAM that can be assigned to databases in the cluster.

Raw Data License: Specifies the maximum size of the raw data you can store across databases in the cluster.

Memory Data License: Specifies the maximum size of the compressed data you can store across all databases.

For licenses based on RAM, Exasol checks the RAM assignment at the start of the database. If the RAM in use exceeds the maximum RAM specified by the license, the database will not start.

For licenses based on data size (raw data license and memory data license), Exasol periodically checks the size of the data. If the data size exceeds the value specified in the license, the database does not permit any further data insertion until the usage drops below the specified value.

Customers receive their license as a separate file. To activate the license, these license files are uploaded to the License Server using EXAoperation.

EXAStorage volumes

Storage Volumes are created with EXAoperation on specified nodes.

EXAStorage provides two kinds of volumes:

Data volumes:

Each database needs one volume for persistent data and one temporary volume for temporary data.

While the temporary volume is automatically created by a database process, the persistent data volume has to be created by an Exasol Administrator upon database creation.

Archive volumes:

Archive volumes are used to store backup files of an Exasol database.

Exasol 4+1 Cluster: Data & Archive Volume distribution

Data Volumes and Archive Volumes are hosted on the hard drives of the active nodes of a cluster.

They consume most of the capacity of these drives. The license server usually hosts EXAoperation.

EXAoperation Essentials

EXAoperation is the main management GUI for Exasol Clusters, consisting of an Application Server and a small Configuration Database, both located on the License Server under normal circumstances. EXAoperation can be accessed from all Cluster Nodes via HTTPS. Should the License Server go down, EXAoperation will fail over to another node, while the availability of the Exasol database is not affected at all.

Shared-nothing architecture (MPP processing)

Exasol was developed as a parallel system and is constructed according to the shared-nothing principle. Data is distributed across all nodes in a cluster. When responding to queries, all nodes co-operate and special parallel algorithms ensure that most data is processed locally in each individual node’s main memory.

When a query is sent to the system, it is first accepted by the node the client is connected to. The query is then distributed to all nodes. Intelligent algorithms optimize the query, determine the best plan of action and generate needed indexes on the fly. The system then processes the partial results based on the local datasets. This processing paradigm is also known as SPMD (single program multiple data). All cluster nodes operate on an equal basis; there is no Master Node. The global query result is delivered back to the user through the original connection.
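
The SPMD idea can be sketched in a few lines of Python. This is a strong simplification under stated assumptions: four nodes, round-robin placement of a made-up table t, and a plain sum as the final merge step. The function name local_count is hypothetical, and nothing here reflects Exasol's actual execution engine.

```python
# Each "node" holds its own slice of table t; there is no master copy.
# Round-robin placement and the sample data are assumptions for illustration.
NODES = 4
rows = [{"numcol2": v} for v in (42, 7, 42, 13, 42, 42, 8, 42, 1)]
slices = [rows[n::NODES] for n in range(NODES)]

def local_count(node_rows, wanted):
    """Runs identically on every node (single program), on local data only."""
    return sum(1 for r in node_rows if r["numcol2"] == wanted)

# Step 1: every node computes a partial result from its local slice.
partials = [local_count(s, 42) for s in slices]

# Step 2: the partial results are combined into the global result,
# which goes back to the client through its original connection.
total = sum(partials)
print("partial counts per node:", partials)
print("select count(*) from t where numcol2=42 ->", total)
```

Every node runs the same program on different data, and only the small partial results have to travel over the network.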

The picture above shows a cluster with 4 data nodes and one reserve node. The license server is the only server that boots from disk. It provides the OS used by the other nodes over the network.

Exasol uses a shared nothing architecture. The data stored in this database is symbolized with A,B,C,D to indicate that each node contains a different part of the database data. The active nodes n11-n14 each host database instances that operate on their part of the database locally in an MPP way. These instances communicate and coordinate over the private network.

Exasol Network Essentials

Each cluster node needs at least two network connections: one for the Public Network and one for the Private Network. The Public Network is used for client connections; 1 Gb Ethernet is usually sufficient. The Private Network is used for the Cluster Interconnect of the nodes; 10 Gb Ethernet or higher is recommended. Optionally, the Private Network can be separated into one Database Network (database instances communicate over it) and one Storage Network (mirrored segments are synchronized over it).

Exasol Redundancy Essentials

Redundancy is an attribute that can be set upon EXAStorage Volume creation. It specifies the number of copies of the data that is hosted on Active Cluster nodes. In practice this is either Redundancy 1 or Redundancy 2. Redundancy 1 means there is no redundancy, so if one node fails, the volume with that redundancy is no longer available. Typically that is only seen with one-node Clusters. Redundancy 2 means that each node holds a copy of data that is operated on by a neighbor node, so the volume remains available if one node fails.

Exasol 4+1 Cluster: Redundancy 2

If volumes are configured with redundancy 2, which is a best practice, then each node holds a mirror of data that is operated on by a neighbor node. If, for example, n11 modifies A, the mirror A’ on n12 is synchronized over the private network. Should an active node fail, the reserve node will step in and start an instance.
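
The neighbor mirroring can be written down as a small Python sketch. It assumes the segment names A to D on nodes n11 to n14 from the example and a ring layout in which the last node mirrors onto the first one. The wrap-around is my assumption for illustration, and the function names are made up.

```python
# Active nodes of the 4+1 example and the segment each one operates on.
active = ["n11", "n12", "n13", "n14"]
segment = {"n11": "A", "n12": "B", "n13": "C", "n14": "D"}

def mirror_holder(node):
    """Neighbor that stores the mirror of `node`'s segment.
    The wrap-around from the last node to the first is an assumption."""
    i = active.index(node)
    return active[(i + 1) % len(active)]

def after_failure(failed):
    """Which copies survive, and where, when one active node fails."""
    lost_mirror_of = next(n for n in active if mirror_holder(n) == failed)
    return {
        "segment served from mirror": segment[failed] + "' on " + mirror_holder(failed),
        "segment left without mirror": segment[lost_mirror_of] + " on " + lost_mirror_of,
    }

for node in active:
    print(segment[node], "on", node, "-> mirror on", mirror_holder(node))
print(after_failure("n12"))
```

Losing one node therefore never loses data with redundancy 2: one segment survives only as a mirror, and another one temporarily has no mirror.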


Categories: DBA Blogs

#Exasol Fail-Safety explained

Thu, 2018-07-05 10:26

The building blocks of an Exasol cluster are commodity Intel servers, for example a Dell PowerEdge R740 with 96 GB RAM, 12 x 1.2 TB SAS hot-plug hard drives and 2 x 10 Gb Ethernet cards for the private network. That’s sufficient to deliver outstanding performance combined with high availability. The picture below shows a 4+1 cluster, one of our most popular configurations:

Exasol 4+1 Cluster: Shared Nothing Architecture

Each active node hosts one database instance that works on its part of the database (A,B,C,D) in an MPP way. The instances communicate over the private network. Optionally, the private network can be separated into one database network and one storage network. In this case, the instances communicate over the database network. Notice that the instances access their part of the database directly on their local hard drives; they need neither the private network nor the storage network for that. The reserve node becomes relevant only if one of the active nodes fails. The local hard drives are set up in RAID 1 pairs, so single disk failures can be tolerated without losing database availability. Not shown is the license node that is required to boot the cluster initially. After that, the license node is no longer required to keep the cluster running.

If data volumes with redundancy 2 are in use – which is the most common case – then each node holds a copy of the data operated on by a neighbor node:

Exasol 4+1 Cluster: Redundancy 2

If a Master-Segment like A is modified, the Slave-Segment (A’) is synchronized accordingly over the private network or, if configured, over the storage network.

Availability comes at a price: The raw disk capacity is halved by the RAID 1 mirroring and halved again by redundancy 2, so you are left with approximately 1/4 of your raw disk capacity (the Linux OS and the database software also require a small amount of disk space). But since we are running on commodity hardware, with no storage servers, no SAN and no SSDs required, this is actually a very competitive price.
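
For the example server mentioned at the beginning (12 x 1.2 TB drives per node), the arithmetic can be checked with a short Python sketch. The function name is made up, and the calculation deliberately ignores the small OS, Swap and Data areas.

```python
def usable_capacity_tb(disks, disk_tb, raid1=True, redundancy=2):
    """Rough usable data capacity per node, ignoring OS, swap and data areas."""
    raw = disks * disk_tb
    after_raid = raw / 2 if raid1 else raw   # RAID 1 stores every block twice
    return raw, after_raid / redundancy      # redundancy 2 keeps a mirror copy

raw, usable = usable_capacity_tb(disks=12, disk_tb=1.2)
print(f"raw per node: {raw:.1f} TB, usable per node: {usable:.1f} TB")
print(f"usable in a cluster with 4 active nodes: {4 * usable:.1f} TB")
```

So from 14.4 TB raw capacity per node, roughly 3.6 TB remain for database data.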

Now what if one node fails?

Exasol 4+1 Cluster: Node failure

EXAClusterOS, Exasol’s clusterware, will detect the node failure within seconds and shut down all remaining database instances in order to preserve a consistent state of the database. It then restarts them on the 3 nodes that are still available and also on the Reserve node, which now becomes an Active node too. The database becomes available again, with node n15 immediately working with segment B’.

The downtime of the system caused by the node failure is typically below 30 seconds. The restart of the database starts a timeout called Restore Delay, which defaults to 10 minutes. If the failed node becomes available again within that time, we will just re-synchronize the segments (A’ and B in the example), which can be done fast. The instance on n15 will then work with the segment B as a Master-Segment until the cluster is manually restarted. Then n15 becomes a reserve node again and n12 is active with an instance running there.

If the failed node doesn’t come back within Restore Delay:

Exasol 4+1 Cluster: Restore Delay is over

We will then create new segments on node n15: A’ is copied from n11 and B is copied from n13. This activity is time-consuming and puts a significant load on the private network, which is why configuring a dedicated storage network may be beneficial to avoid a drop in performance during that period. A new reserve node should now be added to the cluster, replacing the crashed n12.
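
The two recovery paths boil down to one comparison against Restore Delay. Here is a minimal Python sketch of that decision. It assumes the default of 10 minutes named above; the function name and the wording of the returned actions are mine, not EXAClusterOS output.

```python
RESTORE_DELAY_S = 10 * 60  # default named in the text: 10 minutes

def recovery_action(node_back_after_s):
    """Pick the recovery path from how long the failed node stayed away.
    Pass None if the node never comes back."""
    if node_back_after_s is not None and node_back_after_s <= RESTORE_DELAY_S:
        return "re-synchronize segments on the returned node (fast)"
    return "copy segments to the former reserve node (slow, loads the private network)"

for outage in (120, 599, 3600, None):
    print(outage, "->", recovery_action(outage))
```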

Categories: DBA Blogs

Users, schemas & privileges in #Exasol

Wed, 2018-05-23 09:18

In Exasol, a database user may own multiple schemas – or even none at all. I connect to my Community Edition to show that:

C:\Users\uh>cd \Program Files (x86)\EXASOL\EXASolution-6.0\EXAplus

C:\Program Files (x86)\EXASOL\EXASolution-6.0\EXAplus>exaplusx64 -c -u sys -p exasol -lang EN

EXAplus 6.0.8 (c) EXASOL AG

Wednesday, May 23, 2018 3:28:29 PM CEST
Connected to database EXAone as user sys.
EXASolution 6.0.8 (c) EXASOL AG

SQL_EXA> create user adam identified by adam;
EXA: create user adam identified by adam;

Rows affected: 0

SQL_EXA> grant dba to adam;
EXA: grant dba to adam;

Rows affected: 0

SQL_EXA> select user_name from exa_dba_users;
EXA: select user_name from exa_dba_users;


2 rows in resultset.

SQL_EXA> select schema_owner,schema_name from exa_schemas;
EXA: select schema_owner,schema_name from exa_schemas;


1 row in resultset.

SQL_EXA> connect adam/ADAM;

Wednesday, May 23, 2018 3:34:42 PM CEST
Connected to database EXAone as user adam.
EXASolution 6.0.8 (c) EXASOL AG

SQL_EXA> create table t1 (n number);
EXA: create table t1 (n number);
Error: [42000] no schema specified or opened or current schema has been dropped [line 1, column 27] (Session: 1601269589413551548)
SQL_EXA> open schema adam;
EXA: open schema adam;
Error: [42000] schema ADAM not found [line 1, column 13] (Session: 1601269589413551548)

Demo user adam has the DBA role granted but there is no adam schema yet. I need to create it first:

SQL_EXA> create schema adam;
EXA: create schema adam;

Rows affected: 0

SQL_EXA> open schema adam;
EXA: open schema adam;

Rows affected: 0

SQL_EXA> create table t1 (n number);
EXA: create table t1 (n number);

Rows affected: 0

SQL_EXA> create schema adam2;
EXA: create schema adam2;

Rows affected: 0

SQL_EXA> create table adam2.t2 (n number);
EXA: create table adam2.t2 (n number);

Rows affected: 0

SQL_EXA> select table_schema,table_name from exa_user_tables;
EXA: select table_schema,table_name from exa_user_tables;


2 rows in resultset.

As you can see, user adam now has two schemas with different tables in them. Now briefly to privileges:

SQL_EXA> create user fred identified by fred;
EXA: create user fred identified by fred;

Rows affected: 0

SQL_EXA> grant create session to fred;
EXA: grant create session to fred;

Rows affected: 0

SQL_EXA> grant select on adam.t1 to fred;
EXA: grant select on adam.t1 to fred;

Rows affected: 0

SQL_EXA> connect fred/FRED;

Wednesday, May 23, 2018 3:53:34 PM CEST
Connected to database EXAone as user fred.
EXASolution 6.0.8 (c) EXASOL AG

SQL_EXA> select * from adam.t1;
EXA: select * from adam.t1;


0 rows in resultset.

SQL_EXA> select * from adam2.t2;
EXA: select * from adam2.t2;
Error: [42500] insufficient privileges: SELECT on table T2 (Session: 1601270776421928841)
SQL_EXA> connect adam/ADAM;

Wednesday, May 23, 2018 3:54:33 PM CEST
Connected to database EXAone as user adam.
EXASolution 6.0.8 (c) EXASOL AG

SQL_EXA> create role allonadam2;
EXA: create role allonadam2;

Rows affected: 0

SQL_EXA> grant all on adam2 to allonadam2;
EXA: grant all on adam2 to allonadam2;

Rows affected: 0

SQL_EXA> grant allonadam2 to fred;
EXA: grant allonadam2 to fred;

Rows affected: 0

SQL_EXA> connect fred/FRED;

Wednesday, May 23, 2018 3:55:54 PM CEST
Connected to database EXAone as user fred.
EXASolution 6.0.8 (c) EXASOL AG

SQL_EXA> select * from adam2.t2;
EXA: select * from adam2.t2;


0 rows in resultset.

SQL_EXA> drop table adam2.t2;
EXA: drop table adam2.t2;
Error: [42500] insufficient privileges for dropping table (Session: 1601270923042332982)

That’s because ALL contains ALTER, DELETE, EXECUTE, INSERT, SELECT and UPDATE but not DROP, which can be confirmed using EXA_DBA_OBJ_PRIVS.

Categories: DBA Blogs

How to cancel SQL statements and disconnect sessions in #PostgreSQL

Mon, 2018-01-08 12:29

In PostgreSQL, you can cancel problem statements or terminate offending sessions remotely with PG_CANCEL_BACKEND and PG_TERMINATE_BACKEND. This article shows how you can do it with working examples.

edb=# select current_database(),current_user;
 current_database | current_user 
------------------+--------------
 edb              | enterprisedb
(1 row)

I’m connected as superuser. The demo is done with EDB Postgres Advanced Server 10.1.5 but the shown technique should work the same with other Postgres distributions and older versions. First I create a demo user:

edb=# create role adam password 'adam' login;
edb=# grant connect on database edb to adam;

Now opening another session with that new user:

-bash-4.2$ psql -U adam
Password for user adam: 
psql.bin (10.1.5)
Type "help" for help.

edb=> begin
edb$> loop
edb$> null;
edb$> end loop;
edb$> end;

That session now burns CPU in an endless loop. Back to the superuser session:

edb=# select pid,usename,query
edb-# from pg_catalog.pg_stat_activity where datname='edb';
  pid  |   usename    |                         query                         
 14346 | adam         | begin                                                +
       |              | loop                                                 +
       |              | null;                                                +
       |              | end loop;                                            +
       |              | end;
  5517 | enterprisedb | select pid,usename,query                             +
       |              | from pg_catalog.pg_stat_activity where datname='edb';
(2 rows)

This cancels the SQL statement of one session:

edb=# select pg_cancel_backend(14346);
(1 row)

The session that was doing the endless loop gets this output:

ERROR:  canceling statement due to user request
CONTEXT:  edb-spl function inline_code_block line 3 at NULL

This disconnects a single session:

edb=# select pg_terminate_backend(14346);
(1 row)

The disconnected session gets this output after trying to do anything:

edb=> \d
FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

This way you cancel all SQL statements of a certain user:

edb=# select pg_cancel_backend(pid) from pg_stat_activity where usename='adam';

This way you disconnect all sessions of a certain user:

edb=# select pg_terminate_backend(pid) from pg_stat_activity where usename='adam';

In the unlikely event that your end users are connected with psql, it will try to reconnect them. That is the background of the output “… Attempting reset: Succeeded.” listed above: the session got reconnected. If you want to prevent that particular user from (re-)connecting, you additionally need to do this:

edb=# revoke connect on database edb from adam;

Remember to also revoke from public if you haven’t done that already, otherwise the above revoke doesn’t show any effect:

edb=# revoke connect on database edb from public;

Now this disconnects all sessions of one user, and the revoke above prevents new connections by that user:

edb=# select pg_terminate_backend(pid) from pg_stat_activity where usename='adam';
(1 row)

The terminated session then gets this output:

edb=> \d
FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Keep in mind that the revoke affects the user as a whole, not just one particular session. In other words, no session with that user can be established until you say:

edb=# grant connect on database edb to adam;

You can terminate all sessions connected to one particular database (except your own session) like this:

edb=# select pg_terminate_backend(pid)
edb-# from pg_stat_activity
edb-# where datname='edb' and pid<>pg_backend_pid();

Once again, this does not prevent new sessions from connecting. So either you REVOKE CONNECT on the user layer as shown above, or you do it on the database layer.
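
If you script this from a client program, the statement can be assembled with placeholders instead of pasting names into the SQL text. The Python helper below is a hypothetical sketch: terminate_sessions_sql is a made-up name, and the %s placeholder style assumes a psycopg2-like driver. It only builds the statement; running it via a cursor with execute(sql, params) is left to the caller.

```python
def terminate_sessions_sql(database=None, user=None):
    """Build a statement that terminates other sessions, optionally filtered.
    Returns (sql, params) with %s placeholders as used by psycopg2-style drivers."""
    conditions = ["pid <> pg_backend_pid()"]  # never terminate our own session
    params = []
    if database is not None:
        conditions.append("datname = %s")
        params.append(database)
    if user is not None:
        conditions.append("usename = %s")
        params.append(user)
    sql = ("select pg_terminate_backend(pid) from pg_stat_activity where "
           + " and ".join(conditions))
    return sql, tuple(params)

sql, params = terminate_sessions_sql(database="edb")
print(sql, params)
```

Swapping pg_terminate_backend for pg_cancel_backend in the same pattern would cancel the statements instead of disconnecting the sessions.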

This is how normal users are prevented from connecting to the database:

edb=# alter database edb with connection limit 0;

Superusers can still connect. The above is the equivalent of ALTER SYSTEM ENABLE RESTRICTED SESSION in Oracle. This is what normal users now get when trying to connect:

-bash-4.2$ psql -U adam
Password for user adam: 
psql.bin: FATAL:  too many connections for database "edb"

The default of an unlimited number of sessions allowed to connect can be restored like this:

edb=# alter database edb with connection limit -1;

This also prevents superusers from connecting to a database, but you must not be connected to that database yourself:

edb=# alter database edb with allow_connections false;
ERROR:  cannot disallow connections for current database

So either you connect to another existing database in the cluster or you create another database temporarily:

edb=# create database dummy;
edb=# \c dummy
You are now connected to database "dummy" as user "enterprisedb".
dummy=# alter database edb with allow_connections false;

Now both normal users and superusers get this output when trying to connect:

psql.bin: FATAL:  database "edb" is not currently accepting connections

The default, that connections to that database are allowed, can be restored with this command:

dummy=# alter database edb with allow_connections true;
dummy=# \c edb
You are now connected to database "edb" as user "enterprisedb".
edb=# drop database dummy;

What I especially like about the shown functionality is the option to remotely cancel a particular (ongoing) statement without having to terminate the session that runs it. I’m not aware of a supported way to do that in Oracle. It can be done if Resource Manager has been configured appropriately beforehand, but that requires quite some effort and doesn’t work out of the box.

Tagged: PostgreSQL
Categories: DBA Blogs