You can implement materialized views in MySQL. How? Read this post and I will show you 3 different methods for creating a MySQL materialized view.
An application is only as fast as it’s slowest component. If your application uses MySQL your objective is clear:
Improve MySQL query performance to get faster data retrieval and your application will instantly become faster.
In this post I will show you 3 techniques for creating MySQL materialized views and get the best performance for data driven applications.
Note for MariaDB Users
Because MariaDB is actually a fork from MySQL, all the techniques presented will work the same for creating MariaDB materialized views and MySQL. I’ve recently read a good article MariaDB vs MySQL if you are interested in a complete comparison between the two databases. When documenting for this post, I’ve been creating materialized views on MySQL 5.7
You will have to agree
You create database views for convenience and for performance. When view performance is not enough the next step is creating materialized views. Before turning to materialized views, you should first explore all the performance tweaks you can do to normal MySQL views. For example, you could improve a MySQL view performance simply by creating a well thought MySQL (or MariaDB) indexed view.
MySQL is not the fastest relational database, but
In this post we will take a look at a few options for setting up materialized views in MySQL. We’ll also see what are the pros and cons for these methods.
Both approaches will also work fine for creating a materialized view in MariaDB and in MySQL.
You don’t want the user to wait minutes to load a screen showing data, right?
So what do we do?
Let’s see an use case and how to create and use materialized views in MySQL.
I was working on a small dashboarding project on Liferay. The dashboard was supposed to display statistics about an application users data.
The application used internally several hundreds MySQL databases (schemas). Each schema had the same tables and structure, but different data corresponding to a different client. If you want to know more about the details of this read my article on handline multi-tenancy in Liferay.
Running a query over hundreds of databases every time a user goes to the dashboard was not an option.
The dashboard needed data combined from the same table in 200 different databases. The query was not very complex but it took 22 seconds to execute. Too long for loading only one chart.
Enter the materialized view.
So, basically I needed something like a MySQL view with a cache, such that when queried it gives me the data without executing queries over 200 databases.
CREATE MATERIALIZED VIEW Doesn’t Exist In MySQL
But, there is no MySQL syntax for creating materialized views. You cannot say
CREATE MATERIALIZED VIEW `user_stats` AS SELECT * FROM `DB-1`.USERS WHERE ... UNION SELECT * FROM `DB-2`.USERS WHERE .... UNION ...
There are no materialized views in MySQL, but there are ways to achieve the same results in MySQL.
Let’s see how.tweet this method for creating materialized views in mysql
When any data is changed the trigger will make sure to refresh the materialized view.
Here’s what we do
We will create a MySQL table with the needed structure. This table will be the fake “materialized view”.
To do this simply create the table using a query like:
CREATE TABLE `user_stats` AS SELECT * FROM `DB-1`.USERS WHERE ... UNION SELECT * FROM `DB-2`.USERS WHERE .... UNION ...
This will both create the table and fill it in with data. The data is a snapshot of the data in all the 200 tables from our use case.
How To Refresh The Materialized View
How do we keep the data up-to-date?
That is where triggers come in. If you are not familiar with MySQL triggers, in short they are a mechanism through which database events like inserts, updates and deletes are handled.
We need to set triggers on all 3 operations (insert, update and delete) on the source tables in all 200 databases.
The insert trigger should insert a copy of the inserted row in the “materialized view” table.
The update trigger should take the updated row, identify it in the “materialized view” table and do the same update.
The delete trigger should identify and delete the row in the “materialized view”.
Here’s the SQL code:
delimiter | CREATE TRIGGER MATERIALIZED_VIEW_TRG AFTER INSERT ON USERS FOR EACH ROW BEGIN INSERT INTO USER_STATS(id,login_date, ...) values(new.id,login_date,...); END; delimiter ; delimiter | CREATE TRIGGER MATERIALIZED_VIEW_TRG AFTER UPDATE ON USERS FOR EACH ROW BEGIN UPDATE USER_STATS set login_date = new.login_date,... where id = new.id; END; delimiter ; delimiter | CREATE TRIGGER MATERIALIZED_VIEW_TRG BEFORE DELETE ON USERS FOR EACH ROW BEGIN DELETE FROM USER_STATS where id = old.id; END; delimiter ;
The main challenge here is to create 200 times 3 triggers. You can use a smart text editor or write a small piece of code that generates the script for creating the 600 triggers.
Another downside of this approach is that if you need to change anything in the logic of the triggers you need to re-create them.
The upside is that the “materialized view” will hold realtime data.
This method offers the best performance without putting unnecessary stress on the database. It only makes modifications to the data when they are inserted, updated or deleted.
There are some articles out there on materialized view using triggers, but as I will show you my approach.tweet this method for creating materialized views in mysql
MySQL events are actions that you can schedule in MySQL. In our case the solution is to schedule the transfer of data between the source and the “materialized view”.
You can schedule events to run every hour, every day, every week etc.
To read more on the syntax for creating a scheduled event please read the MySQL events syntax documentation.
In MySQL scheduled events are not activated by default.
To activate them do this
SET GLOBAL event_scheduler = ON;
or start your MySQL instance with the flag –event-scheduler=ENABLED.
Create the scheduled event
CREATE EVENT `user_stats_daily` ON SCHEDULE EVERY 1 DAY STARTS '2016-06-29 20:00:00' ON COMPLETION NOT PRESERVE ENABLE COMMENT 'Creates a materialized view with user data' DO BEGIN SET GLOBAL group_concat_max_len=1000000; select GROUP_CONCAT(CONCAT("SELECT NOW() as last_update,'",shard.name,"' as shardname, `name` 'Client name', DATE_FORMAT(IFNULL(`lastLoginDate`,`loginDate`),'%b %d %Y %h:%i %p') 'Last login date', lastLoginIP 'Login IP', active_ 'Active' FROM `DB-", shard.name, "_portal`.`organization_`,`DB-", shard.name, "_portal`.`user_` WHERE `organizationId` in (select min(organizationId) from `DB-",shard.name, "_portal`.`organization_`) ") SEPARATOR ' UNION ALL ') INTO @stmt_sql FROM `portal_db`.`shard` shard; SET @drop_stmt = "drop table if exists DB-APP._b_user_stats;"; PREPARE stmt FROM @drop_stmt; EXECUTE stmt; SET @sql = concat("create table DB-APP._b_user_stats as ",@stmt_sql); PREPARE stmt FROM @sql; EXECUTE stmt; DEALLOCATE PREPARE stmt; END
Let’s analyze what we have here.
First we define the scheduler to run every day at 20:00 (after business hours to avoid slowing the server down).
I used GROUP_CONCAT to dynamically create the SQL query. The query goes in all 200 databases so it is a pretty long string.
That is why we have to do SET GLOBAL group_concat_max_len=1000000; to set the maximum length of the text that GROUP_CONCAT can output.
The ‘shard’ table holds the index of each database. They are numbers 1 through 200, so the SQL will go over all databases from DB-1 to DB-200.
We store the text of the big SQL query in the variable @stmt_sql. Then we delete the “materialized view” to make sure we start fresh, and then create it again using the result of the big SQL query.
You now know another method to create MySQL materialized views.
It’s actually the one I used in production. Having the data daily updated was enough for the dashboard display.
This approach has the upside that it’s much easier to maintain than option 1. There is only one definition of the scheduled event to maintain.
You can easily adjust the frequency of the scheduler.
This method can affect the performance of the database if the scheduler runs too often. I runs the queries regularly even if no data was changed.
As you probably know, MySQL has been bought by Oracle.
When that happened, the project has split into 2 separate projects: Oracle MySQL and MariaDB
The idea behind MariaDB was to keep the opensource aspect of MySQL going.
One of the latest developments (to date) is something called flexviews.
MySQL Real Materialized Views – MySQL Flexviews
Flexviews are temporary tables that store results of a view. There is an API which offers refresh functions. In this way, you can implement real materialized views in MySQL, not just emulations.
How can I implement flexviews you ask?
Project swanhart-tools which is free to download from Github.
flexviews folder is the code needed to implement materialized views on MySQL and MariaDB
The requirements for flexviews are:
- MariaDB or MySQL, version 5.1+
- PHP 5.2+ required, 5.3+ is recommended (for FlexCDC)
- pcntl extension
The tool offers two refresh methods:
INCREMENTALwhich analyzes the binary logs to identitfy the rows that changed and makes the changes in the materialized view table; the advantage is speed, however there are limitations regarding the functions that you can use in the backing SQL query;
COMPLETEwhich basically discards the old data in the materialized view table and puts in the new data; the advantage is that the backing SQL query can use any function available but, it will obviously take longer to process
You now have several methods for creating materialized views in MySQL and MariaDB.
You can also combine approaches by simplifying the triggers to only mark the materialized view as out of sync and then the scheduled event to only run in case the view is marked as out of sync.
If you have questions about the implementation of materialized views please drop a comment below.
So, which method are you going to try? Will you try creating materialized views using MySQL triggers?
If you have used other methods for creating materialized views please share. Also, I’m a user of Amazon web services and I know that Amazon deploys it’s own version of MySQL called Aurora. I am pretty sure that the same techniques will work for creating Aurora MySQL materialized view, but I’ve yet to give that a try.