openGauss is an open-source, enterprise-grade relational database management system (RDBMS) known for high security, high availability (HA), AI support, and easy operation and maintenance (OM). It is released under the Mulan Permissive Software License v2 and is based on PostgreSQL.
Features of openGauss
A modern database should offer certain fundamental features, such as high availability, integration with AI, and high security. openGauss incorporates all of these.
High Availability
A database should be available at virtually every hour of the day, throughout the year. openGauss meets
this requirement through support for multi-node deployment.
openGauss supports primary/standby deployment, in which one node performs all database operations
while another stands by to take over, either through failover (promotion of the standby to primary, triggered by a failure)
or through switchover (a planned exchange of roles between the primary and standby nodes).
openGauss also supports deployment with one primary and multiple standbys, which likewise provides both switchover and
failover.
Unlike standalone deployment, these two modes provide a Recovery Point Objective (RPO) of 0
and a Recovery Time Objective (RTO) of less than 10 seconds.
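The difference between failover and switchover can be sketched with a toy model. This is a minimal illustration, not how openGauss implements role changes: the `Node` and `Cluster` classes and the node names `dn1` to `dn3` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str            # "primary" or "standby"
    healthy: bool = True


class Cluster:
    """Toy model of one primary with one or more standbys (illustrative only)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def primary(self):
        return next(n for n in self.nodes if n.role == "primary")

    def _first_healthy_standby(self):
        for n in self.nodes:
            if n.role == "standby" and n.healthy:
                return n
        raise RuntimeError("no healthy standby available")

    def failover(self):
        """Unplanned: the primary has failed, so a healthy standby is promoted."""
        old = self.primary()
        old.healthy = False
        new = self._first_healthy_standby()
        old.role, new.role = "standby", "primary"
        return new

    def switchover(self, target_name):
        """Planned: the healthy primary and a chosen standby exchange roles."""
        old = self.primary()
        target = next(n for n in self.nodes if n.name == target_name)
        if not (old.healthy and target.healthy and target.role == "standby"):
            raise RuntimeError("switchover needs a healthy primary and standby")
        old.role, target.role = "standby", "primary"
        return target


cluster = Cluster([Node("dn1", "primary"), Node("dn2", "standby"), Node("dn3", "standby")])
cluster.switchover("dn2")   # planned role swap: dn2 becomes primary
cluster.failover()          # dn2 fails: dn1, the first healthy standby, is promoted
```

In this sketch a switchover requires both nodes to be healthy, whereas a failover starts from the assumption that the primary is already gone.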
High Security
openGauss ensures security through advanced functionalities such as the following:
Equality query in a fully-encrypted database
This feature keeps data protected in transmission, computation, and storage. The data owner encrypts the data before sending it to the server, so attackers cannot read it unless they break the encryption algorithm.
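The idea of answering an equality query without the server seeing plaintext can be sketched as follows. This is a simplified stand-in, not the openGauss algorithm: it uses a keyed hash (HMAC-SHA256) as a deterministic token, which, unlike real encryption, cannot be decrypted. The `Server` class, the `token` helper and the key are hypothetical.

```python
import hashlib
import hmac


def token(key: bytes, value: str) -> str:
    """Deterministic keyed token: equal plaintexts give equal tokens under one key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


class Server:
    """Stores and compares tokens only; it never holds the key or the plaintext."""

    def __init__(self):
        self.rows = []

    def insert(self, name_token: str) -> None:
        self.rows.append(name_token)

    def count_equal(self, query_token: str) -> int:
        return sum(hmac.compare_digest(t, query_token) for t in self.rows)


# Client side: the key never leaves the data owner (hypothetical demo key).
client_key = b"demo-key-kept-on-the-client"
server = Server()
for name in ["alice", "bob", "alice"]:
    server.insert(token(client_key, name))

matches = server.count_equal(token(client_key, "alice"))   # equality test on tokens
```

Because the token is deterministic, the server can test equality, but it learns nothing else about the values unless the key or the algorithm is compromised.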
Dynamic data masking
With this technique, the database administrator specifies the objects to be anonymized and creates a masking policy. If the database resources queried by a user are associated with a masking policy, the data is anonymized based on the user's identity and that policy. Masking policies can be selected based on site requirements, or custom masking policies can be created for specific users.
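The flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the masking functions, the `POLICIES` structure and the user names are hypothetical and are not openGauss built-ins.

```python
# Hypothetical masking functions; the names are illustrative only.
def mask_all(value: str) -> str:
    return "x" * len(value)


def mask_email(value: str) -> str:
    local, _, domain = value.partition("@")
    return "x" * len(local) + "@" + domain


# A policy binds a masking function to a column and to the users it applies to.
POLICIES = [
    {"column": "card_no", "mask": mask_all, "users": {"analyst"}},
    {"column": "email", "mask": mask_email, "users": {"analyst", "support"}},
]


def query(rows, user):
    """Return copies of the rows, masking columns whose policy applies to `user`."""
    result = []
    for row in rows:
        out = dict(row)
        for policy in POLICIES:
            if user in policy["users"] and policy["column"] in out:
                out[policy["column"]] = policy["mask"](out[policy["column"]])
        result.append(out)
    return result


rows = [{"name": "Ann", "card_no": "1234567890", "email": "ann@example.com"}]
as_admin = query(rows, "admin")       # no policy matches: data returned unchanged
as_analyst = query(rows, "analyst")   # both sensitive columns are masked
```

The stored rows are never modified; masking is applied to the query result only, which is what makes the masking dynamic.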
Access Control
Access control is the management of users' access to the database through permissions. openGauss uses role-based access control, in which permissions are assigned to roles and roles are assigned to users. Granting each user only the minimum permissions required is advised for better security.
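The two-step mapping (permissions to roles, roles to users) can be sketched as below. The role names, user names and table are hypothetical; this models the concept only, not the openGauss catalog.

```python
# Permissions are attached to roles, never directly to users.
ROLE_PERMISSIONS = {
    "reader": {("orders", "SELECT")},
    "writer": {("orders", "SELECT"), ("orders", "INSERT"), ("orders", "UPDATE")},
}

# Users receive permissions only through the roles they hold.
USER_ROLES = {"alice": {"writer"}, "bob": {"reader"}}


def is_allowed(user, table, action):
    """True if any role held by `user` carries the (table, action) permission."""
    return any(
        (table, action) in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


alice_can_insert = is_allowed("alice", "orders", "INSERT")
bob_can_insert = is_allowed("bob", "orders", "INSERT")
```

Giving `bob` only the `reader` role is an instance of the minimum-permission policy: he can query the table but cannot change it.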
Row-level security
This feature enforces access control at the row level, i.e. different users can run the same read query and get different results, because they do not necessarily have the same permissions on the rows.
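A minimal sketch of the effect, assuming a hypothetical policy in which each user sees only the rows they own and an `auditor` user sees everything:

```python
ROWS = [
    {"id": 1, "owner": "alice", "amount": 120},
    {"id": 2, "owner": "bob", "amount": 75},
    {"id": 3, "owner": "alice", "amount": 300},
]


def row_policy(row, user):
    """Hypothetical policy: owners see their own rows, the auditor sees all rows."""
    return user == "auditor" or row["owner"] == user


def select_all(user):
    """The same query for every user; the policy filters rows before they are returned."""
    return [row for row in ROWS if row_policy(row, user)]


alice_ids = [r["id"] for r in select_all("alice")]
bob_ids = [r["id"] for r in select_all("bob")]
auditor_ids = [r["id"] for r in select_all("auditor")]
```

All three users call the same `select_all`, yet each receives a different result set.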
Database audit
Audit logs record user operations such as database startup and stopping, connections, and DDL, DML and DCL statements. The details recorded for each operation include the event time, type, result, username, database, connection information and database instance name. These logs can be queried by start and end time and used to identify unauthorized access, unauthorized operations and the time at which they occurred.
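Querying audit records by a time window and then looking for suspicious entries can be sketched as follows. The log entries, the `query_audit` helper and the `failed_events` helper are hypothetical; only the field names come from the description above.

```python
from datetime import datetime

# Hypothetical audit entries; field names follow the details listed above.
AUDIT_LOG = [
    {"time": datetime(2024, 1, 1, 9, 0), "type": "login", "result": "ok",
     "username": "alice", "database": "sales"},
    {"time": datetime(2024, 1, 1, 23, 30), "type": "login", "result": "failed",
     "username": "mallory", "database": "sales"},
    {"time": datetime(2024, 1, 2, 10, 15), "type": "ddl", "result": "ok",
     "username": "alice", "database": "sales"},
]


def query_audit(start, end, log=AUDIT_LOG):
    """Return entries with start <= time < end, oldest first."""
    return sorted((e for e in log if start <= e["time"] < end), key=lambda e: e["time"])


def failed_events(entries):
    """Pick out operations that did not succeed, with when and who."""
    return [(e["time"], e["username"], e["type"]) for e in entries if e["result"] != "ok"]


day_one = query_audit(datetime(2024, 1, 1), datetime(2024, 1, 2))
suspicious = failed_events(day_one)
```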
Unified audit
Apart from database audit, openGauss supports unified audit. The database administrator defines the audit objects and audit behaviors in a policy. When a task associated with a policy is executed, the corresponding audit behavior is triggered and a log entry is recorded. The purpose of unified audit is to include specific behaviors in an audit and exclude others, thereby simplifying management.
Easy Operation and Maintenance (OM)
A good database system should be easy to operate and maintain. To that end, openGauss integrates the following functionalities:
Workload Diagnosis Report (WDR)
WDR is a tool that generates a performance report between two snapshots. The report is used to diagnose database kernel performance faults. WDR is the main method for diagnosing long-term performance problems.
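The principle of reporting on the interval between two snapshots can be sketched as a difference of cumulative counters. The counter names and helper functions here are hypothetical and much simpler than a real WDR report.

```python
def take_snapshot(counters):
    """Freeze the current cumulative counters (a plain copy in this sketch)."""
    return dict(counters)


def report(before, after):
    """Per-counter change between two snapshots, largest increase first."""
    delta = {name: after[name] - before.get(name, 0) for name in after}
    return sorted(delta.items(), key=lambda item: item[1], reverse=True)


# Hypothetical cumulative counters maintained by the database.
counters = {"sql_count": 1000, "disk_reads": 200, "lock_waits": 5}

snap1 = take_snapshot(counters)
counters["sql_count"] += 500       # activity during the observed interval
counters["disk_reads"] += 4000
counters["lock_waits"] += 1
snap2 = take_snapshot(counters)

wdr = report(snap1, snap2)
```

Sorting by the size of the change puts the most likely culprit, here the jump in disk reads, at the top of the report.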
Slow SQL diagnosis
Slow SQL diagnosis makes it possible to diagnose the performance problems of specific slow SQL statements offline, without reproducing the problem. Table-based and function-based interfaces allow users to collect statistics on slow SQL statements and connect them to external platforms. Information is recorded about every job whose execution time exceeds the threshold set by ‘log_min_duration_statement’.
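The threshold rule can be sketched as below. The threshold value, the `record_if_slow` helper and the sample statements are hypothetical; the point is only that statements are recorded when their duration exceeds the configured limit, so they can be analysed later without being re-run.

```python
LOG_MIN_DURATION_MS = 1000   # hypothetical threshold value, in milliseconds
slow_log = []


def record_if_slow(statement, duration_ms):
    """Keep details of a statement only if it ran longer than the threshold."""
    if duration_ms > LOG_MIN_DURATION_MS:
        slow_log.append({"statement": statement, "duration_ms": duration_ms})
        return True
    return False


record_if_slow("SELECT 1", 2)
record_if_slow("SELECT * FROM orders JOIN items ON orders.id = items.order_id", 1800)

# Offline analysis: the slowest recorded statement first.
slowest_first = sorted(slow_log, key=lambda e: e["duration_ms"], reverse=True)
```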
Session performance
Session performance diagnosis covers all active sessions in the system. Session information helps users determine which sessions consume the most CPU and memory resources, which database objects are hot objects, and which SQL statements consume the most key event resources.
AI mutual benefits
openGauss has a two-way relationship with artificial intelligence: it leverages AI capabilities in its own operations and, at the same time, supports AI development and operations. On this basis, openGauss supports:
1. AI4DB
Read as ‘AI for DB’, meaning the use of AI for the benefit of the database, i.e. openGauss. AI is applied to improve database operations in several ways:
a) X-Tuner
X-Tuner is a parameter tuning tool integrated into the database. The tool is kernel-independent and does not
necessarily have to be deployed with the database. It aims to reduce the workload of the database administrator (DBA) by
recommending parameter configurations for the current load in any scenario. It has three operation modes.
b) SQLdiag
SQLdiag is a tool that predicts how long an SQL statement will take to execute. It is mainly used in OLAP scenarios and bases its predictions on the execution times of historical SQL statements.
c) Index-advisor
Index-advisor is an AI4DB tool that recommends indexes, where an index is a pointer to the data values stored in a specified column of a table. Its index recommendation functions include single-query index, virtual index and workload-level index recommendations.
d) Anomaly Detection
Anomaly detection collects and predicts database indicators, monitors and diagnoses exceptions, and writes notifications to log files. It consists of an agent and a detector that communicate over HTTP or HTTPS.
2. DB4AI
Read as ‘DB for AI’, this covers the openGauss features that enable the development and operation of artificial intelligence models.
a) Predictor
Predictor is a query prediction tool that leverages machine learning and online learning. It predicts how long a plan will take to execute by learning from historical executions.
b) DeepSQL
DeepSQL is built into the database for faster analysis and processing of big data, and it lets users run machine learning through SQL statements. DeepSQL supports more than 60 general-purpose machine learning algorithms.
Minimum Hardware Requirements for openGauss
| Resource | Minimum |
|---|---|
| Memory | 32 GB and 128 GB; 4 GB for developers’ individual use |
| CPU | At least 2 cores; 8 cores at 2.0 GHz recommended |
| Disk | 1 GB for the application, 300 MB for data storage |
| Network | 300 Mbit/s; 2 NICs for redundancy (production) |
Installation process of openGauss (Single-node)
- Prepare for installation
- Obtain installation package
- Configure XML file
- Upload installation package + XML file
- Decompress the installation package
- Initialize installation environment
- Perform installation
Architecture of openGauss
openGauss uses a single-process, multi-threaded model and offers a choice of client drivers:
JDBC, ODBC and libpq.
A database operation follows the sequence below:
receive the SQL request, create a thread, lexical analysis, syntax analysis, semantic analysis,
query rewriting, query optimization.
The memory structure of the database includes the shared buffer, cstore, memory-optimized tables (MOT),
the write-ahead logging (WAL) buffer, work_mem and the temp buffer.
The main threads of the database kernel include GaussMaster, pagewriter, bgwriter, walwriter and checkpoint.
While openGauss is functionally close to PostgreSQL, these are some major differences:
Deployment Solutions of openGauss
openGauss addresses varied organizational needs by supporting three deployment modes.
a) Standalone deployment
In standalone deployment, the database runs on a single node and the applications run on a different machine or server. This mode is discouraged because it does not satisfy reliability and availability requirements, and both RTO and RPO are uncontrollable.
b) Primary/Standby deployment
This mode involves one primary node and one standby node. It is characterized by fault tolerance, with an RPO of 0 and an RTO of less than 10 seconds. The primary and standby nodes are configured for maximum availability.
c) One primary and multiple standbys
This mode has one primary node and several standby nodes, so instance faults can be withstood, with an
RTO of less than 10 seconds and an RPO of 0. It is used in scenarios where synchronization is needed.
One primary with two standbys is the most common configuration because it keeps three copies of the data before any fault,
thereby providing 99.99% reliability.
Operation And Maintenance
openGauss can be installed by following the installation process described earlier. It provides tools for interfacing with the database in two categories: client tools and server tools. Connections to a database are made with the gsql command, which can also connect to a remote database once the gsql client is installed. The database also provides system catalogs, which contain the metadata openGauss uses to control system operations, and system views, which provide a way to query the catalogs and the internal database status. Finally, openGauss is uninstalled with the gs_uninstall and gs_postuninstall commands.